* [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 2:13 [PATCH net 0/1] NULL ptr dereference in tcp_rearm_rto Josh Hunt
@ 2024-08-23 2:13 ` Josh Hunt
2024-08-23 3:27 ` Jason Xing
2024-08-23 6:42 ` Eric Dumazet
0 siblings, 2 replies; 12+ messages in thread
From: Josh Hunt @ 2024-08-23 2:13 UTC (permalink / raw)
To: edumazet, davem, kuba, pabeni, netdev; +Cc: linux-kernel, johunt
There have been multiple occasions where we have crashed in this path
because packets_out suggested there were packets on the write or retransmit
queues, but in fact there weren't, leading to a NULL skb being dereferenced.
While we should fix that root cause, we should also just make sure the skb
is not NULL before dereferencing it. Also add a warn once here to capture
some information if/when the problem case is hit again.
Signed-off-by: Josh Hunt <johunt@akamai.com>
---
include/net/tcp.h | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 2aac11e7e1cc..19ea6ed87880 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2433,10 +2433,19 @@ void tcp_plb_update_state_upon_rto(struct sock *sk, struct tcp_plb_state *plb);
static inline s64 tcp_rto_delta_us(const struct sock *sk)
{
const struct sk_buff *skb = tcp_rtx_queue_head(sk);
- u32 rto = inet_csk(sk)->icsk_rto;
- u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + jiffies_to_usecs(rto);
+ u32 rto = jiffies_to_usecs(inet_csk(sk)->icsk_rto);
+
+ if (likely(skb)) {
+ u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + rto;
+
+ return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
+ } else {
+ WARN_ONCE(1,
+ "rtx queue empty: inflight %u tlp_high_seq %u state %u\n",
+ tcp_sk(sk)->packets_out, tcp_sk(sk)->tlp_high_seq, sk->sk_state);
+ return rto;
+ }
- return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
}
/*
--
2.34.1
* Re: [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 2:13 ` [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us() Josh Hunt
@ 2024-08-23 3:27 ` Jason Xing
2024-08-23 3:33 ` Josh Hunt
2024-08-23 6:42 ` Eric Dumazet
1 sibling, 1 reply; 12+ messages in thread
From: Jason Xing @ 2024-08-23 3:27 UTC (permalink / raw)
To: Josh Hunt; +Cc: edumazet, davem, kuba, pabeni, netdev, linux-kernel
Hello Josh,
On Fri, Aug 23, 2024 at 11:02 AM Josh Hunt <johunt@akamai.com> wrote:
>
> There have been multiple occasions where we have crashed in this path
> because packets_out suggested there were packets on the write or retransmit
> queues, but in fact there weren't, leading to a NULL skb being dereferenced.
Could you show us the detailed splats and more information about it so
that we can know what exactly happened?
Thanks,
Jason
* Re: [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 3:27 ` Jason Xing
@ 2024-08-23 3:33 ` Josh Hunt
2024-08-23 4:21 ` Josh Hunt
0 siblings, 1 reply; 12+ messages in thread
From: Josh Hunt @ 2024-08-23 3:33 UTC (permalink / raw)
To: Jason Xing; +Cc: edumazet, davem, kuba, pabeni, netdev, linux-kernel
On 8/22/24 8:27 PM, Jason Xing wrote:
>
> Hello Josh,
>
> On Fri, Aug 23, 2024 at 11:02 AM Josh Hunt <johunt@akamai.com> wrote:
>>
>> There have been multiple occasions where we have crashed in this path
>> because packets_out suggested there were packets on the write or retransmit
>> queues, but in fact there weren't, leading to a NULL skb being dereferenced.
>
> Could you show us the detailed splats and more information about it so
> that we can know what exactly happened?
Hey Jason
Yeah for some reason my cover letter did not come through which has the
oops info that we hit. I'll resend it now. Fingers crossed it goes
through this time :)
Josh
* [PATCH net 0/1] NULL ptr dereference in tcp_rearm_rto
@ 2024-08-23 3:34 Josh Hunt
2024-08-23 3:34 ` [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us() Josh Hunt
0 siblings, 1 reply; 12+ messages in thread
From: Josh Hunt @ 2024-08-23 3:34 UTC (permalink / raw)
To: edumazet, davem, kuba, pabeni, netdev; +Cc: linux-kernel, johunt
Resending as the first attempt did not go through.
We have some machines running stock Ubuntu 20.04.6 with its 5.4.0-174-generic
kernel. They run Ceph and recently hit a NULL pointer dereference in
tcp_rearm_rto(). We initially hit it from the TLP path, but later also saw it
from the RACK case. Here are examples of the oops messages we saw in each of
those cases:
Jul 26 15:05:02 rx [11061395.780353] BUG: kernel NULL pointer dereference, address: 0000000000000020
Jul 26 15:05:02 rx [11061395.787572] #PF: supervisor read access in kernel mode
Jul 26 15:05:02 rx [11061395.792971] #PF: error_code(0x0000) - not-present page
Jul 26 15:05:02 rx [11061395.798362] PGD 0 P4D 0
Jul 26 15:05:02 rx [11061395.801164] Oops: 0000 [#1] SMP NOPTI
Jul 26 15:05:02 rx [11061395.805091] CPU: 0 PID: 9180 Comm: msgr-worker-1 Tainted: G W 5.4.0-174-generic #193-Ubuntu
Jul 26 15:05:02 rx [11061395.814996] Hardware name: Supermicro SMC 2x26 os-gen8 64C NVME-Y 256G/H12SSW-NTR, BIOS 2.5.V1.2U.NVMe.UEFI 05/09/2023
Jul 26 15:05:02 rx [11061395.825952] RIP: 0010:tcp_rearm_rto+0xe4/0x160
Jul 26 15:05:02 rx [11061395.830656] Code: 87 ca 04 00 00 00 5b 41 5c 41 5d 5d c3 c3 49 8b bc 24 40 06 00 00 eb 8d 48 bb cf f7 53 e3 a5 9b c4 20 4c 89 ef e8 0c fe 0e 00 <48> 8b 78 20 48 c1 ef 03 48 89 f8 41 8b bc 24 80 04 00 00 48 f7 e3
Jul 26 15:05:02 rx [11061395.849665] RSP: 0018:ffffb75d40003e08 EFLAGS: 00010246
Jul 26 15:05:02 rx [11061395.855149] RAX: 0000000000000000 RBX: 20c49ba5e353f7cf RCX: 0000000000000000
Jul 26 15:05:02 rx [11061395.862542] RDX: 0000000062177c30 RSI: 000000000000231c RDI: ffff9874ad283a60
Jul 26 15:05:02 rx [11061395.869933] RBP: ffffb75d40003e20 R08: 0000000000000000 R09: ffff987605e20aa8
Jul 26 15:05:02 rx [11061395.877318] R10: ffffb75d40003f00 R11: ffffb75d4460f740 R12: ffff9874ad283900
Jul 26 15:05:02 rx [11061395.884710] R13: ffff9874ad283a60 R14: ffff9874ad283980 R15: ffff9874ad283d30
Jul 26 15:05:02 rx [11061395.892095] FS: 00007f1ef4a2e700(0000) GS:ffff987605e00000(0000) knlGS:0000000000000000
Jul 26 15:05:02 rx [11061395.900438] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 26 15:05:02 rx [11061395.906435] CR2: 0000000000000020 CR3: 0000003e450ba003 CR4: 0000000000760ef0
Jul 26 15:05:02 rx [11061395.913822] PKRU: 55555554
Jul 26 15:05:02 rx [11061395.916786] Call Trace:
Jul 26 15:05:02 rx [11061395.919488] <IRQ>
Jul 26 15:05:02 rx [11061395.921765] ? show_regs.cold+0x1a/0x1f
Jul 26 15:05:02 rx [11061395.925859] ? __die+0x90/0xd9
Jul 26 15:05:02 rx [11061395.929169] ? no_context+0x196/0x380
Jul 26 15:05:02 rx [11061395.933088] ? ip6_protocol_deliver_rcu+0x4e0/0x4e0
Jul 26 15:05:02 rx [11061395.938216] ? ip6_sublist_rcv_finish+0x3d/0x50
Jul 26 15:05:02 rx [11061395.943000] ? __bad_area_nosemaphore+0x50/0x1a0
Jul 26 15:05:02 rx [11061395.947873] ? bad_area_nosemaphore+0x16/0x20
Jul 26 15:05:02 rx [11061395.952486] ? do_user_addr_fault+0x267/0x450
Jul 26 15:05:02 rx [11061395.957104] ? ipv6_list_rcv+0x112/0x140
Jul 26 15:05:02 rx [11061395.961279] ? __do_page_fault+0x58/0x90
Jul 26 15:05:02 rx [11061395.965458] ? do_page_fault+0x2c/0xe0
Jul 26 15:05:02 rx [11061395.969465] ? page_fault+0x34/0x40
Jul 26 15:05:02 rx [11061395.973217] ? tcp_rearm_rto+0xe4/0x160
Jul 26 15:05:02 rx [11061395.977313] ? tcp_rearm_rto+0xe4/0x160
Jul 26 15:05:02 rx [11061395.981408] tcp_send_loss_probe+0x10b/0x220
Jul 26 15:05:02 rx [11061395.985937] tcp_write_timer_handler+0x1b4/0x240
Jul 26 15:05:02 rx [11061395.990809] tcp_write_timer+0x9e/0xe0
Jul 26 15:05:02 rx [11061395.994814] ? tcp_write_timer_handler+0x240/0x240
Jul 26 15:05:02 rx [11061395.999866] call_timer_fn+0x32/0x130
Jul 26 15:05:02 rx [11061396.003782] __run_timers.part.0+0x180/0x280
Jul 26 15:05:02 rx [11061396.008309] ? recalibrate_cpu_khz+0x10/0x10
Jul 26 15:05:02 rx [11061396.012841] ? native_x2apic_icr_write+0x30/0x30
Jul 26 15:05:02 rx [11061396.017718] ? lapic_next_event+0x21/0x30
Jul 26 15:05:02 rx [11061396.021984] ? clockevents_program_event+0x8f/0xe0
Jul 26 15:05:02 rx [11061396.027035] run_timer_softirq+0x2a/0x50
Jul 26 15:05:02 rx [11061396.031212] __do_softirq+0xd1/0x2c1
Jul 26 15:05:02 rx [11061396.035044] do_softirq_own_stack+0x2a/0x40
Jul 26 15:05:02 rx [11061396.039480] </IRQ>
Jul 26 15:05:02 rx [11061396.041840] do_softirq.part.0+0x46/0x50
Jul 26 15:05:02 rx [11061396.046022] __local_bh_enable_ip+0x50/0x60
Jul 26 15:05:02 rx [11061396.050460] _raw_spin_unlock_bh+0x1e/0x20
Jul 26 15:05:02 rx [11061396.054817] nf_conntrack_tcp_packet+0x29e/0xbe0 [nf_conntrack]
Jul 26 15:05:02 rx [11061396.060994] ? get_l4proto+0xe7/0x190 [nf_conntrack]
Jul 26 15:05:02 rx [11061396.066220] nf_conntrack_in+0xe9/0x670 [nf_conntrack]
Jul 26 15:05:02 rx [11061396.071618] ipv6_conntrack_local+0x14/0x20 [nf_conntrack]
Jul 26 15:05:02 rx [11061396.077356] nf_hook_slow+0x45/0xb0
Jul 26 15:05:02 rx [11061396.081098] ip6_xmit+0x3f0/0x5d0
Jul 26 15:05:02 rx [11061396.084670] ? ipv6_anycast_cleanup+0x50/0x50
Jul 26 15:05:02 rx [11061396.089282] ? __sk_dst_check+0x38/0x70
Jul 26 15:05:02 rx [11061396.093381] ? inet6_csk_route_socket+0x13b/0x200
Jul 26 15:05:02 rx [11061396.098346] inet6_csk_xmit+0xa7/0xf0
Jul 26 15:05:02 rx [11061396.102263] __tcp_transmit_skb+0x550/0xb30
Jul 26 15:05:02 rx [11061396.106701] tcp_write_xmit+0x3c6/0xc20
Jul 26 15:05:02 rx [11061396.110792] ? __alloc_skb+0x98/0x1d0
Jul 26 15:05:02 rx [11061396.114708] __tcp_push_pending_frames+0x37/0x100
Jul 26 15:05:02 rx [11061396.119667] tcp_push+0xfd/0x100
Jul 26 15:05:02 rx [11061396.123150] tcp_sendmsg_locked+0xc70/0xdd0
Jul 26 15:05:02 rx [11061396.127588] tcp_sendmsg+0x2d/0x50
Jul 26 15:05:02 rx [11061396.131245] inet6_sendmsg+0x43/0x70
Jul 26 15:05:02 rx [11061396.135075] __sock_sendmsg+0x48/0x70
Jul 26 15:05:02 rx [11061396.138994] ____sys_sendmsg+0x212/0x280
Jul 26 15:05:02 rx [11061396.143172] ___sys_sendmsg+0x88/0xd0
Jul 26 15:05:02 rx [11061396.147098] ? __seccomp_filter+0x7e/0x6b0
Jul 26 15:05:02 rx [11061396.151446] ? __switch_to+0x39c/0x460
Jul 26 15:05:02 rx [11061396.155453] ? __switch_to_asm+0x42/0x80
Jul 26 15:05:02 rx [11061396.159636] ? __switch_to_asm+0x5a/0x80
Jul 26 15:05:02 rx [11061396.163816] __sys_sendmsg+0x5c/0xa0
Jul 26 15:05:02 rx [11061396.167647] __x64_sys_sendmsg+0x1f/0x30
Jul 26 15:05:02 rx [11061396.171832] do_syscall_64+0x57/0x190
Jul 26 15:05:02 rx [11061396.175748] entry_SYSCALL_64_after_hwframe+0x5c/0xc1
Jul 26 15:05:02 rx [11061396.181055] RIP: 0033:0x7f1ef692618d
Jul 26 15:05:02 rx [11061396.184893] Code: 28 89 54 24 1c 48 89 74 24 10 89 7c 24 08 e8 ca ee ff ff 8b 54 24 1c 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2f 44 89 c7 48 89 44 24 08 e8 fe ee ff ff 48
Jul 26 15:05:02 rx [11061396.203889] RSP: 002b:00007f1ef4a26aa0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
Jul 26 15:05:02 rx [11061396.211708] RAX: ffffffffffffffda RBX: 000000000000084b RCX: 00007f1ef692618d
Jul 26 15:05:02 rx [11061396.219091] RDX: 0000000000004000 RSI: 00007f1ef4a26b10 RDI: 0000000000000275
Jul 26 15:05:02 rx [11061396.226475] RBP: 0000000000004000 R08: 0000000000000000 R09: 0000000000000020
Jul 26 15:05:02 rx [11061396.233859] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000084b
Jul 26 15:05:02 rx [11061396.241243] R13: 00007f1ef4a26b10 R14: 0000000000000275 R15: 000055592030f1e8
Jul 26 15:05:02 rx [11061396.248628] Modules linked in: vrf bridge stp llc vxlan ip6_udp_tunnel udp_tunnel nls_iso8859_1 amd64_edac_mod edac_mce_amd kvm_amd kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper wmi_bmof ipmi_ssif input_leds joydev rndis_host cdc_ether usbnet mii ast drm_vram_helper ttm drm_kms_helper i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt ccp mac_hid ipmi_si ipmi_devintf ipmi_msghandler nft_ct sch_fq_codel nf_tables_set nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ramoops reed_solomon efi_pstore drm ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear mlx5_ib ib_uverbs ib_core raid1 mlx5_core hid_generic pci_hyperv_intf crc32_pclmul tls usbhid ahci mlxfw bnxt_en libahci hid nvme i2c_piix4 nvme_core wmi
Jul 26 15:05:02 rx [11061396.324334] CR2: 0000000000000020
Jul 26 15:05:02 rx [11061396.327944] ---[ end trace 68a2b679d1cfb4f1 ]---
Jul 26 15:05:02 rx [11061396.433435] RIP: 0010:tcp_rearm_rto+0xe4/0x160
Jul 26 15:05:02 rx [11061396.438137] Code: 87 ca 04 00 00 00 5b 41 5c 41 5d 5d c3 c3 49 8b bc 24 40 06 00 00 eb 8d 48 bb cf f7 53 e3 a5 9b c4 20 4c 89 ef e8 0c fe 0e 00 <48> 8b 78 20 48 c1 ef 03 48 89 f8 41 8b bc 24 80 04 00 00 48 f7 e3
Jul 26 15:05:02 rx [11061396.457144] RSP: 0018:ffffb75d40003e08 EFLAGS: 00010246
Jul 26 15:05:02 rx [11061396.462629] RAX: 0000000000000000 RBX: 20c49ba5e353f7cf RCX: 0000000000000000
Jul 26 15:05:02 rx [11061396.470012] RDX: 0000000062177c30 RSI: 000000000000231c RDI: ffff9874ad283a60
Jul 26 15:05:02 rx [11061396.477396] RBP: ffffb75d40003e20 R08: 0000000000000000 R09: ffff987605e20aa8
Jul 26 15:05:02 rx [11061396.484779] R10: ffffb75d40003f00 R11: ffffb75d4460f740 R12: ffff9874ad283900
Jul 26 15:05:02 rx [11061396.492164] R13: ffff9874ad283a60 R14: ffff9874ad283980 R15: ffff9874ad283d30
Jul 26 15:05:02 rx [11061396.499547] FS: 00007f1ef4a2e700(0000) GS:ffff987605e00000(0000) knlGS:0000000000000000
Jul 26 15:05:02 rx [11061396.507886] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 26 15:05:02 rx [11061396.513884] CR2: 0000000000000020 CR3: 0000003e450ba003 CR4: 0000000000760ef0
Jul 26 15:05:02 rx [11061396.521267] PKRU: 55555554
Jul 26 15:05:02 rx [11061396.524230] Kernel panic - not syncing: Fatal exception in interrupt
Jul 26 15:05:02 rx [11061396.530885] Kernel Offset: 0x1b200000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
Jul 26 15:05:03 rx [11061396.660181] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
After we saw this we disabled TLP by setting tcp_early_retrans to 0 and then hit the crash in the RACK case:
Aug 7 07:26:16 rx [1006006.265582] BUG: kernel NULL pointer dereference, address: 0000000000000020
Aug 7 07:26:16 rx [1006006.272719] #PF: supervisor read access in kernel mode
Aug 7 07:26:16 rx [1006006.278030] #PF: error_code(0x0000) - not-present page
Aug 7 07:26:16 rx [1006006.283343] PGD 0 P4D 0
Aug 7 07:26:16 rx [1006006.286057] Oops: 0000 [#1] SMP NOPTI
Aug 7 07:26:16 rx [1006006.289896] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 5.4.0-174-generic #193-Ubuntu
Aug 7 07:26:16 rx [1006006.299107] Hardware name: Supermicro SMC 2x26 os-gen8 64C NVME-Y 256G/H12SSW-NTR, BIOS 2.5.V1.2U.NVMe.UEFI 05/09/2023
Aug 7 07:26:16 rx [1006006.309970] RIP: 0010:tcp_rearm_rto+0xe4/0x160
Aug 7 07:26:16 rx [1006006.314584] Code: 87 ca 04 00 00 00 5b 41 5c 41 5d 5d c3 c3 49 8b bc 24 40 06 00 00 eb 8d 48 bb cf f7 53 e3 a5 9b c4 20 4c 89 ef e8 0c fe 0e 00 <48> 8b 78 20 48 c1 ef 03 48 89 f8 41 8b bc 24 80 04 00 00 48 f7 e3
Aug 7 07:26:16 rx [1006006.333499] RSP: 0018:ffffb42600a50960 EFLAGS: 00010246
Aug 7 07:26:16 rx [1006006.338895] RAX: 0000000000000000 RBX: 20c49ba5e353f7cf RCX: 0000000000000000
Aug 7 07:26:16 rx [1006006.346193] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff92d687ed8160
Aug 7 07:26:16 rx [1006006.353489] RBP: ffffb42600a50978 R08: 0000000000000000 R09: 00000000cd896dcc
Aug 7 07:26:16 rx [1006006.360786] R10: ffff92dc3404f400 R11: 0000000000000001 R12: ffff92d687ed8000
Aug 7 07:26:16 rx [1006006.368084] R13: ffff92d687ed8160 R14: 00000000cd896dcc R15: 00000000cd8fca81
Aug 7 07:26:16 rx [1006006.375381] FS: 0000000000000000(0000) GS:ffff93158ad40000(0000) knlGS:0000000000000000
Aug 7 07:26:16 rx [1006006.383632] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 7 07:26:16 rx [1006006.389544] CR2: 0000000000000020 CR3: 0000003e775ce006 CR4: 0000000000760ee0
Aug 7 07:26:16 rx [1006006.396839] PKRU: 55555554
Aug 7 07:26:16 rx [1006006.399717] Call Trace:
Aug 7 07:26:16 rx [1006006.402335] <IRQ>
Aug 7 07:26:16 rx [1006006.404525] ? show_regs.cold+0x1a/0x1f
Aug 7 07:26:16 rx [1006006.408532] ? __die+0x90/0xd9
Aug 7 07:26:16 rx [1006006.411760] ? no_context+0x196/0x380
Aug 7 07:26:16 rx [1006006.415599] ? __bad_area_nosemaphore+0x50/0x1a0
Aug 7 07:26:16 rx [1006006.420392] ? _raw_spin_lock+0x1e/0x30
Aug 7 07:26:16 rx [1006006.424401] ? bad_area_nosemaphore+0x16/0x20
Aug 7 07:26:16 rx [1006006.428927] ? do_user_addr_fault+0x267/0x450
Aug 7 07:26:16 rx [1006006.433450] ? __do_page_fault+0x58/0x90
Aug 7 07:26:16 rx [1006006.437542] ? do_page_fault+0x2c/0xe0
Aug 7 07:26:16 rx [1006006.441470] ? page_fault+0x34/0x40
Aug 7 07:26:16 rx [1006006.445134] ? tcp_rearm_rto+0xe4/0x160
Aug 7 07:26:16 rx [1006006.449145] tcp_ack+0xa32/0xb30
Aug 7 07:26:16 rx [1006006.452542] tcp_rcv_established+0x13c/0x670
Aug 7 07:26:16 rx [1006006.456981] ? sk_filter_trim_cap+0x48/0x220
Aug 7 07:26:16 rx [1006006.461419] tcp_v6_do_rcv+0xdb/0x450
Aug 7 07:26:16 rx [1006006.465257] tcp_v6_rcv+0xc2b/0xd10
Aug 7 07:26:16 rx [1006006.468918] ip6_protocol_deliver_rcu+0xd3/0x4e0
Aug 7 07:26:16 rx [1006006.473706] ip6_input_finish+0x15/0x20
Aug 7 07:26:16 rx [1006006.477710] ip6_input+0xa2/0xb0
Aug 7 07:26:16 rx [1006006.481109] ? ip6_protocol_deliver_rcu+0x4e0/0x4e0
Aug 7 07:26:16 rx [1006006.486151] ip6_sublist_rcv_finish+0x3d/0x50
Aug 7 07:26:16 rx [1006006.490679] ip6_sublist_rcv+0x1aa/0x250
Aug 7 07:26:16 rx [1006006.494779] ? ip6_rcv_finish_core.isra.0+0xa0/0xa0
Aug 7 07:26:16 rx [1006006.499828] ipv6_list_rcv+0x112/0x140
Aug 7 07:26:16 rx [1006006.503748] __netif_receive_skb_list_core+0x1a4/0x250
Aug 7 07:26:16 rx [1006006.509057] netif_receive_skb_list_internal+0x1a1/0x2b0
Aug 7 07:26:16 rx [1006006.514538] gro_normal_list.part.0+0x1e/0x40
Aug 7 07:26:16 rx [1006006.519068] napi_complete_done+0x91/0x130
Aug 7 07:26:16 rx [1006006.523352] mlx5e_napi_poll+0x18e/0x610 [mlx5_core]
Aug 7 07:26:16 rx [1006006.528481] net_rx_action+0x142/0x390
Aug 7 07:26:16 rx [1006006.532398] __do_softirq+0xd1/0x2c1
Aug 7 07:26:16 rx [1006006.536142] irq_exit+0xae/0xb0
Aug 7 07:26:16 rx [1006006.539452] do_IRQ+0x5a/0xf0
Aug 7 07:26:16 rx [1006006.542590] common_interrupt+0xf/0xf
Aug 7 07:26:16 rx [1006006.546421] </IRQ>
Aug 7 07:26:16 rx [1006006.548695] RIP: 0010:native_safe_halt+0xe/0x10
Aug 7 07:26:16 rx [1006006.553399] Code: 7b ff ff ff eb bd 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 36 2c 50 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 26 2c 50 00 fb f4 <c3> 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 e8 dd 5e 61 ff 65
Aug 7 07:26:16 rx [1006006.572309] RSP: 0018:ffffb42600177e70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffc2
Aug 7 07:26:16 rx [1006006.580040] RAX: ffffffff8ed08b20 RBX: 0000000000000005 RCX: 0000000000000001
Aug 7 07:26:16 rx [1006006.587337] RDX: 00000000f48eeca2 RSI: 0000000000000082 RDI: 0000000000000082
Aug 7 07:26:16 rx [1006006.594635] RBP: ffffb42600177e90 R08: 0000000000000000 R09: 000000000000020f
Aug 7 07:26:16 rx [1006006.601931] R10: 0000000000100000 R11: 0000000000000000 R12: 0000000000000005
Aug 7 07:26:16 rx [1006006.609229] R13: ffff93157deb5f00 R14: 0000000000000000 R15: 0000000000000000
Aug 7 07:26:16 rx [1006006.616530] ? __cpuidle_text_start+0x8/0x8
Aug 7 07:26:16 rx [1006006.620886] ? default_idle+0x20/0x140
Aug 7 07:26:16 rx [1006006.624804] arch_cpu_idle+0x15/0x20
Aug 7 07:26:16 rx [1006006.628545] default_idle_call+0x23/0x30
Aug 7 07:26:16 rx [1006006.632640] do_idle+0x1fb/0x270
Aug 7 07:26:16 rx [1006006.636035] cpu_startup_entry+0x20/0x30
Aug 7 07:26:16 rx [1006006.640126] start_secondary+0x178/0x1d0
Aug 7 07:26:16 rx [1006006.644218] secondary_startup_64+0xa4/0xb0
Aug 7 07:26:17 rx [1006006.648568] Modules linked in: vrf bridge stp llc vxlan ip6_udp_tunnel udp_tunnel nls_iso8859_1 nft_ct amd64_edac_mod edac_mce_amd kvm_amd kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper wmi_bmof ipmi_ssif input_leds joydev rndis_host cdc_ether usbnet ast mii drm_vram_helper ttm drm_kms_helper i2c_algo_bit fb_sys_fops syscopyarea sysfillrect sysimgblt ccp mac_hid ipmi_si ipmi_devintf ipmi_msghandler sch_fq_codel nf_tables_set nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink ramoops reed_solomon efi_pstore drm ip_tables x_tables autofs4 raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid0 multipath linear mlx5_ib ib_uverbs ib_core raid1 hid_generic mlx5_core pci_hyperv_intf crc32_pclmul usbhid ahci tls mlxfw bnxt_en hid libahci nvme i2c_piix4 nvme_core wmi [last unloaded: cpuid]
Aug 7 07:26:17 rx [1006006.726180] CR2: 0000000000000020
Aug 7 07:26:17 rx [1006006.729718] ---[ end trace e0e2e37e4e612984 ]---
Prior to the first crash, and on other machines, we also saw the warning in
tcp_send_loss_probe() where packets_out is non-zero but both the transmit and
retransmit queues are empty, so we know the box is seeing some accounting issue
in this area:
Jul 26 09:15:27 kernel: ------------[ cut here ]------------
Jul 26 09:15:27 kernel: invalid inflight: 2 state 1 cwnd 68 mss 8988
Jul 26 09:15:27 kernel: WARNING: CPU: 16 PID: 0 at net/ipv4/tcp_output.c:2605 tcp_send_loss_probe+0x214/0x220
Jul 26 09:15:27 kernel: Modules linked in: vrf bridge stp llc vxlan ip6_udp_tunnel udp_tunnel nls_iso8859_1 nft_ct amd64_edac_mod edac_mce_amd kvm_amd kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper wmi_bmof ipmi_ssif joydev input_leds rndis_host cdc_ether usbnet mii ast drm_vram_helper ttm drm_kms_he>
Jul 26 09:15:27 kernel: CPU: 16 PID: 0 Comm: swapper/16 Not tainted 5.4.0-174-generic #193-Ubuntu
Jul 26 09:15:27 kernel: Hardware name: Supermicro SMC 2x26 os-gen8 64C NVME-Y 256G/H12SSW-NTR, BIOS 2.5.V1.2U.NVMe.UEFI 05/09/2023
Jul 26 09:15:27 kernel: RIP: 0010:tcp_send_loss_probe+0x214/0x220
Jul 26 09:15:27 kernel: Code: 08 26 01 00 75 e2 41 0f b6 54 24 12 41 8b 8c 24 c0 06 00 00 45 89 f0 48 c7 c7 e0 b4 20 a7 c6 05 8d 08 26 01 01 e8 4a c0 0f 00 <0f> 0b eb ba 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
Jul 26 09:15:27 kernel: RSP: 0018:ffffb7838088ce00 EFLAGS: 00010286
Jul 26 09:15:27 kernel: RAX: 0000000000000000 RBX: ffff9b84b5630430 RCX: 0000000000000006
Jul 26 09:15:27 kernel: RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff9b8e4621c8c0
Jul 26 09:15:27 kernel: RBP: ffffb7838088ce18 R08: 0000000000000927 R09: 0000000000000004
Jul 26 09:15:27 kernel: R10: 0000000000000000 R11: 0000000000000001 R12: ffff9b84b5630000
Jul 26 09:15:27 kernel: R13: 0000000000000000 R14: 000000000000231c R15: ffff9b84b5630430
Jul 26 09:15:27 kernel: FS: 0000000000000000(0000) GS:ffff9b8e46200000(0000) knlGS:0000000000000000
Jul 26 09:15:27 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 26 09:15:27 kernel: CR2: 000056238cec2380 CR3: 0000003e49ede005 CR4: 0000000000760ee0
Jul 26 09:15:27 kernel: PKRU: 55555554
Jul 26 09:15:27 kernel: Call Trace:
Jul 26 09:15:27 kernel: <IRQ>
Jul 26 09:15:27 kernel: ? show_regs.cold+0x1a/0x1f
Jul 26 09:15:27 kernel: ? __warn+0x98/0xe0
Jul 26 09:15:27 kernel: ? tcp_send_loss_probe+0x214/0x220
Jul 26 09:15:27 kernel: ? report_bug+0xd1/0x100
Jul 26 09:15:27 kernel: ? do_error_trap+0x9b/0xc0
Jul 26 09:15:27 kernel: ? do_invalid_op+0x3c/0x50
Jul 26 09:15:27 kernel: ? tcp_send_loss_probe+0x214/0x220
Jul 26 09:15:27 kernel: ? invalid_op+0x1e/0x30
Jul 26 09:15:27 kernel: ? tcp_send_loss_probe+0x214/0x220
Jul 26 09:15:27 kernel: tcp_write_timer_handler+0x1b4/0x240
Jul 26 09:15:27 kernel: tcp_write_timer+0x9e/0xe0
Jul 26 09:15:27 kernel: ? tcp_write_timer_handler+0x240/0x240
Jul 26 09:15:27 kernel: call_timer_fn+0x32/0x130
Jul 26 09:15:27 kernel: __run_timers.part.0+0x180/0x280
Jul 26 09:15:27 kernel: ? timerqueue_add+0x9b/0xb0
Jul 26 09:15:27 kernel: ? enqueue_hrtimer+0x3d/0x90
Jul 26 09:15:27 kernel: ? do_error_trap+0x9b/0xc0
Jul 26 09:15:27 kernel: ? do_invalid_op+0x3c/0x50
Jul 26 09:15:27 kernel: ? tcp_send_loss_probe+0x214/0x220
Jul 26 09:15:27 kernel: ? invalid_op+0x1e/0x30
Jul 26 09:15:27 kernel: ? tcp_send_loss_probe+0x214/0x220
Jul 26 09:15:27 kernel: tcp_write_timer_handler+0x1b4/0x240
Jul 26 09:15:27 kernel: tcp_write_timer+0x9e/0xe0
Jul 26 09:15:27 kernel: ? tcp_write_timer_handler+0x240/0x240
Jul 26 09:15:27 kernel: call_timer_fn+0x32/0x130
Jul 26 09:15:27 kernel: __run_timers.part.0+0x180/0x280
Jul 26 09:15:27 kernel: ? timerqueue_add+0x9b/0xb0
Jul 26 09:15:27 kernel: ? enqueue_hrtimer+0x3d/0x90
Jul 26 09:15:27 kernel: ? recalibrate_cpu_khz+0x10/0x10
Jul 26 09:15:27 kernel: ? ktime_get+0x3e/0xa0
Jul 26 09:15:27 kernel: ? native_x2apic_icr_write+0x30/0x30
Jul 26 09:15:27 kernel: run_timer_softirq+0x2a/0x50
Jul 26 09:15:27 kernel: __do_softirq+0xd1/0x2c1
Jul 26 09:15:27 kernel: irq_exit+0xae/0xb0
Jul 26 09:15:27 kernel: smp_apic_timer_interrupt+0x7b/0x140
Jul 26 09:15:27 kernel: apic_timer_interrupt+0xf/0x20
Jul 26 09:15:27 kernel: </IRQ>
Jul 26 09:15:27 kernel: RIP: 0010:native_safe_halt+0xe/0x10
Jul 26 09:15:27 kernel: Code: 7b ff ff ff eb bd 90 90 90 90 90 90 e9 07 00 00 00 0f 00 2d 36 2c 50 00 f4 c3 66 90 e9 07 00 00 00 0f 00 2d 26 2c 50 00 fb f4 <c3> 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 e8 dd 5e 61 ff 65
Jul 26 09:15:27 kernel: RSP: 0018:ffffb783801cfe70 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
Jul 26 09:15:27 kernel: RAX: ffffffffa6908b20 RBX: 0000000000000010 RCX: 0000000000000001
Jul 26 09:15:27 kernel: RDX: 000000006fc0c97e RSI: 0000000000000082 RDI: 0000000000000082
Jul 26 09:15:27 kernel: RBP: ffffb783801cfe90 R08: 0000000000000000 R09: 0000000000000225
Jul 26 09:15:27 kernel: R10: 0000000000100000 R11: 0000000000000000 R12: 0000000000000010
Jul 26 09:15:27 kernel: R13: ffff9b8e390b0000 R14: 0000000000000000 R15: 0000000000000000
Jul 26 09:15:27 kernel: ? __cpuidle_text_start+0x8/0x8
Jul 26 09:15:27 kernel: ? default_idle+0x20/0x140
Jul 26 09:15:27 kernel: arch_cpu_idle+0x15/0x20
Jul 26 09:15:27 kernel: default_idle_call+0x23/0x30
Jul 26 09:15:27 kernel: do_idle+0x1fb/0x270
Jul 26 09:15:27 kernel: cpu_startup_entry+0x20/0x30
Jul 26 09:15:27 kernel: start_secondary+0x178/0x1d0
Jul 26 09:15:27 kernel: secondary_startup_64+0xa4/0xb0
Jul 26 09:15:27 kernel: ---[ end trace e7ac822987e33be1 ]---
The NULL ptr deref is coming from tcp_rto_delta_us() attempting to pull an skb
off the head of the retransmit queue and then dereferencing that skb to get the
skb_mstamp_ns value via tcp_skb_timestamp_us(skb).
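The register dump is consistent with this: RAX is 0, the faulting bytes
<48> 8b 78 20 decode to a 64-bit load from offset 0x20 of RAX, and CR2 is
0x20. As a rough user-space sketch (not kernel code), assuming struct sk_buff
on this kernel begins with a three-word rb_node union followed by the sk
pointer and then the timestamp union, the offset works out as shown here. The
names model_skb_head and model_fault_address are hypothetical and exist only
for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the first fields of struct sk_buff on a 64-bit
 * kernel. The layout is an assumption made for illustration only.
 */
struct model_skb_head {
	uint64_t rbnode[3];      /* stands in for the rb_node/list union (3 words) */
	uint64_t sk;             /* stands in for struct sock *sk */
	uint64_t skb_mstamp_ns;  /* field read by tcp_skb_timestamp_us() */
};

/* Under the assumed layout the timestamp sits at offset 0x20. */
static_assert(offsetof(struct model_skb_head, skb_mstamp_ns) == 0x20,
	      "timestamp expected at offset 0x20 in this model");

/* Address touched when the timestamp is read through a given skb pointer. */
static uintptr_t model_fault_address(uintptr_t skb_ptr)
{
	return skb_ptr + offsetof(struct model_skb_head, skb_mstamp_ns);
}
```

With a NULL skb, model_fault_address(0) gives 0x20, which matches the
"address: 0000000000000020" reported in both oopses.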
The crash is the same one that was reported a number of years ago here:
https://lore.kernel.org/netdev/86c0f836-9a7c-438b-d81a-839be45f1f58@gmail.com/T/#t
and the kernel we're running has the fix which was added to resolve this issue.
Unfortunately we've been unsuccessful so far in reproducing this problem in the
lab and do not have the luxury of pushing out a new kernel to try and test if
newer kernels resolve this issue at the moment. I realize this is a report
against both an Ubuntu kernel and also an older 5.4 kernel. I have reported this
issue to Ubuntu here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2077657
however I feel like since this issue has possibly cropped up again it makes
sense to build in some protection in this path (even on the latest kernel
versions) since the code in question just blindly assumes there's a valid skb
without testing if it's NULL before it looks at the timestamp. The patch I've sent
attempts to catch this problem and also dump some information when we hit the
case to help us debug the issue.
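To illustrate the intended behaviour of the guard, here is a small user-space
model. It is a sketch, not the kernel implementation. The model_* types and
helpers are hypothetical stand-ins, and MODEL_HZ of 250 is an assumed tick
rate, not a value taken from the affected machines. With an empty retransmit
queue the function falls back to returning the RTO in microseconds instead of
dereferencing a NULL skb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_HZ      250u      /* assumed tick rate, for illustration only */
#define USEC_PER_SEC  1000000u

static_assert(USEC_PER_SEC % MODEL_HZ == 0,
	      "this simplified conversion needs HZ to divide 1e6");

/* Hypothetical stand-in for the one sk_buff field used here. */
struct model_skb {
	uint64_t skb_mstamp_ns;          /* send timestamp, nanoseconds */
};

/* Hypothetical stand-in for the socket state used here. */
struct model_sock {
	const struct model_skb *rtx_head; /* NULL when the rtx queue is empty */
	uint32_t icsk_rto;                /* RTO in jiffies */
	uint64_t tcp_mstamp;              /* current time, microseconds */
	uint32_t packets_out;             /* may disagree with the queue state */
};

static uint32_t model_jiffies_to_usecs(uint32_t j)
{
	return j * (USEC_PER_SEC / MODEL_HZ);
}

/* Time left until the RTO fires, in usec; negative if already expired. */
static int64_t model_rto_delta_us(const struct model_sock *sk)
{
	uint32_t rto = model_jiffies_to_usecs(sk->icsk_rto);

	if (sk->rtx_head) {
		uint64_t rto_time_stamp_us =
			sk->rtx_head->skb_mstamp_ns / 1000 + rto;

		return (int64_t)(rto_time_stamp_us - sk->tcp_mstamp);
	}

	/* Empty queue: the patch warns once here and returns the plain RTO. */
	return rto;
}
```

For example, with icsk_rto of 50 jiffies and no skb queued,
model_rto_delta_us() returns 200000 under these assumptions, even if
packets_out is non-zero.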
I've not seen anyone else report this problem and so wanted to solicit feedback
from the list to see if there are any thoughts as to what could have possibly
been introduced (maybe through a stable update?) that could be causing this to
resurface.
Josh
Josh Hunt (1):
tcp: check skb is non-NULL in tcp_rto_delta_us()
include/net/tcp.h | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
--
2.34.1
* [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 3:34 [PATCH net 0/1] NULL ptr dereference in tcp_rearm_rto Josh Hunt
@ 2024-08-23 3:34 ` Josh Hunt
2024-08-23 6:55 ` Eric Dumazet
0 siblings, 1 reply; 12+ messages in thread
From: Josh Hunt @ 2024-08-23 3:34 UTC (permalink / raw)
To: edumazet, davem, kuba, pabeni, netdev; +Cc: linux-kernel, johunt
There have been multiple occasions where we have crashed in this path
because packets_out suggested there were packets on the write or retransmit
queues, but in fact there weren't, leading to a NULL skb being dereferenced.
While we should fix that root cause, we should also just make sure the skb
is not NULL before dereferencing it. Also add a warn once here to capture
some information if/when the problem case is hit again.
Signed-off-by: Josh Hunt <johunt@akamai.com>
---
include/net/tcp.h | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 2aac11e7e1cc..19ea6ed87880 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -2433,10 +2433,19 @@ void tcp_plb_update_state_upon_rto(struct sock *sk, struct tcp_plb_state *plb);
static inline s64 tcp_rto_delta_us(const struct sock *sk)
{
const struct sk_buff *skb = tcp_rtx_queue_head(sk);
- u32 rto = inet_csk(sk)->icsk_rto;
- u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + jiffies_to_usecs(rto);
+ u32 rto = jiffies_to_usecs(inet_csk(sk)->icsk_rto);
+
+ if (likely(skb)) {
+ u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + rto;
+
+ return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
+ } else {
+ WARN_ONCE(1,
+ "rtx queue empty: inflight %u tlp_high_seq %u state %u\n",
+ tcp_sk(sk)->packets_out, tcp_sk(sk)->tlp_high_seq, sk->sk_state);
+ return rto;
+ }
- return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
}
/*
--
2.34.1
* Re: [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 3:33 ` Josh Hunt
@ 2024-08-23 4:21 ` Josh Hunt
0 siblings, 0 replies; 12+ messages in thread
From: Josh Hunt @ 2024-08-23 4:21 UTC (permalink / raw)
To: Jason Xing; +Cc: edumazet, davem, kuba, pabeni, netdev, linux-kernel
On 8/22/24 8:33 PM, Josh Hunt wrote:
> On 8/22/24 8:27 PM, Jason Xing wrote:
>>
>> Hello Josh,
>>
>> On Fri, Aug 23, 2024 at 11:02 AM Josh Hunt <johunt@akamai.com> wrote:
>>>
>>> There have been multiple occasions where we have crashed in this path
>>> because packets_out suggested there were packets on the write or
>>> retransmit
>>> queues, but in fact there weren't, leading to a NULL skb being
>>> dereferenced.
>>
>> Could you show us the detailed splats and more information about it so
>> that we can know what exactly happened?
>
> Hey Jason
>
> Yeah for some reason my cover letter did not come through which has the
> oops info that we hit. I'll resend it now. Fingers crossed it goes
> through this time :)
>
> Josh
Seems like our mail server is blocking the cover letter for some reason
right now. I'll have to figure out why tomorrow. I filed a bug with
Ubuntu as well as sending this patch upstream because the kernel we're
running is a stock Ubuntu kernel. The bug report there has most of what
I put in the cover letter:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2077657
Josh
* Re: [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 2:13 ` [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us() Josh Hunt
2024-08-23 3:27 ` Jason Xing
@ 2024-08-23 6:42 ` Eric Dumazet
2024-08-23 7:36 ` Jason Xing
1 sibling, 1 reply; 12+ messages in thread
From: Eric Dumazet @ 2024-08-23 6:42 UTC (permalink / raw)
To: Josh Hunt, Neal Cardwell; +Cc: davem, kuba, pabeni, netdev, linux-kernel
On Fri, Aug 23, 2024 at 4:14 AM Josh Hunt <johunt@akamai.com> wrote:
>
> There have been multiple occasions where we have crashed in this path
> because packets_out suggested there were packets on the write or retransmit
> queues, but in fact there weren't, leading to a NULL skb being dereferenced.
> While we should fix that root cause, we should also just make sure the skb
> is not NULL before dereferencing it. Also add a warn once here to capture
> some information if/when the problem case is hit again.
>
> Signed-off-by: Josh Hunt <johunt@akamai.com>
> ---
> include/net/tcp.h | 15 ++++++++++++---
> 1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/include/net/tcp.h b/include/net/tcp.h
> index 2aac11e7e1cc..19ea6ed87880 100644
> --- a/include/net/tcp.h
> +++ b/include/net/tcp.h
> @@ -2433,10 +2433,19 @@ void tcp_plb_update_state_upon_rto(struct sock *sk, struct tcp_plb_state *plb);
> static inline s64 tcp_rto_delta_us(const struct sock *sk)
> {
> const struct sk_buff *skb = tcp_rtx_queue_head(sk);
> - u32 rto = inet_csk(sk)->icsk_rto;
> - u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + jiffies_to_usecs(rto);
> + u32 rto = jiffies_to_usecs(inet_csk(sk)->icsk_rto);
> +
> + if (likely(skb)) {
> + u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + rto;
> +
> + return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
> + } else {
> + WARN_ONCE(1,
> + "rtx queue empty: inflight %u tlp_high_seq %u state %u\n",
> + tcp_sk(sk)->packets_out, tcp_sk(sk)->tlp_high_seq, sk->sk_state);
> + return rto;
> + }
>
> - return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
> }
>
> /*
> --
> 2.34.1
>
Are you using a recent Linux kernel version?
I am asking because sometimes patches are submitted while the authors
are using very old kernels,
and they do not state this clearly.
I have never seen such a state.
Please CC Neal on your next submission.
^ permalink raw reply [flat|nested] 12+ messages in thread
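The guard in the quoted patch can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: struct fake_skb, struct fake_sock, rto_delta_us() and warned_once are hypothetical stand-ins for struct sk_buff, struct sock, tcp_rto_delta_us() and WARN_ONCE(), and the RTO is assumed to be already converted to microseconds.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the head skb of the retransmit queue. */
struct fake_skb {
    uint64_t tx_timestamp_us;        /* when the head segment was last sent */
};

/* Hypothetical stand-in for the fields of struct sock used by the patch. */
struct fake_sock {
    const struct fake_skb *rtx_head; /* NULL when the retransmit queue is empty */
    uint32_t rto_us;                 /* RTO, assumed already in microseconds */
    uint64_t now_us;                 /* stand-in for tcp_mstamp */
    uint32_t packets_out;            /* may disagree with the queue state */
};

/* Imitates the "warn only once" behaviour of WARN_ONCE(). */
static int warned_once;

/*
 * Time left until the RTO fires, in microseconds. May be negative when the
 * deadline has already passed. If the queue is unexpectedly empty, report
 * it once and fall back to the full RTO instead of dereferencing NULL.
 */
int64_t rto_delta_us(const struct fake_sock *sk)
{
    if (sk->rtx_head) {
        uint64_t deadline_us = sk->rtx_head->tx_timestamp_us + sk->rto_us;

        return (int64_t)(deadline_us - sk->now_us);
    }

    if (!warned_once) {
        warned_once = 1;
        fprintf(stderr, "rtx queue empty: inflight %u\n", sk->packets_out);
    }
    return sk->rto_us;
}
```

With a head segment sent at 1000 us, an RTO of 200000 us and a current time of 51000 us, rto_delta_us() returns 150000. With an empty queue it returns the full 200000 and prints the warning only on the first such call.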
* Re: [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 3:34 ` [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us() Josh Hunt
@ 2024-08-23 6:55 ` Eric Dumazet
2024-08-23 19:43 ` Josh Hunt
2024-09-06 23:16 ` Josh Hunt
0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2024-08-23 6:55 UTC (permalink / raw)
To: Josh Hunt, Neal Cardwell; +Cc: davem, kuba, pabeni, netdev, linux-kernel
On Fri, Aug 23, 2024 at 5:34 AM Josh Hunt <johunt@akamai.com> wrote:
>
> There have been multiple occasions where we have crashed in this path
> because packets_out suggested there were packets on the write or retransmit
> queues, but in fact there weren't, leading to a NULL skb being dereferenced.
> While we should fix that root cause we should also just make sure the skb
> is not NULL before dereferencing it. Also add a warn once here to capture
> some information if/when the problem case is hit again.
>
> Signed-off-by: Josh Hunt <johunt@akamai.com>
Hi Josh
We do not want a patch series of one patch, with the stack trace in
the cover letter.
Please send a standalone patch, with all the information in its changelog.
1) Add Neal Cardwell in the CC list.
2) Are you using TCP_REPAIR by any chance ?
3) Please double check your kernel has these fixes.
commit 1f85e6267caca44b30c54711652b0726fadbb131
("tcp: do not send empty skb from tcp_write_xmit()")
commit 0c175da7b0378445f5ef53904247cfbfb87e0b78
("tcp: prohibit TCP_REPAIR_OPTIONS if data was already sent")
Thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 6:42 ` Eric Dumazet
@ 2024-08-23 7:36 ` Jason Xing
0 siblings, 0 replies; 12+ messages in thread
From: Jason Xing @ 2024-08-23 7:36 UTC (permalink / raw)
To: Eric Dumazet
Cc: Josh Hunt, Neal Cardwell, davem, kuba, pabeni, netdev,
linux-kernel
On Fri, Aug 23, 2024 at 2:44 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Fri, Aug 23, 2024 at 4:14 AM Josh Hunt <johunt@akamai.com> wrote:
> >
> > There have been multiple occasions where we have crashed in this path
> > because packets_out suggested there were packets on the write or retransmit
> > queues, but in fact there weren't, leading to a NULL skb being dereferenced.
> > While we should fix that root cause we should also just make sure the skb
> > is not NULL before dereferencing it. Also add a warn once here to capture
> > some information if/when the problem case is hit again.
> >
> > Signed-off-by: Josh Hunt <johunt@akamai.com>
> > ---
> > include/net/tcp.h | 15 ++++++++++++---
> > 1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/net/tcp.h b/include/net/tcp.h
> > index 2aac11e7e1cc..19ea6ed87880 100644
> > --- a/include/net/tcp.h
> > +++ b/include/net/tcp.h
> > @@ -2433,10 +2433,19 @@ void tcp_plb_update_state_upon_rto(struct sock *sk, struct tcp_plb_state *plb);
> > static inline s64 tcp_rto_delta_us(const struct sock *sk)
> > {
> > const struct sk_buff *skb = tcp_rtx_queue_head(sk);
> > - u32 rto = inet_csk(sk)->icsk_rto;
> > - u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + jiffies_to_usecs(rto);
> > + u32 rto = jiffies_to_usecs(inet_csk(sk)->icsk_rto);
> > +
> > + if (likely(skb)) {
> > + u64 rto_time_stamp_us = tcp_skb_timestamp_us(skb) + rto;
> > +
> > + return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
> > + } else {
> > + WARN_ONCE(1,
> > + "rtx queue empty: inflight %u tlp_high_seq %u state %u\n",
> > + tcp_sk(sk)->packets_out, tcp_sk(sk)->tlp_high_seq, sk->sk_state);
> > + return rto;
> > + }
> >
> > - return rto_time_stamp_us - tcp_sk(sk)->tcp_mstamp;
> > }
> >
> > /*
> > --
> > 2.34.1
> >
>
> Are you using a recent linux kernel version ?
I'm afraid he isn't. The bug report Josh linked shows the
5.4.0-174-generic Ubuntu kernel.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 6:55 ` Eric Dumazet
@ 2024-08-23 19:43 ` Josh Hunt
2024-08-26 20:32 ` Jakub Kicinski
2024-09-06 23:16 ` Josh Hunt
1 sibling, 1 reply; 12+ messages in thread
From: Josh Hunt @ 2024-08-23 19:43 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell; +Cc: davem, kuba, pabeni, netdev, linux-kernel
On 8/22/24 11:55 PM, Eric Dumazet wrote:
>
> On Fri, Aug 23, 2024 at 5:34 AM Josh Hunt <johunt@akamai.com> wrote:
>>
>> There have been multiple occasions where we have crashed in this path
>> because packets_out suggested there were packets on the write or retransmit
>> queues, but in fact there weren't, leading to a NULL skb being dereferenced.
>> While we should fix that root cause we should also just make sure the skb
>> is not NULL before dereferencing it. Also add a warn once here to capture
>> some information if/when the problem case is hit again.
>>
>> Signed-off-by: Josh Hunt <johunt@akamai.com>
>
> Hi Josh
>
> We do not want a patch series of one patch, with the stack trace in
> the cover letter.
> Please send a standalone patch, with all the information in its changelog.
>
> 1) Add Neal Cardwell in the CC list.
>
> 2) Are you using TCP_REPAIR by any chance ?
>
> 3) Please double check your kernel has these fixes.
>
> commit 1f85e6267caca44b30c54711652b0726fadbb131 tcp: do not send
> empty skb from tcp_write_xmit()
> commit 0c175da7b0378445f5ef53904247cfbfb87e0b78 tcp: prohibit
> TCP_REPAIR_OPTIONS if data was already sent
>
Thanks Eric. I will resend and also check the commits you mentioned. I
didn't include the writeup in the patch submission b/c it was rather
long and detailed, but will include it in a v2.
Josh
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 19:43 ` Josh Hunt
@ 2024-08-26 20:32 ` Jakub Kicinski
0 siblings, 0 replies; 12+ messages in thread
From: Jakub Kicinski @ 2024-08-26 20:32 UTC (permalink / raw)
To: Josh Hunt
Cc: Eric Dumazet, Neal Cardwell, davem, pabeni, netdev, linux-kernel
On Fri, 23 Aug 2024 12:43:36 -0700 Josh Hunt wrote:
> Thanks Eric. I will resend and also check the commits you mentioned. I
> didn't include the writeup in the patch submission b/c it was rather
> long and detailed, but will include it in a v2.
FWIW in linux networking we use the cover letter as the merge message,
no matter how many patches there are. So both end up in git logs.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us()
2024-08-23 6:55 ` Eric Dumazet
2024-08-23 19:43 ` Josh Hunt
@ 2024-09-06 23:16 ` Josh Hunt
1 sibling, 0 replies; 12+ messages in thread
From: Josh Hunt @ 2024-09-06 23:16 UTC (permalink / raw)
To: Eric Dumazet, Neal Cardwell; +Cc: davem, kuba, pabeni, netdev, linux-kernel
On 8/22/24 11:55 PM, Eric Dumazet wrote:
> On Fri, Aug 23, 2024 at 5:34 AM Josh Hunt <johunt@akamai.com> wrote:
>>
>> There have been multiple occasions where we have crashed in this path
>> because packets_out suggested there were packets on the write or retransmit
>> queues, but in fact there weren't, leading to a NULL skb being dereferenced.
>> While we should fix that root cause we should also just make sure the skb
>> is not NULL before dereferencing it. Also add a warn once here to capture
>> some information if/when the problem case is hit again.
>>
>> Signed-off-by: Josh Hunt <johunt@akamai.com>
>
> Hi Josh
>
> We do not want a patch series of one patch, with the stack trace in
> the cover letter.
> Please send a standalone patch, with all the information in its changelog.
>
> 1) Add Neal Cardwell in the CC list.
Sending v2 now with Neal included.
>
> 2) Are you using TCP_REPAIR by any chance ?
>
No, we're not using TCP_REPAIR on these machines.
> 3) Please double check your kernel has these fixes.
>
> commit 1f85e6267caca44b30c54711652b0726fadbb131 tcp: do not send
> empty skb from tcp_write_xmit()
> commit 0c175da7b0378445f5ef53904247cfbfb87e0b78 tcp: prohibit
> TCP_REPAIR_OPTIONS if data was already sent
>
We have the first commit, but not the second.
Josh
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-09-06 23:38 UTC | newest]
Thread overview: 12+ messages
2024-08-23 3:34 [PATCH net 0/1] NULL ptr dereference in tcp_rearm_rto Josh Hunt
2024-08-23 3:34 ` [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us() Josh Hunt
2024-08-23 6:55 ` Eric Dumazet
2024-08-23 19:43 ` Josh Hunt
2024-08-26 20:32 ` Jakub Kicinski
2024-09-06 23:16 ` Josh Hunt
-- strict thread matches above, loose matches on Subject: below --
2024-08-23 2:13 [PATCH net 0/1] NULL ptr dereference in tcp_rearm_rto Josh Hunt
2024-08-23 2:13 ` [PATCH net 1/1] tcp: check skb is non-NULL in tcp_rto_delta_us() Josh Hunt
2024-08-23 3:27 ` Jason Xing
2024-08-23 3:33 ` Josh Hunt
2024-08-23 4:21 ` Josh Hunt
2024-08-23 6:42 ` Eric Dumazet
2024-08-23 7:36 ` Jason Xing