netdev.vger.kernel.org archive mirror
* [PATCH iwl-net 0/3] ice: fix synchronization between .ndo_bpf() and reset
@ 2024-06-10 15:37 Larysa Zaremba
From: Larysa Zaremba @ 2024-06-10 15:37 UTC
  To: intel-wired-lan
  Cc: Larysa Zaremba, Jesse Brandeburg, Tony Nguyen, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alexei Starovoitov,
	Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend,
	Maciej Fijalkowski, netdev, linux-kernel, bpf, magnus.karlsson,
	Michal Kubiak

Fix the problems triggered by concurrent tx_timeout handling and ice_xdp()
calls, covering both pool and program operations.

A PF reset can be triggered asynchronously, e.g. by tx_timeout. With
unfortunate timing, both the reset and .ndo_bpf() will try to access and
modify the XDP rings at the same time, causing a system crash such as the
one below:

[ +1.999878] ice 0000:b1:00.0: Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring 14
[ +2.002992] ice 0000:b1:00.0: Registered XDP mem model MEM_TYPE_XSK_BUFF_POOL on Rx ring 18
[Mar15 18:17] ice 0000:b1:00.0 ens801f0np0: NETDEV WATCHDOG: CPU: 38: transmit queue 14 timed out 80692736 ms
[ +0.000093] ice 0000:b1:00.0 ens801f0np0: tx_timeout: VSI_num: 6, Q 14, NTC: 0x0, HW_HEAD: 0x0, NTU: 0x0, INT: 0x4000001
[ +0.000012] ice 0000:b1:00.0 ens801f0np0: tx_timeout recovery level 1, txqueue 14
[ +0.394718] ice 0000:b1:00.0: PTP reset successful
[ +0.006184] BUG: kernel NULL pointer dereference, address: 0000000000000098
[ +0.000045] #PF: supervisor read access in kernel mode
[ +0.000023] #PF: error_code(0x0000) - not-present page
[ +0.000023] PGD 0 P4D 0
[ +0.000018] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ +0.000023] CPU: 38 PID: 7540 Comm: kworker/38:1 Not tainted 6.8.0-rc7 #1
[ +0.000031] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0014.082620210524 08/26/2021
[ +0.000036] Workqueue: ice ice_service_task [ice]
[ +0.000183] RIP: 0010:ice_clean_tx_ring+0xa/0xd0 [ice]
[...]
[ +0.000013] Call Trace:
[ +0.000016] <TASK>
[ +0.000014] ? __die+0x1f/0x70
[ +0.000029] ? page_fault_oops+0x171/0x4f0
[ +0.000029] ? schedule+0x3b/0xd0
[ +0.000027] ? exc_page_fault+0x7b/0x180
[ +0.000022] ? asm_exc_page_fault+0x22/0x30
[ +0.000031] ? ice_clean_tx_ring+0xa/0xd0 [ice]
[ +0.000194] ice_free_tx_ring+0xe/0x60 [ice]
[ +0.000186] ice_destroy_xdp_rings+0x157/0x310 [ice]
[ +0.000151] ice_vsi_decfg+0x53/0xe0 [ice]
[ +0.000180] ice_vsi_rebuild+0x239/0x540 [ice]
[ +0.000186] ice_vsi_rebuild_by_type+0x76/0x180 [ice]
[ +0.000145] ice_rebuild+0x18c/0x840 [ice]
[ +0.000145] ? delay_tsc+0x4a/0xc0
[ +0.000022] ? delay_tsc+0x92/0xc0
[ +0.000020] ice_do_reset+0x140/0x180 [ice]
[ +0.000886] ice_service_task+0x404/0x1030 [ice]
[ +0.000824] process_one_work+0x171/0x340
[ +0.000685] worker_thread+0x277/0x3a0
[ +0.000675] ? preempt_count_add+0x6a/0xa0
[ +0.000677] ? _raw_spin_lock_irqsave+0x23/0x50
[ +0.000679] ? __pfx_worker_thread+0x10/0x10
[ +0.000653] kthread+0xf0/0x120
[ +0.000635] ? __pfx_kthread+0x10/0x10
[ +0.000616] ret_from_fork+0x2d/0x50
[ +0.000612] ? __pfx_kthread+0x10/0x10
[ +0.000604] ret_from_fork_asm+0x1b/0x30
[ +0.000604] </TASK>
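
For illustration only, the race described above can be modeled in userspace
as two threads that free and reallocate the same ring pointer. This is a
minimal sketch and not the code from this series: struct demo_vsi, cfg_lock,
demo_rebuild(), demo_xdp_toggle() and demo_run() are hypothetical names, and
a pthread mutex merely stands in for whatever synchronization the driver
uses. Without the lock, the two paths could free the ring twice or touch it
after it was freed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VSI that owns an XDP ring. */
struct demo_vsi {
	pthread_mutex_t cfg_lock; /* hypothetical: serializes rebuild and XDP setup */
	int *xdp_ring;            /* NULL while no XDP ring is allocated */
	int rebuilds;
	int xdp_setups;
};

struct demo_args {
	struct demo_vsi *vsi;
	int iterations;
};

/* Models the reset path: destroy and recreate the ring if one exists. */
static void demo_rebuild(struct demo_vsi *vsi)
{
	pthread_mutex_lock(&vsi->cfg_lock);
	if (vsi->xdp_ring) {
		free(vsi->xdp_ring);
		vsi->xdp_ring = calloc(1, sizeof(*vsi->xdp_ring));
	}
	vsi->rebuilds++;
	pthread_mutex_unlock(&vsi->cfg_lock);
}

/* Models the .ndo_bpf path: attach (allocate) or detach (free) the ring. */
static void demo_xdp_toggle(struct demo_vsi *vsi)
{
	pthread_mutex_lock(&vsi->cfg_lock);
	if (vsi->xdp_ring) {
		free(vsi->xdp_ring);
		vsi->xdp_ring = NULL;
	} else {
		vsi->xdp_ring = calloc(1, sizeof(*vsi->xdp_ring));
	}
	vsi->xdp_setups++;
	pthread_mutex_unlock(&vsi->cfg_lock);
}

static void *demo_reset_thread(void *p)
{
	struct demo_args *a = p;

	for (int i = 0; i < a->iterations; i++)
		demo_rebuild(a->vsi);
	return NULL;
}

static void *demo_xdp_thread(void *p)
{
	struct demo_args *a = p;

	for (int i = 0; i < a->iterations; i++)
		demo_xdp_toggle(a->vsi);
	return NULL;
}

/* Runs both paths concurrently; returns 0 if every operation completed. */
int demo_run(int iterations)
{
	struct demo_vsi vsi = { .xdp_ring = NULL };
	struct demo_args args = { &vsi, iterations };
	pthread_t reset, xdp;
	int ok;

	assert(iterations > 0);
	if (pthread_mutex_init(&vsi.cfg_lock, NULL))
		return -1;
	if (pthread_create(&reset, NULL, demo_reset_thread, &args)) {
		pthread_mutex_destroy(&vsi.cfg_lock);
		return -1;
	}
	if (pthread_create(&xdp, NULL, demo_xdp_thread, &args)) {
		pthread_join(reset, NULL);
		pthread_mutex_destroy(&vsi.cfg_lock);
		return -1;
	}
	pthread_join(reset, NULL);
	pthread_join(xdp, NULL);

	ok = vsi.rebuilds == iterations && vsi.xdp_setups == iterations;
	free(vsi.xdp_ring);
	pthread_mutex_destroy(&vsi.cfg_lock);
	return ok ? 0 : -1;
}
```

Calling demo_run(10000) from a test program (built with -pthread) exercises
both paths against each other. Dropping the two lock/unlock pairs reproduces
the kind of unsynchronized ring access that the trace above shows.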

Larysa Zaremba (3):
  ice: synchronize XDP setup with reset
  ice: fix locking in ice_xsk_pool_setup()
  ice: make NAPI setting code aware that rtnl-locked request is waiting

 drivers/net/ethernet/intel/ice/ice.h      |  2 ++
 drivers/net/ethernet/intel/ice/ice_lib.c  | 23 ++++++++++---
 drivers/net/ethernet/intel/ice/ice_main.c | 39 ++++++++++++++++++++---
 drivers/net/ethernet/intel/ice/ice_xsk.c  | 12 ++-----
 4 files changed, 57 insertions(+), 19 deletions(-)

-- 
2.43.0



Thread overview: 12+ messages
2024-06-10 15:37 [PATCH iwl-net 0/3] ice: fix synchronization between .ndo_bpf() and reset Larysa Zaremba
2024-06-10 15:37 ` [PATCH iwl-net 1/3] ice: synchronize XDP setup with reset Larysa Zaremba
2024-06-10 15:37 ` [PATCH iwl-net 2/3] ice: fix locking in ice_xsk_pool_setup() Larysa Zaremba
2024-06-10 15:37 ` [PATCH iwl-net 3/3] ice: make NAPI setting code aware that rtnl-locked request is waiting Larysa Zaremba
2024-06-12  2:38 ` [PATCH iwl-net 0/3] ice: fix synchronization between .ndo_bpf() and reset Jakub Kicinski
2024-06-12  6:56   ` Larysa Zaremba
2024-06-12 21:09     ` Jakub Kicinski
2024-06-13  8:54       ` Larysa Zaremba
2024-06-13 10:44         ` [Intel-wired-lan] " Przemek Kitszel
2024-06-13 14:13         ` Jakub Kicinski
2024-06-13 15:36           ` Larysa Zaremba
2024-06-13 15:40             ` Jakub Kicinski
