From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, stable@kernel.org,
LongJun Tang <tanglongjun@kylinos.cn>, Yun Lu <luyun@kylinos.cn>,
Willem de Bruijn <willemb@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 6.1 19/79] af_packet: fix soft lockup issue caused by tpacket_snd()
Date: Tue, 22 Jul 2025 15:44:15 +0200 [thread overview]
Message-ID: <20250722134329.087052773@linuxfoundation.org> (raw)
In-Reply-To: <20250722134328.384139905@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yun Lu <luyun@kylinos.cn>
commit 55f0bfc0370539213202f4ce1a07615327ac4713 upstream.
When MSG_DONTWAIT is not set, the tpacket_snd operation will wait for
pending_refcnt to decrement to zero before returning. The pending_refcnt
is decremented by 1 when the skb->destructor function is called,
indicating that the skb has been successfully sent and needs to be
destroyed.
If an error occurs during this process, the tpacket_snd() function will
exit and return error, but pending_refcnt may not yet have decremented to
zero. Assuming the next send operation is executed immediately, but there
are no available frames to be sent in tx_ring (i.e., packet_current_frame
returns NULL), and skb is also NULL, the function will not execute
wait_for_completion_interruptible_timeout() to yield the CPU. Instead, it
will enter a do-while loop, waiting for pending_refcnt to be zero. Even
if the previous skb has completed transmission, the skb->destructor
function can only be invoked in the ksoftirqd thread (assuming NAPI
threading is enabled). When both the ksoftirqd thread and the tpacket_snd
operation happen to run on the same CPU, and the CPU trapped in the
do-while loop without yielding, the ksoftirqd thread will not get
scheduled to run. As a result, pending_refcnt will never be reduced to
zero, and the do-while loop cannot exit, eventually leading to a CPU soft
lockup issue.
In fact, skb is true for all but the first iterations of that loop, and
as long as pending_refcnt is not zero, even if incremented by a previous
call, wait_for_completion_interruptible_timeout() should be executed to
yield the CPU, allowing the ksoftirqd thread to be scheduled. Therefore,
the execution condition of this function should be modified to check if
pending_refcnt is not zero, instead of check skb.
- if (need_wait && skb) {
+ if (need_wait && packet_read_pending(&po->tx_ring)) {
As a result, the judgment conditions are duplicated with the end code of
the while loop, and packet_read_pending() is a very expensive function.
Actually, this loop can only exit when ph is NULL, so the loop condition
can be changed to while (1), and in the "ph = NULL" branch, if the
subsequent condition of if is not met, the loop can break directly. Now,
the loop logic remains the same as origin but is clearer and more obvious.
Fixes: 89ed5b519004 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
Cc: stable@kernel.org
Suggested-by: LongJun Tang <tanglongjun@kylinos.cn>
Signed-off-by: Yun Lu <luyun@kylinos.cn>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/packet/af_packet.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2841,15 +2841,21 @@ static int tpacket_snd(struct packet_soc
ph = packet_current_frame(po, &po->tx_ring,
TP_STATUS_SEND_REQUEST);
if (unlikely(ph == NULL)) {
- if (need_wait && skb) {
+ /* Note: packet_read_pending() might be slow if we
+ * have to call it as it's per_cpu variable, but in
+ * fast-path we don't have to call it, only when ph
+ * is NULL, we need to check the pending_refcnt.
+ */
+ if (need_wait && packet_read_pending(&po->tx_ring)) {
timeo = wait_for_completion_interruptible_timeout(&po->skb_completion, timeo);
if (timeo <= 0) {
err = !timeo ? -ETIMEDOUT : -ERESTARTSYS;
goto out_put;
}
- }
- /* check for additional frames */
- continue;
+ /* check for additional frames */
+ continue;
+ } else
+ break;
}
skb = NULL;
@@ -2939,14 +2945,7 @@ tpacket_error:
}
packet_increment_head(&po->tx_ring);
len_sum += tp_len;
- } while (likely((ph != NULL) ||
- /* Note: packet_read_pending() might be slow if we have
- * to call it as it's per_cpu variable, but in fast-path
- * we already short-circuit the loop with the first
- * condition, and luckily don't have to go that path
- * anyway.
- */
- (need_wait && packet_read_pending(&po->tx_ring))));
+ } while (1);
err = len_sum;
goto out_put;
next prev parent reply other threads:[~2025-07-22 13:46 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-22 13:43 [PATCH 6.1 00/79] 6.1.147-rc1 review Greg Kroah-Hartman
2025-07-22 13:43 ` [PATCH 6.1 01/79] phy: tegra: xusb: Fix unbalanced regulator disable in UTMI PHY mode Greg Kroah-Hartman
2025-07-22 13:43 ` [PATCH 6.1 02/79] USB: serial: option: add Telit Cinterion FE910C04 (ECM) composition Greg Kroah-Hartman
2025-07-22 13:43 ` [PATCH 6.1 03/79] USB: serial: option: add Foxconn T99W640 Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 04/79] USB: serial: ftdi_sio: add support for NDI EMGUIDE GEMINI Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 05/79] usb: gadget: configfs: Fix OOB read on empty string write Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 06/79] i2c: stm32: fix the device used for the DMA map Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 07/79] thunderbolt: Fix bit masking in tb_dp_port_set_hops() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 08/79] Input: xpad - set correct controller type for Acer NGR200 Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 09/79] pch_uart: Fix dma_sync_sg_for_device() nents value Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 10/79] HID: core: ensure the allocated report buffer can contain the reserved report ID Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 11/79] HID: core: ensure __hid_request reserves the report ID as the first byte Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 12/79] HID: core: do not bypass hid_hw_raw_request Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 13/79] tracing: Add down_write(trace_event_sem) when adding trace event Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 14/79] io_uring/poll: fix POLLERR handling Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 15/79] phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 16/79] net/mlx5: Update the list of the PCI supported devices Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 17/79] arm64: dts: freescale: imx8mm-verdin: Keep LDO5 always on Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 18/79] af_packet: fix the SO_SNDTIMEO constraint not effective on tpacked_snd() Greg Kroah-Hartman
2025-07-22 13:44 ` Greg Kroah-Hartman [this message]
2025-07-22 13:44 ` [PATCH 6.1 20/79] dmaengine: nbpfaxi: Fix memory corruption in probe() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 21/79] isofs: Verify inode mode when loading from disk Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 22/79] memstick: core: Zero initialize id_reg in h_memstick_read_dev_id() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 23/79] mmc: bcm2835: Fix dma_unmap_sg() nents value Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 24/79] mmc: sdhci-pci: Quirk for broken command queuing on Intel GLK-based Positivo models Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 25/79] mmc: sdhci_am654: Workaround for Errata i2312 Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 26/79] pmdomain: governor: Consider CPU latency tolerance from pm_domain_cpu_gov Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 27/79] smb: client: fix use-after-free in crypt_message when using async crypto Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 28/79] soc: aspeed: lpc-snoop: Cleanup resources in stack-order Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 29/79] soc: aspeed: lpc-snoop: Dont disable channels that arent enabled Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 30/79] iio: accel: fxls8962af: Fix use after free in fxls8962af_fifo_flush Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 31/79] iio: adc: max1363: Fix MAX1363_4X_CHANS/MAX1363_8X_CHANS[] Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 32/79] iio: adc: max1363: Reorder mode_list[] entries Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 33/79] iio: adc: stm32-adc: Fix race in installing chained IRQ handler Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 34/79] comedi: pcl812: Fix bit shift out of bounds Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 35/79] comedi: aio_iiro_16: " Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 36/79] comedi: das16m1: " Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 37/79] comedi: das6402: " Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 38/79] comedi: Fail COMEDI_INSNLIST ioctl if n_insns is too large Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 39/79] comedi: Fix some signed shift left operations Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 40/79] comedi: Fix use of uninitialized data in insn_rw_emulate_bits() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 41/79] comedi: Fix initialization of data for instructions that write to subdevice Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 42/79] bpf: Reject %p% format string in bprintf-like helpers Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 43/79] cachefiles: Fix the incorrect return value in __cachefiles_write() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 44/79] net: emaclite: Fix missing pointer increment in aligned_read() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 45/79] net/sched: sch_qfq: Fix race condition on qfq_aggregate Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 46/79] rpl: Fix use-after-free in rpl_do_srh_inline() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 47/79] smb: client: fix use-after-free in cifs_oplock_break Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 48/79] nvme: fix misaccounting of nvme-mpath inflight I/O Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 49/79] selftests: net: increase inter-packet timeout in udpgro.sh Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 50/79] hwmon: (corsair-cpro) Validate the size of the received input buffer Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 51/79] usb: net: sierra: check for no status endpoint Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 52/79] Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 53/79] Bluetooth: hci_sync: fix connectable extended advertising when using static random address Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 54/79] Bluetooth: SMP: If an unallowed command is received consider it a failure Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 55/79] Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 56/79] Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 57/79] net/mlx5: Correctly set gso_size when LRO is used Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 58/79] ipv6: mcast: Delay put pmc->idev in mld_del_delrec() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 59/79] netfilter: nf_conntrack: fix crash due to removal of uninitialised entry Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 60/79] Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 61/79] tls: always refresh the queue when reading sock Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 62/79] net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 63/79] net: bridge: Do not offload IGMP/MLD messages Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 64/79] net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 65/79] Revert "cgroup_freezer: cgroup_freezing: Check if not frozen" Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 66/79] sched: Change nr_uninterruptible type to unsigned long Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 67/79] HID: mcp2221: Set driver data before I2C adapter add Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 68/79] clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 69/79] usb: hub: fix detection of high tier USB3 devices behind suspended hubs Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 70/79] usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 71/79] usb: hub: Fix flushing of delayed work used for post resume purposes Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 72/79] usb: hub: Dont try to recover devices lost during warm reset Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 73/79] usb: musb: Add and use inline functions musb_{get,set}_state Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 74/79] usb: musb: fix gadget state on disconnect Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 75/79] usb: dwc3: qcom: Dont leave BCR asserted Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 76/79] ASoC: fsl_sai: Force a software reset when starting in consumer mode Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 77/79] Bluetooth: HCI: Set extended advertising data synchronously Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 78/79] mm/vmalloc: leave lazy MMU mode on PTE mapping error Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 79/79] nvmem: layouts: u-boot-env: remove crc32 endianness conversion Greg Kroah-Hartman
2025-07-22 16:31 ` [PATCH 6.1 00/79] 6.1.147-rc1 review Brett A C Sheffield
2025-07-22 18:48 ` Florian Fainelli
2025-07-22 21:26 ` Shuah Khan
2025-07-22 22:11 ` Miguel Ojeda
2025-07-23 4:57 ` Peter Schneider
2025-07-23 11:15 ` Mark Brown
2025-07-23 11:34 ` Jon Hunter
2025-07-23 13:07 ` Naresh Kamboju
2025-07-24 3:40 ` Hardik Garg
2025-07-24 3:55 ` Ron Economos
2025-07-26 18:01 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250722134329.087052773@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=luyun@kylinos.cn \
--cc=patches@lists.linux.dev \
--cc=stable@kernel.org \
--cc=stable@vger.kernel.org \
--cc=tanglongjun@kylinos.cn \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.