From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, stable@kernel.org,
LongJun Tang <tanglongjun@kylinos.cn>, Yun Lu <luyun@kylinos.cn>,
Willem de Bruijn <willemb@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 6.1 19/79] af_packet: fix soft lockup issue caused by tpacket_snd()
Date: Tue, 22 Jul 2025 15:44:15 +0200 [thread overview]
Message-ID: <20250722134329.087052773@linuxfoundation.org> (raw)
In-Reply-To: <20250722134328.384139905@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yun Lu <luyun@kylinos.cn>
commit 55f0bfc0370539213202f4ce1a07615327ac4713 upstream.
When MSG_DONTWAIT is not set, the tpacket_snd operation will wait for
pending_refcnt to decrement to zero before returning. The pending_refcnt
is decremented by 1 when the skb->destructor function is called,
indicating that the skb has been successfully sent and needs to be
destroyed.
If an error occurs during this process, the tpacket_snd() function will
exit and return error, but pending_refcnt may not yet have decremented to
zero. Assuming the next send operation is executed immediately, but there
are no available frames to be sent in tx_ring (i.e., packet_current_frame
returns NULL), and skb is also NULL, the function will not execute
wait_for_completion_interruptible_timeout() to yield the CPU. Instead, it
will enter a do-while loop, waiting for pending_refcnt to be zero. Even
if the previous skb has completed transmission, the skb->destructor
function can only be invoked in the ksoftirqd thread (assuming NAPI
threading is enabled). When both the ksoftirqd thread and the tpacket_snd
operation happen to run on the same CPU, and the CPU trapped in the
do-while loop without yielding, the ksoftirqd thread will not get
scheduled to run. As a result, pending_refcnt will never be reduced to
zero, and the do-while loop cannot exit, eventually leading to a CPU soft
lockup issue.
In fact, skb is true for all but the first iterations of that loop, and
as long as pending_refcnt is not zero, even if incremented by a previous
call, wait_for_completion_interruptible_timeout() should be executed to
yield the CPU, allowing the ksoftirqd thread to be scheduled. Therefore,
the execution condition of this function should be modified to check if
pending_refcnt is not zero, instead of check skb.
- if (need_wait && skb) {
+ if (need_wait && packet_read_pending(&po->tx_ring)) {
As a result, the judgment conditions are duplicated with the end code of
the while loop, and packet_read_pending() is a very expensive function.
Actually, this loop can only exit when ph is NULL, so the loop condition
can be changed to while (1), and in the "ph = NULL" branch, if the
subsequent condition of if is not met, the loop can break directly. Now,
the loop logic remains the same as origin but is clearer and more obvious.
Fixes: 89ed5b519004 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
Cc: stable@kernel.org
Suggested-by: LongJun Tang <tanglongjun@kylinos.cn>
Signed-off-by: Yun Lu <luyun@kylinos.cn>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/packet/af_packet.c | 23 +++++++++++------------
1 file changed, 11 insertions(+), 12 deletions(-)
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2841,15 +2841,21 @@ static int tpacket_snd(struct packet_soc
ph = packet_current_frame(po, &po->tx_ring,
TP_STATUS_SEND_REQUEST);
if (unlikely(ph == NULL)) {
- if (need_wait && skb) {
+ /* Note: packet_read_pending() might be slow if we
+ * have to call it as it's per_cpu variable, but in
+ * fast-path we don't have to call it, only when ph
+ * is NULL, we need to check the pending_refcnt.
+ */
+ if (need_wait && packet_read_pending(&po->tx_ring)) {
timeo = wait_for_completion_interruptible_timeout(&po->skb_completion, timeo);
if (timeo <= 0) {
err = !timeo ? -ETIMEDOUT : -ERESTARTSYS;
goto out_put;
}
- }
- /* check for additional frames */
- continue;
+ /* check for additional frames */
+ continue;
+ } else
+ break;
}
skb = NULL;
@@ -2939,14 +2945,7 @@ tpacket_error:
}
packet_increment_head(&po->tx_ring);
len_sum += tp_len;
- } while (likely((ph != NULL) ||
- /* Note: packet_read_pending() might be slow if we have
- * to call it as it's per_cpu variable, but in fast-path
- * we already short-circuit the loop with the first
- * condition, and luckily don't have to go that path
- * anyway.
- */
- (need_wait && packet_read_pending(&po->tx_ring))));
+ } while (1);
err = len_sum;
goto out_put;
next prev parent reply other threads:[~2025-07-22 13:46 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-22 13:43 [PATCH 6.1 00/79] 6.1.147-rc1 review Greg Kroah-Hartman
2025-07-22 13:43 ` [PATCH 6.1 01/79] phy: tegra: xusb: Fix unbalanced regulator disable in UTMI PHY mode Greg Kroah-Hartman
2025-07-22 13:43 ` [PATCH 6.1 02/79] USB: serial: option: add Telit Cinterion FE910C04 (ECM) composition Greg Kroah-Hartman
2025-07-22 13:43 ` [PATCH 6.1 03/79] USB: serial: option: add Foxconn T99W640 Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 04/79] USB: serial: ftdi_sio: add support for NDI EMGUIDE GEMINI Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 05/79] usb: gadget: configfs: Fix OOB read on empty string write Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 06/79] i2c: stm32: fix the device used for the DMA map Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 07/79] thunderbolt: Fix bit masking in tb_dp_port_set_hops() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 08/79] Input: xpad - set correct controller type for Acer NGR200 Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 09/79] pch_uart: Fix dma_sync_sg_for_device() nents value Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 10/79] HID: core: ensure the allocated report buffer can contain the reserved report ID Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 11/79] HID: core: ensure __hid_request reserves the report ID as the first byte Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 12/79] HID: core: do not bypass hid_hw_raw_request Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 13/79] tracing: Add down_write(trace_event_sem) when adding trace event Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 14/79] io_uring/poll: fix POLLERR handling Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 15/79] phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 16/79] net/mlx5: Update the list of the PCI supported devices Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 17/79] arm64: dts: freescale: imx8mm-verdin: Keep LDO5 always on Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 18/79] af_packet: fix the SO_SNDTIMEO constraint not effective on tpacked_snd() Greg Kroah-Hartman
2025-07-22 13:44 ` Greg Kroah-Hartman [this message]
2025-07-22 13:44 ` [PATCH 6.1 20/79] dmaengine: nbpfaxi: Fix memory corruption in probe() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 21/79] isofs: Verify inode mode when loading from disk Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 22/79] memstick: core: Zero initialize id_reg in h_memstick_read_dev_id() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 23/79] mmc: bcm2835: Fix dma_unmap_sg() nents value Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 24/79] mmc: sdhci-pci: Quirk for broken command queuing on Intel GLK-based Positivo models Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 25/79] mmc: sdhci_am654: Workaround for Errata i2312 Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 26/79] pmdomain: governor: Consider CPU latency tolerance from pm_domain_cpu_gov Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 27/79] smb: client: fix use-after-free in crypt_message when using async crypto Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 28/79] soc: aspeed: lpc-snoop: Cleanup resources in stack-order Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 29/79] soc: aspeed: lpc-snoop: Dont disable channels that arent enabled Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 30/79] iio: accel: fxls8962af: Fix use after free in fxls8962af_fifo_flush Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 31/79] iio: adc: max1363: Fix MAX1363_4X_CHANS/MAX1363_8X_CHANS[] Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 32/79] iio: adc: max1363: Reorder mode_list[] entries Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 33/79] iio: adc: stm32-adc: Fix race in installing chained IRQ handler Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 34/79] comedi: pcl812: Fix bit shift out of bounds Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 35/79] comedi: aio_iiro_16: " Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 36/79] comedi: das16m1: " Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 37/79] comedi: das6402: " Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 38/79] comedi: Fail COMEDI_INSNLIST ioctl if n_insns is too large Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 39/79] comedi: Fix some signed shift left operations Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 40/79] comedi: Fix use of uninitialized data in insn_rw_emulate_bits() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 41/79] comedi: Fix initialization of data for instructions that write to subdevice Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 42/79] bpf: Reject %p% format string in bprintf-like helpers Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 43/79] cachefiles: Fix the incorrect return value in __cachefiles_write() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 44/79] net: emaclite: Fix missing pointer increment in aligned_read() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 45/79] net/sched: sch_qfq: Fix race condition on qfq_aggregate Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 46/79] rpl: Fix use-after-free in rpl_do_srh_inline() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 47/79] smb: client: fix use-after-free in cifs_oplock_break Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 48/79] nvme: fix misaccounting of nvme-mpath inflight I/O Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 49/79] selftests: net: increase inter-packet timeout in udpgro.sh Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 50/79] hwmon: (corsair-cpro) Validate the size of the received input buffer Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 51/79] usb: net: sierra: check for no status endpoint Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 52/79] Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 53/79] Bluetooth: hci_sync: fix connectable extended advertising when using static random address Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 54/79] Bluetooth: SMP: If an unallowed command is received consider it a failure Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 55/79] Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 56/79] Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 57/79] net/mlx5: Correctly set gso_size when LRO is used Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 58/79] ipv6: mcast: Delay put pmc->idev in mld_del_delrec() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 59/79] netfilter: nf_conntrack: fix crash due to removal of uninitialised entry Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 60/79] Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 61/79] tls: always refresh the queue when reading sock Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 62/79] net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 63/79] net: bridge: Do not offload IGMP/MLD messages Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 64/79] net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 65/79] Revert "cgroup_freezer: cgroup_freezing: Check if not frozen" Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 66/79] sched: Change nr_uninterruptible type to unsigned long Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 67/79] HID: mcp2221: Set driver data before I2C adapter add Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 68/79] clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 69/79] usb: hub: fix detection of high tier USB3 devices behind suspended hubs Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 70/79] usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 71/79] usb: hub: Fix flushing of delayed work used for post resume purposes Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 72/79] usb: hub: Dont try to recover devices lost during warm reset Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 73/79] usb: musb: Add and use inline functions musb_{get,set}_state Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 74/79] usb: musb: fix gadget state on disconnect Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 75/79] usb: dwc3: qcom: Dont leave BCR asserted Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 76/79] ASoC: fsl_sai: Force a software reset when starting in consumer mode Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 77/79] Bluetooth: HCI: Set extended advertising data synchronously Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 78/79] mm/vmalloc: leave lazy MMU mode on PTE mapping error Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 79/79] nvmem: layouts: u-boot-env: remove crc32 endianness conversion Greg Kroah-Hartman
2025-07-22 16:31 ` [PATCH 6.1 00/79] 6.1.147-rc1 review Brett A C Sheffield
2025-07-22 18:48 ` Florian Fainelli
2025-07-22 21:26 ` Shuah Khan
2025-07-22 22:11 ` Miguel Ojeda
2025-07-23 4:57 ` Peter Schneider
2025-07-23 11:15 ` Mark Brown
2025-07-23 11:34 ` Jon Hunter
2025-07-23 13:07 ` Naresh Kamboju
2025-07-24 3:40 ` Hardik Garg
2025-07-24 3:55 ` Ron Economos
2025-07-26 18:01 ` Pavel Machek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250722134329.087052773@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=davem@davemloft.net \
--cc=luyun@kylinos.cn \
--cc=patches@lists.linux.dev \
--cc=stable@kernel.org \
--cc=stable@vger.kernel.org \
--cc=tanglongjun@kylinos.cn \
--cc=willemb@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).