patches.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, stable@kernel.org,
	LongJun Tang <tanglongjun@kylinos.cn>, Yun Lu <luyun@kylinos.cn>,
	Willem de Bruijn <willemb@google.com>,
	"David S. Miller" <davem@davemloft.net>
Subject: [PATCH 6.1 19/79] af_packet: fix soft lockup issue caused by tpacket_snd()
Date: Tue, 22 Jul 2025 15:44:15 +0200	[thread overview]
Message-ID: <20250722134329.087052773@linuxfoundation.org> (raw)
In-Reply-To: <20250722134328.384139905@linuxfoundation.org>

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Yun Lu <luyun@kylinos.cn>

commit 55f0bfc0370539213202f4ce1a07615327ac4713 upstream.

When MSG_DONTWAIT is not set, the tpacket_snd operation will wait for
pending_refcnt to decrement to zero before returning. The pending_refcnt
is decremented by 1 when the skb->destructor function is called,
indicating that the skb has been successfully sent and needs to be
destroyed.

If an error occurs during this process, the tpacket_snd() function will
exit and return error, but pending_refcnt may not yet have decremented to
zero. Assuming the next send operation is executed immediately, but there
are no available frames to be sent in tx_ring (i.e., packet_current_frame
returns NULL), and skb is also NULL, the function will not execute
wait_for_completion_interruptible_timeout() to yield the CPU. Instead, it
will enter a do-while loop, waiting for pending_refcnt to be zero. Even
if the previous skb has completed transmission, the skb->destructor
function can only be invoked in the ksoftirqd thread (assuming NAPI
threading is enabled). When both the ksoftirqd thread and the tpacket_snd
operation happen to run on the same CPU, and the CPU trapped in the
do-while loop without yielding, the ksoftirqd thread will not get
scheduled to run. As a result, pending_refcnt will never be reduced to
zero, and the do-while loop cannot exit, eventually leading to a CPU soft
lockup issue.

In fact, skb is true for all but the first iterations of that loop, and
as long as pending_refcnt is not zero, even if incremented by a previous
call, wait_for_completion_interruptible_timeout() should be executed to
yield the CPU, allowing the ksoftirqd thread to be scheduled. Therefore,
the execution condition of this function should be modified to check if
pending_refcnt is not zero, instead of check skb.

-	if (need_wait && skb) {
+	if (need_wait && packet_read_pending(&po->tx_ring)) {

As a result, the judgment conditions are duplicated with the end code of
the while loop, and packet_read_pending() is a very expensive function.
Actually, this loop can only exit when ph is NULL, so the loop condition
can be changed to while (1), and in the "ph = NULL" branch, if the
subsequent condition of if is not met,  the loop can break directly. Now,
the loop logic remains the same as origin but is clearer and more obvious.

Fixes: 89ed5b519004 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
Cc: stable@kernel.org
Suggested-by: LongJun Tang <tanglongjun@kylinos.cn>
Signed-off-by: Yun Lu <luyun@kylinos.cn>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/packet/af_packet.c |   23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2841,15 +2841,21 @@ static int tpacket_snd(struct packet_soc
 		ph = packet_current_frame(po, &po->tx_ring,
 					  TP_STATUS_SEND_REQUEST);
 		if (unlikely(ph == NULL)) {
-			if (need_wait && skb) {
+			/* Note: packet_read_pending() might be slow if we
+			 * have to call it as it's per_cpu variable, but in
+			 * fast-path we don't have to call it, only when ph
+			 * is NULL, we need to check the pending_refcnt.
+			 */
+			if (need_wait && packet_read_pending(&po->tx_ring)) {
 				timeo = wait_for_completion_interruptible_timeout(&po->skb_completion, timeo);
 				if (timeo <= 0) {
 					err = !timeo ? -ETIMEDOUT : -ERESTARTSYS;
 					goto out_put;
 				}
-			}
-			/* check for additional frames */
-			continue;
+				/* check for additional frames */
+				continue;
+			} else
+				break;
 		}
 
 		skb = NULL;
@@ -2939,14 +2945,7 @@ tpacket_error:
 		}
 		packet_increment_head(&po->tx_ring);
 		len_sum += tp_len;
-	} while (likely((ph != NULL) ||
-		/* Note: packet_read_pending() might be slow if we have
-		 * to call it as it's per_cpu variable, but in fast-path
-		 * we already short-circuit the loop with the first
-		 * condition, and luckily don't have to go that path
-		 * anyway.
-		 */
-		 (need_wait && packet_read_pending(&po->tx_ring))));
+	} while (1);
 
 	err = len_sum;
 	goto out_put;



  parent reply	other threads:[~2025-07-22 13:46 UTC|newest]

Thread overview: 91+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-22 13:43 [PATCH 6.1 00/79] 6.1.147-rc1 review Greg Kroah-Hartman
2025-07-22 13:43 ` [PATCH 6.1 01/79] phy: tegra: xusb: Fix unbalanced regulator disable in UTMI PHY mode Greg Kroah-Hartman
2025-07-22 13:43 ` [PATCH 6.1 02/79] USB: serial: option: add Telit Cinterion FE910C04 (ECM) composition Greg Kroah-Hartman
2025-07-22 13:43 ` [PATCH 6.1 03/79] USB: serial: option: add Foxconn T99W640 Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 04/79] USB: serial: ftdi_sio: add support for NDI EMGUIDE GEMINI Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 05/79] usb: gadget: configfs: Fix OOB read on empty string write Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 06/79] i2c: stm32: fix the device used for the DMA map Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 07/79] thunderbolt: Fix bit masking in tb_dp_port_set_hops() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 08/79] Input: xpad - set correct controller type for Acer NGR200 Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 09/79] pch_uart: Fix dma_sync_sg_for_device() nents value Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 10/79] HID: core: ensure the allocated report buffer can contain the reserved report ID Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 11/79] HID: core: ensure __hid_request reserves the report ID as the first byte Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 12/79] HID: core: do not bypass hid_hw_raw_request Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 13/79] tracing: Add down_write(trace_event_sem) when adding trace event Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 14/79] io_uring/poll: fix POLLERR handling Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 15/79] phonet/pep: Move call to pn_skb_get_dst_sockaddr() earlier in pep_sock_accept() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 16/79] net/mlx5: Update the list of the PCI supported devices Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 17/79] arm64: dts: freescale: imx8mm-verdin: Keep LDO5 always on Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 18/79] af_packet: fix the SO_SNDTIMEO constraint not effective on tpacked_snd() Greg Kroah-Hartman
2025-07-22 13:44 ` Greg Kroah-Hartman [this message]
2025-07-22 13:44 ` [PATCH 6.1 20/79] dmaengine: nbpfaxi: Fix memory corruption in probe() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 21/79] isofs: Verify inode mode when loading from disk Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 22/79] memstick: core: Zero initialize id_reg in h_memstick_read_dev_id() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 23/79] mmc: bcm2835: Fix dma_unmap_sg() nents value Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 24/79] mmc: sdhci-pci: Quirk for broken command queuing on Intel GLK-based Positivo models Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 25/79] mmc: sdhci_am654: Workaround for Errata i2312 Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 26/79] pmdomain: governor: Consider CPU latency tolerance from pm_domain_cpu_gov Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 27/79] smb: client: fix use-after-free in crypt_message when using async crypto Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 28/79] soc: aspeed: lpc-snoop: Cleanup resources in stack-order Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 29/79] soc: aspeed: lpc-snoop: Dont disable channels that arent enabled Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 30/79] iio: accel: fxls8962af: Fix use after free in fxls8962af_fifo_flush Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 31/79] iio: adc: max1363: Fix MAX1363_4X_CHANS/MAX1363_8X_CHANS[] Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 32/79] iio: adc: max1363: Reorder mode_list[] entries Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 33/79] iio: adc: stm32-adc: Fix race in installing chained IRQ handler Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 34/79] comedi: pcl812: Fix bit shift out of bounds Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 35/79] comedi: aio_iiro_16: " Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 36/79] comedi: das16m1: " Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 37/79] comedi: das6402: " Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 38/79] comedi: Fail COMEDI_INSNLIST ioctl if n_insns is too large Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 39/79] comedi: Fix some signed shift left operations Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 40/79] comedi: Fix use of uninitialized data in insn_rw_emulate_bits() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 41/79] comedi: Fix initialization of data for instructions that write to subdevice Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 42/79] bpf: Reject %p% format string in bprintf-like helpers Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 43/79] cachefiles: Fix the incorrect return value in __cachefiles_write() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 44/79] net: emaclite: Fix missing pointer increment in aligned_read() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 45/79] net/sched: sch_qfq: Fix race condition on qfq_aggregate Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 46/79] rpl: Fix use-after-free in rpl_do_srh_inline() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 47/79] smb: client: fix use-after-free in cifs_oplock_break Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 48/79] nvme: fix misaccounting of nvme-mpath inflight I/O Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 49/79] selftests: net: increase inter-packet timeout in udpgro.sh Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 50/79] hwmon: (corsair-cpro) Validate the size of the received input buffer Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 51/79] usb: net: sierra: check for no status endpoint Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 52/79] Bluetooth: Fix null-ptr-deref in l2cap_sock_resume_cb() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 53/79] Bluetooth: hci_sync: fix connectable extended advertising when using static random address Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 54/79] Bluetooth: SMP: If an unallowed command is received consider it a failure Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 55/79] Bluetooth: SMP: Fix using HCI_ERROR_REMOTE_USER_TERM on timeout Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 56/79] Bluetooth: btusb: QCA: Fix downloading wrong NVM for WCN6855 GF variant without board ID Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 57/79] net/mlx5: Correctly set gso_size when LRO is used Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 58/79] ipv6: mcast: Delay put pmc->idev in mld_del_delrec() Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 59/79] netfilter: nf_conntrack: fix crash due to removal of uninitialised entry Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 60/79] Bluetooth: L2CAP: Fix attempting to adjust outgoing MTU Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 61/79] tls: always refresh the queue when reading sock Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 62/79] net: vlan: fix VLAN 0 refcount imbalance of toggling filtering during runtime Greg Kroah-Hartman
2025-07-22 13:44 ` [PATCH 6.1 63/79] net: bridge: Do not offload IGMP/MLD messages Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 64/79] net/sched: Return NULL when htb_lookup_leaf encounters an empty rbtree Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 65/79] Revert "cgroup_freezer: cgroup_freezing: Check if not frozen" Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 66/79] sched: Change nr_uninterruptible type to unsigned long Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 67/79] HID: mcp2221: Set driver data before I2C adapter add Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 68/79] clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 69/79] usb: hub: fix detection of high tier USB3 devices behind suspended hubs Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 70/79] usb: hub: Fix flushing and scheduling of delayed work that tunes runtime pm Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 71/79] usb: hub: Fix flushing of delayed work used for post resume purposes Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 72/79] usb: hub: Dont try to recover devices lost during warm reset Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 73/79] usb: musb: Add and use inline functions musb_{get,set}_state Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 74/79] usb: musb: fix gadget state on disconnect Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 75/79] usb: dwc3: qcom: Dont leave BCR asserted Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 76/79] ASoC: fsl_sai: Force a software reset when starting in consumer mode Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 77/79] Bluetooth: HCI: Set extended advertising data synchronously Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 78/79] mm/vmalloc: leave lazy MMU mode on PTE mapping error Greg Kroah-Hartman
2025-07-22 13:45 ` [PATCH 6.1 79/79] nvmem: layouts: u-boot-env: remove crc32 endianness conversion Greg Kroah-Hartman
2025-07-22 16:31   ` [PATCH 6.1 00/79] 6.1.147-rc1 review Brett A C Sheffield
2025-07-22 18:48 ` Florian Fainelli
2025-07-22 21:26 ` Shuah Khan
2025-07-22 22:11 ` Miguel Ojeda
2025-07-23  4:57 ` Peter Schneider
2025-07-23 11:15 ` Mark Brown
2025-07-23 11:34 ` Jon Hunter
2025-07-23 13:07 ` Naresh Kamboju
2025-07-24  3:40 ` Hardik Garg
2025-07-24  3:55 ` Ron Economos
2025-07-26 18:01 ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250722134329.087052773@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=davem@davemloft.net \
    --cc=luyun@kylinos.cn \
    --cc=patches@lists.linux.dev \
    --cc=stable@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tanglongjun@kylinos.cn \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).