From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, "David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Ben Dooks <ben.dooks@codethink.co.uk>,
Tristram Ha <Tristram.Ha@microchip.com>,
netdev@vger.kernel.org, Ronald Wahl <ronald.wahl@raritan.com>,
Simon Horman <horms@kernel.org>
Subject: [PATCH 5.10 45/75] net: ks8851: Fix TX stall caused by TX buffer overrun
Date: Wed, 3 Jan 2024 17:55:26 +0100 [thread overview]
Message-ID: <20240103164849.940037823@linuxfoundation.org> (raw)
In-Reply-To: <20240103164842.953224409@linuxfoundation.org>
5.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ronald Wahl <ronald.wahl@raritan.com>
commit 3dc5d44545453de1de9c53cc529cc960a85933da upstream.
There is a bug in the ks8851 Ethernet driver that more data is written
to the hardware TX buffer than actually available. This is caused by
wrong accounting of the free TX buffer space.
The driver maintains a tx_space variable that represents the TX buffer
space that is deemed to be free. The ks8851_start_xmit_spi() function
adds an SKB to a queue if tx_space is large enough and reduces tx_space
by the amount of buffer space it will later need in the TX buffer and
then schedules a work item. If there is not enough space then the TX
queue is stopped.
The worker function ks8851_tx_work() dequeues all the SKBs and writes
the data into the hardware TX buffer. The last packet will trigger an
interrupt after it was send. Here it is assumed that all data fits into
the TX buffer.
In the interrupt routine (which runs asynchronously because it is a
threaded interrupt) tx_space is updated with the current value from the
hardware. Also the TX queue is woken up again.
Now it could happen that after data was sent to the hardware and before
handling the TX interrupt new data is queued in ks8851_start_xmit_spi()
when the TX buffer space had still some space left. When the interrupt
is actually handled tx_space is updated from the hardware but now we
already have new SKBs queued that have not been written to the hardware
TX buffer yet. Since tx_space has been overwritten by the value from the
hardware the space is not accounted for.
Now we have more data queued then buffer space available in the hardware
and ks8851_tx_work() will potentially overrun the hardware TX buffer. In
many cases it will still work because often the buffer is written out
fast enough so that no overrun occurs but for example if the peer
throttles us via flow control then an overrun may happen.
This can be fixed in different ways. The most simple way would be to set
tx_space to 0 before writing data to the hardware TX buffer preventing
the queuing of more SKBs until the TX interrupt has been handled. I have
chosen a slightly more efficient (and still rather simple) way and
track the amount of data that is already queued and not yet written to
the hardware. When new SKBs are to be queued the already queued amount
of data is honoured when checking free TX buffer space.
I tested this with a setup of two linked KS8851 running iperf3 between
the two in bidirectional mode. Before the fix I got a stall after some
minutes. With the fix I saw now issues anymore after hours.
Fixes: 3ba81f3ece3c ("net: Micrel KS8851 SPI network driver")
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: Tristram Ha <Tristram.Ha@microchip.com>
Cc: netdev@vger.kernel.org
Cc: stable@vger.kernel.org # 5.10+
Signed-off-by: Ronald Wahl <ronald.wahl@raritan.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231214181112.76052-1-rwahl@gmx.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/ethernet/micrel/ks8851.h | 3 ++
drivers/net/ethernet/micrel/ks8851_common.c | 20 ++++++-------
drivers/net/ethernet/micrel/ks8851_spi.c | 42 ++++++++++++++++++----------
3 files changed, 40 insertions(+), 25 deletions(-)
--- a/drivers/net/ethernet/micrel/ks8851.h
+++ b/drivers/net/ethernet/micrel/ks8851.h
@@ -350,6 +350,8 @@ union ks8851_tx_hdr {
* @rxd: Space for receiving SPI data, in DMA-able space.
* @txd: Space for transmitting SPI data, in DMA-able space.
* @msg_enable: The message flags controlling driver output (see ethtool).
+ * @tx_space: Free space in the hardware TX buffer (cached copy of KS_TXMIR).
+ * @queued_len: Space required in hardware TX buffer for queued packets in txq.
* @fid: Incrementing frame id tag.
* @rc_ier: Cached copy of KS_IER.
* @rc_ccr: Cached copy of KS_CCR.
@@ -398,6 +400,7 @@ struct ks8851_net {
struct work_struct rxctrl_work;
struct sk_buff_head txq;
+ unsigned int queued_len;
struct eeprom_93cx6 eeprom;
struct regulator *vdd_reg;
--- a/drivers/net/ethernet/micrel/ks8851_common.c
+++ b/drivers/net/ethernet/micrel/ks8851_common.c
@@ -363,16 +363,18 @@ static irqreturn_t ks8851_irq(int irq, v
handled |= IRQ_RXPSI;
if (status & IRQ_TXI) {
- handled |= IRQ_TXI;
+ unsigned short tx_space = ks8851_rdreg16(ks, KS_TXMIR);
- /* no lock here, tx queue should have been stopped */
+ netif_dbg(ks, intr, ks->netdev,
+ "%s: txspace %d\n", __func__, tx_space);
- /* update our idea of how much tx space is available to the
- * system */
- ks->tx_space = ks8851_rdreg16(ks, KS_TXMIR);
+ spin_lock(&ks->statelock);
+ ks->tx_space = tx_space;
+ if (netif_queue_stopped(ks->netdev))
+ netif_wake_queue(ks->netdev);
+ spin_unlock(&ks->statelock);
- netif_dbg(ks, intr, ks->netdev,
- "%s: txspace %d\n", __func__, ks->tx_space);
+ handled |= IRQ_TXI;
}
if (status & IRQ_RXI)
@@ -415,9 +417,6 @@ static irqreturn_t ks8851_irq(int irq, v
if (status & IRQ_LCI)
mii_check_link(&ks->mii);
- if (status & IRQ_TXI)
- netif_wake_queue(ks->netdev);
-
return IRQ_HANDLED;
}
@@ -501,6 +500,7 @@ static int ks8851_net_open(struct net_de
ks8851_wrreg16(ks, KS_ISR, ks->rc_ier);
ks8851_wrreg16(ks, KS_IER, ks->rc_ier);
+ ks->queued_len = 0;
netif_start_queue(ks->netdev);
netif_dbg(ks, ifup, ks->netdev, "network device up\n");
--- a/drivers/net/ethernet/micrel/ks8851_spi.c
+++ b/drivers/net/ethernet/micrel/ks8851_spi.c
@@ -289,6 +289,18 @@ static void ks8851_wrfifo_spi(struct ks8
}
/**
+ * calc_txlen - calculate size of message to send packet
+ * @len: Length of data
+ *
+ * Returns the size of the TXFIFO message needed to send
+ * this packet.
+ */
+static unsigned int calc_txlen(unsigned int len)
+{
+ return ALIGN(len + 4, 4);
+}
+
+/**
* ks8851_rx_skb_spi - receive skbuff
* @ks: The device state
* @skb: The skbuff
@@ -307,7 +319,9 @@ static void ks8851_rx_skb_spi(struct ks8
*/
static void ks8851_tx_work(struct work_struct *work)
{
+ unsigned int dequeued_len = 0;
struct ks8851_net_spi *kss;
+ unsigned short tx_space;
struct ks8851_net *ks;
unsigned long flags;
struct sk_buff *txb;
@@ -324,6 +338,8 @@ static void ks8851_tx_work(struct work_s
last = skb_queue_empty(&ks->txq);
if (txb) {
+ dequeued_len += calc_txlen(txb->len);
+
ks8851_wrreg16_spi(ks, KS_RXQCR,
ks->rc_rxqcr | RXQCR_SDA);
ks8851_wrfifo_spi(ks, txb, last);
@@ -334,6 +350,13 @@ static void ks8851_tx_work(struct work_s
}
}
+ tx_space = ks8851_rdreg16_spi(ks, KS_TXMIR);
+
+ spin_lock(&ks->statelock);
+ ks->queued_len -= dequeued_len;
+ ks->tx_space = tx_space;
+ spin_unlock(&ks->statelock);
+
ks8851_unlock_spi(ks, &flags);
}
@@ -349,18 +372,6 @@ static void ks8851_flush_tx_work_spi(str
}
/**
- * calc_txlen - calculate size of message to send packet
- * @len: Length of data
- *
- * Returns the size of the TXFIFO message needed to send
- * this packet.
- */
-static unsigned int calc_txlen(unsigned int len)
-{
- return ALIGN(len + 4, 4);
-}
-
-/**
* ks8851_start_xmit_spi - transmit packet using SPI
* @skb: The buffer to transmit
* @dev: The device used to transmit the packet.
@@ -388,16 +399,17 @@ static netdev_tx_t ks8851_start_xmit_spi
spin_lock(&ks->statelock);
- if (needed > ks->tx_space) {
+ if (ks->queued_len + needed > ks->tx_space) {
netif_stop_queue(dev);
ret = NETDEV_TX_BUSY;
} else {
- ks->tx_space -= needed;
+ ks->queued_len += needed;
skb_queue_tail(&ks->txq, skb);
}
spin_unlock(&ks->statelock);
- schedule_work(&kss->tx_work);
+ if (ret == NETDEV_TX_OK)
+ schedule_work(&kss->tx_work);
return ret;
}
next prev parent reply other threads:[~2024-01-03 17:10 UTC|newest]
Thread overview: 86+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-03 16:54 [PATCH 5.10 00/75] 5.10.206-rc1 review Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 01/75] ksmbd: fix wrong name of SMB2_CREATE_ALLOCATION_SIZE Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 02/75] smb: client: fix OOB in smb2_query_reparse_point() Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 03/75] ARM: OMAP2+: Fix null pointer dereference and memory leak in omap_soc_device_init Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 04/75] reset: Fix crash when freeing non-existent optional resets Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 05/75] s390/vx: fix save/restore of fpu kernel context Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 06/75] wifi: mac80211: mesh_plink: fix matches_local logic Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 07/75] Revert "net/mlx5e: fix double free of encap_header" Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 08/75] net/mlx5e: Fix slab-out-of-bounds in mlx5_query_nic_vport_mac_list() Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 09/75] net/mlx5: Fix fw tracer first block check Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 10/75] net/mlx5e: Correct snprintf truncation handling for fw_version buffer used by representors Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 11/75] net: sched: ife: fix potential use-after-free Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 12/75] ethernet: atheros: fix a memleak in atl1e_setup_ring_resources Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 13/75] net/rose: fix races in rose_kill_by_device() Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 14/75] net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev() Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 15/75] afs: Fix the dynamic roots d_delete to always delete unused dentries Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 16/75] afs: Fix dynamic root lookup DNS check Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 17/75] net: warn if gso_type isnt set for a GSO SKB Greg Kroah-Hartman
2024-01-03 16:54 ` [PATCH 5.10 18/75] net: check dev->gso_max_size in gso_features_check() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 19/75] keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 20/75] afs: Fix overwriting of result of DNS query Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 21/75] i2c: aspeed: Handle the coalesced stop conditions with the start conditions Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 22/75] pinctrl: at91-pio4: use dedicated lock class for IRQ Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 23/75] ALSA: hda/hdmi: Add quirk to force pin connectivity on NUC10 Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 24/75] ALSA: hda/hdmi: add force-connect quirk for NUC5CPYB Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 25/75] smb: client: fix NULL deref in asn1_ber_decoder() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 26/75] btrfs: do not allow non subvolume root targets for snapshot Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 27/75] interconnect: Treat xlate() returning NULL node as an error Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 28/75] iio: imu: inv_mpu6050: fix an error code problem in inv_mpu6050_read_raw Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 29/75] interconnect: qcom: sm8250: Enable sync_state Greg Kroah-Hartman
2024-01-03 17:29 ` Konrad Dybcio
2024-01-03 16:55 ` [PATCH 5.10 30/75] Input: ipaq-micro-keys - add error handling for devm_kmemdup Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 31/75] scsi: bnx2fc: Fix skb double free in bnx2fc_rcv() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 32/75] iio: common: ms_sensors: ms_sensors_i2c: fix humidity conversion time table Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 33/75] iio: adc: ti_am335x_adc: Fix return value check of tiadc_request_dma() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 34/75] wifi: cfg80211: Add my certificate Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 35/75] wifi: cfg80211: fix certs build to not depend on file order Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 36/75] USB: serial: ftdi_sio: update Actisense PIDs constant names Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 37/75] USB: serial: option: add Quectel EG912Y module support Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 38/75] USB: serial: option: add Foxconn T99W265 with new baseline Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 39/75] USB: serial: option: add Quectel RM500Q R13 firmware support Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 40/75] Bluetooth: hci_event: Fix not checking if HCI_OP_INQUIRY has been sent Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 41/75] Bluetooth: L2CAP: Send reject on command corrupted request Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 42/75] Input: soc_button_array - add mapping for airplane mode button Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 43/75] net: 9p: avoid freeing uninit memory in p9pdu_vreadf Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 44/75] net: rfkill: gpio: set GPIO direction Greg Kroah-Hartman
2024-01-03 16:55 ` Greg Kroah-Hartman [this message]
2024-01-03 16:55 ` [PATCH 5.10 46/75] dt-bindings: nvmem: mxs-ocotp: Document fsl,ocotp Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 47/75] tracing / synthetic: Disable events after testing in synth_event_gen_test_init() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 48/75] bus: ti-sysc: Flush posted write only after srst_udelay Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 49/75] lib/vsprintf: Fix %pfwf when current node refcount == 0 Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 50/75] x86/alternatives: Sync core before enabling interrupts Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 51/75] 9p/net: fix possible memory leak in p9_check_errors() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 52/75] ARM: dts: Fix occasional boot hang for am3 usb Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 53/75] Bluetooth: SMP: Convert BT_ERR/BT_DBG to bt_dev_err/bt_dev_dbg Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 54/75] Bluetooth: use inclusive language in SMP Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 55/75] Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 56/75] usb: fotg210-hcd: delete an incorrect bounds test Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 57/75] smb: client: fix OOB in SMB2_query_info_init() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 58/75] smb: client: fix OOB in smbCalcSize() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 59/75] Bluetooth: af_bluetooth: Fix Use-After-Free in bt_sock_recvmsg Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 60/75] spi: atmel: Switch to transfer_one transfer method Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 61/75] spi: atmel: Fix CS and initialization bug Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 62/75] scsi: core: Add scsi_prot_ref_tag() helper Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 63/75] scsi: core: Introduce scsi_get_sector() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 64/75] scsi: core: Make scsi_get_lba() return the LBA Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 65/75] scsi: core: Use scsi_cmd_to_rq() instead of scsi_cmnd.request Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 66/75] scsi: core: Use a structure member to track the SCSI command submitter Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 67/75] scsi: core: Always send batch on reset or error handling command Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 68/75] ring-buffer: Fix wake ups when buffer_percent is set to 100 Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 69/75] tracing: Fix blocked reader of snapshot buffer Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 70/75] netfilter: nf_tables: skip set commit for deleted/destroyed sets Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 71/75] dm-integrity: dont modify bios immutable bio_vec in integrity_metadata() Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 72/75] tracing/kprobes: Return EADDRNOTAVAIL when func matches several symbols Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 73/75] Revert "MIPS: Loongson64: Enable DMA noncoherent support" Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 74/75] Bluetooth: SMP: Fix crash when receiving new connection when debug is enabled Greg Kroah-Hartman
2024-01-03 16:55 ` [PATCH 5.10 75/75] spi: atmel: Fix PDC transfer setup bug Greg Kroah-Hartman
2024-01-03 19:15 ` [PATCH 5.10 00/75] 5.10.206-rc1 review Florian Fainelli
2024-01-04 0:40 ` Dominique Martinet
2024-01-04 9:16 ` Daniel Díaz
2024-01-04 9:25 ` Greg Kroah-Hartman
2024-01-08 17:47 ` Francis Laniel
2024-01-04 12:11 ` Jon Hunter
2024-01-04 12:15 ` Pavel Machek
2024-01-05 1:03 ` Guenter Roeck
2024-01-05 11:12 ` Shreeya Patel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240103164849.940037823@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=Tristram.Ha@microchip.com \
--cc=ben.dooks@codethink.co.uk \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=patches@lists.linux.dev \
--cc=ronald.wahl@raritan.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox