netdev.vger.kernel.org archive mirror
* [PATCH net-next 0/3] net: use READ_ONCE/WRITE_ONCE for ring index accesses
@ 2023-04-11  1:33 Jakub Kicinski
  2023-04-11  1:33 ` [PATCH net-next 1/3] net: docs: update the sample code in driver.rst Jakub Kicinski
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Jakub Kicinski @ 2023-04-11  1:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, Jakub Kicinski

Small follow-up to the lockless ring stop/start macros.
Update the docs and the drivers as suggested by Eric:
https://lore.kernel.org/all/CANn89iJrBGSybMX1FqrhCEMWT3Nnz2=2+aStsbbwpWzKHjk51g@mail.gmail.com/

Jakub Kicinski (3):
  net: docs: update the sample code in driver.rst
  bnxt: use READ_ONCE/WRITE_ONCE for ring indexes
  mlx4: use READ_ONCE/WRITE_ONCE for ring indexes

 Documentation/networking/driver.rst        | 61 ++++++++++------------
 drivers/net/ethernet/broadcom/bnxt/bnxt.c  |  6 +--
 drivers/net/ethernet/broadcom/bnxt/bnxt.h  |  9 ++--
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |  8 +--
 4 files changed, 39 insertions(+), 45 deletions(-)

-- 
2.39.2



* [PATCH net-next 1/3] net: docs: update the sample code in driver.rst
  2023-04-11  1:33 [PATCH net-next 0/3] net: use READ_ONCE/WRITE_ONCE for ring index accesses Jakub Kicinski
@ 2023-04-11  1:33 ` Jakub Kicinski
  2023-04-11 15:52   ` Jesse Brandeburg
  2023-04-11  1:33 ` [PATCH net-next 2/3] bnxt: use READ_ONCE/WRITE_ONCE for ring indexes Jakub Kicinski
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2023-04-11  1:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, Jakub Kicinski

The sample code talks about single-queue devices and uses locks.
Update it to something resembling more modern code.
Make sure we mention use of READ_ONCE() / WRITE_ONCE().

Change the comment which talked about consumer on the xmit side.
AFAIU xmit is the producer and completions are a consumer.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 Documentation/networking/driver.rst | 61 +++++++++++++----------------
 1 file changed, 27 insertions(+), 34 deletions(-)

diff --git a/Documentation/networking/driver.rst b/Documentation/networking/driver.rst
index 4071f2c00f8b..4f5dfa9c022e 100644
--- a/Documentation/networking/driver.rst
+++ b/Documentation/networking/driver.rst
@@ -47,30 +47,43 @@ Instead it must maintain the queue properly.  For example,
 
 .. code-block:: c
 
+	static u32 drv_tx_avail(struct drv_ring *dr)
+	{
+		u32 used = READ_ONCE(dr->prod) - READ_ONCE(dr->cons);
+
+		return dr->tx_ring_size - (used & dr->tx_ring_mask);
+	}
+
 	static netdev_tx_t drv_hard_start_xmit(struct sk_buff *skb,
 					       struct net_device *dev)
 	{
 		struct drv *dp = netdev_priv(dev);
+		struct netdev_queue *txq;
+		struct drv_ring *dr;
+		int idx;
+
+		idx = skb_get_queue_mapping(skb);
+		dr = dp->tx_rings[idx];
+		txq = netdev_get_tx_queue(dev, idx);
 
-		lock_tx(dp);
 		//...
-		/* This is a hard error log it. */
-		if (TX_BUFFS_AVAIL(dp) <= (skb_shinfo(skb)->nr_frags + 1)) {
+		/* This should be a very rare race - log it. */
+		if (drv_tx_avail(dr) <= skb_shinfo(skb)->nr_frags + 1) {
 			netif_stop_queue(dev);
-			unlock_tx(dp);
-			printk(KERN_ERR PFX "%s: BUG! Tx Ring full when queue awake!\n",
-			       dev->name);
+			netdev_warn(dev, "Tx Ring full when queue awake!\n");
 			return NETDEV_TX_BUSY;
 		}
 
 		//... queue packet to card ...
-		//... update tx consumer index ...
 
-		if (TX_BUFFS_AVAIL(dp) <= (MAX_SKB_FRAGS + 1))
-			netif_stop_queue(dev);
+		netdev_tx_sent_queue(txq, skb->len);
+
+		//... update tx producer index using WRITE_ONCE() ...
+
+		if (!netif_txq_maybe_stop(txq, drv_tx_avail(dr),
+					  MAX_SKB_FRAGS + 1, 2 * MAX_SKB_FRAGS))
+			dr->stats.stopped++;
 
-		//...
-		unlock_tx(dp);
 		//...
 		return NETDEV_TX_OK;
 	}
@@ -79,30 +92,10 @@ Instead it must maintain the queue properly.  For example,
 
 .. code-block:: c
 
-	if (netif_queue_stopped(dp->dev) &&
-	    TX_BUFFS_AVAIL(dp) > (MAX_SKB_FRAGS + 1))
-		netif_wake_queue(dp->dev);
-
-For a non-scatter-gather supporting card, the three tests simply become:
-
-.. code-block:: c
-
-		/* This is a hard error log it. */
-		if (TX_BUFFS_AVAIL(dp) <= 0)
-
-and:
-
-.. code-block:: c
-
-		if (TX_BUFFS_AVAIL(dp) == 0)
-
-and:
-
-.. code-block:: c
+	//... update tx consumer index using WRITE_ONCE() ...
 
-	if (netif_queue_stopped(dp->dev) &&
-	    TX_BUFFS_AVAIL(dp) > 0)
-		netif_wake_queue(dp->dev);
+	netif_txq_completed_wake(txq, cmpl_pkts, cmpl_bytes,
+				 drv_tx_avail(dr), 2 * MAX_SKB_FRAGS);
 
 Lockless queue stop / wake helper macros
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- 
2.39.2
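
The updated sample stops the queue at one threshold (MAX_SKB_FRAGS + 1) and wakes it at a higher one (2 * MAX_SKB_FRAGS). A single-threaded toy model of that hysteresis might look like the sketch below. It is not the kernel macro: the `toy_*` names and `TOY_MAX_SKB_FRAGS` value are made up for illustration, the exact comparisons are an assumption, and the memory barriers and race re-check that the real helpers perform are left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only; the real MAX_SKB_FRAGS depends on kernel config. */
#define TOY_MAX_SKB_FRAGS 17u

/* Hypothetical stand-in for the per-queue stopped state. */
struct toy_txq {
	bool stopped;
	uint32_t stop_thrs;  /* stop when fewer descriptors than this are free */
	uint32_t start_thrs; /* wake once at least this many are free again */
};

static void toy_txq_init(struct toy_txq *q)
{
	q->stopped = false;
	q->stop_thrs = TOY_MAX_SKB_FRAGS + 1;
	q->start_thrs = 2 * TOY_MAX_SKB_FRAGS;
	/* Hysteresis only works if the wake threshold is not below the stop one. */
	assert(q->start_thrs >= q->stop_thrs);
}

/* xmit side: returns true if the queue is stopped after the call. */
static bool toy_txq_maybe_stop(struct toy_txq *q, uint32_t avail)
{
	if (avail < q->stop_thrs)
		q->stopped = true;
	return q->stopped;
}

/* completion side: returns true if this call woke the queue. */
static bool toy_txq_completed_wake(struct toy_txq *q, uint32_t avail)
{
	if (q->stopped && avail >= q->start_thrs) {
		q->stopped = false;
		return true;
	}
	return false;
}
```

The gap between the two thresholds keeps the queue from flapping between stopped and awake when the free descriptor count hovers around a single value.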



* [PATCH net-next 2/3] bnxt: use READ_ONCE/WRITE_ONCE for ring indexes
  2023-04-11  1:33 [PATCH net-next 0/3] net: use READ_ONCE/WRITE_ONCE for ring index accesses Jakub Kicinski
  2023-04-11  1:33 ` [PATCH net-next 1/3] net: docs: update the sample code in driver.rst Jakub Kicinski
@ 2023-04-11  1:33 ` Jakub Kicinski
  2023-04-11 15:52   ` Jesse Brandeburg
  2023-04-11 23:09   ` Michael Chan
  2023-04-11  1:33 ` [PATCH net-next 3/3] mlx4: " Jakub Kicinski
  2023-04-11 15:55 ` [PATCH net-next 0/3] net: use READ_ONCE/WRITE_ONCE for ring index accesses Eric Dumazet
  3 siblings, 2 replies; 9+ messages in thread
From: Jakub Kicinski @ 2023-04-11  1:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, Jakub Kicinski, michael.chan

Eric points out that we should make sure that ring index updates
are wrapped in the appropriate READ_ONCE/WRITE_ONCE macros.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: michael.chan@broadcom.com
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 +++---
 drivers/net/ethernet/broadcom/bnxt/bnxt.h | 9 ++++-----
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index f7602d8d79e3..92289ab2f34a 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -472,7 +472,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		prod = NEXT_TX(prod);
 		tx_push->doorbell =
 			cpu_to_le32(DB_KEY_TX_PUSH | DB_LONG_TX_PUSH | prod);
-		txr->tx_prod = prod;
+		WRITE_ONCE(txr->tx_prod, prod);
 
 		tx_buf->is_push = 1;
 		netdev_tx_sent_queue(txq, skb->len);
@@ -583,7 +583,7 @@ static netdev_tx_t bnxt_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	wmb();
 
 	prod = NEXT_TX(prod);
-	txr->tx_prod = prod;
+	WRITE_ONCE(txr->tx_prod, prod);
 
 	if (!netdev_xmit_more() || netif_xmit_stopped(txq))
 		bnxt_txr_db_kick(bp, txr, prod);
@@ -688,7 +688,7 @@ static void bnxt_tx_int(struct bnxt *bp, struct bnxt_napi *bnapi, int nr_pkts)
 		dev_kfree_skb_any(skb);
 	}
 
-	txr->tx_cons = cons;
+	WRITE_ONCE(txr->tx_cons, cons);
 
 	__netif_txq_completed_wake(txq, nr_pkts, tx_bytes,
 				   bnxt_tx_avail(bp, txr), bp->tx_wake_thresh,
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index 18cac98ba58e..080e73496066 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2231,13 +2231,12 @@ struct bnxt {
 #define SFF_MODULE_ID_QSFP28			0x11
 #define BNXT_MAX_PHY_I2C_RESP_SIZE		64
 
-static inline u32 bnxt_tx_avail(struct bnxt *bp, struct bnxt_tx_ring_info *txr)
+static inline u32 bnxt_tx_avail(struct bnxt *bp,
+				const struct bnxt_tx_ring_info *txr)
 {
-	/* Tell compiler to fetch tx indices from memory. */
-	barrier();
+	u32 used = READ_ONCE(txr->tx_prod) - READ_ONCE(txr->tx_cons);
 
-	return bp->tx_ring_size -
-		((txr->tx_prod - txr->tx_cons) & bp->tx_ring_mask);
+	return bp->tx_ring_size - (used & bp->tx_ring_mask);
 }
 
 static inline void bnxt_writeq(struct bnxt *bp, u64 val,
-- 
2.39.2
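
The arithmetic in the rewritten bnxt_tx_avail() can be tried in userspace. The sketch below assumes masked indices, a power-of-two slot count, and a usable size one below the slot count so that "full" and "empty" stay distinguishable after masking. The `toy_*` names and struct layout are hypothetical, and the READ_ONCE()/WRITE_ONCE() macros here are simplified stand-ins for scalars, not the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-ins, NOT the kernel definitions: a volatile access makes
 * the compiler emit exactly one load or store for a scalar. Relies on the
 * GCC/Clang __typeof__ extension. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical ring layout, loosely modelled on the fields the patch uses. */
struct toy_tx_ring {
	uint32_t tx_prod;      /* written only by the xmit path */
	uint32_t tx_cons;      /* written only by the completion path */
	uint32_t tx_ring_size; /* usable descriptors, kept below the slot count */
	uint32_t tx_ring_mask; /* slot count - 1; slot count is a power of two */
};

static void toy_ring_init(struct toy_tx_ring *txr, uint32_t slots)
{
	assert(slots >= 2 && (slots & (slots - 1)) == 0);
	txr->tx_prod = 0;
	txr->tx_cons = 0;
	txr->tx_ring_mask = slots - 1;
	txr->tx_ring_size = slots - 1;
}

/* Same shape as the patched helper: one marked load per index, then mask. */
static uint32_t toy_tx_avail(struct toy_tx_ring *txr)
{
	uint32_t used = READ_ONCE(txr->tx_prod) - READ_ONCE(txr->tx_cons);

	return txr->tx_ring_size - (used & txr->tx_ring_mask);
}

/* Producer: advance the masked index by n and publish it with one store. */
static void toy_produce(struct toy_tx_ring *txr, uint32_t n)
{
	WRITE_ONCE(txr->tx_prod, (txr->tx_prod + n) & txr->tx_ring_mask);
}

/* Consumer: advance the masked index by n and publish it with one store. */
static void toy_complete(struct toy_tx_ring *txr, uint32_t n)
{
	WRITE_ONCE(txr->tx_cons, (txr->tx_cons + n) & txr->tx_ring_mask);
}
```

When the producer index wraps below the consumer index, the unsigned subtraction underflows, and the mask recovers the in-flight count. The caller is assumed never to produce more than toy_tx_avail() allows.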



* [PATCH net-next 3/3] mlx4: use READ_ONCE/WRITE_ONCE for ring indexes
  2023-04-11  1:33 [PATCH net-next 0/3] net: use READ_ONCE/WRITE_ONCE for ring index accesses Jakub Kicinski
  2023-04-11  1:33 ` [PATCH net-next 1/3] net: docs: update the sample code in driver.rst Jakub Kicinski
  2023-04-11  1:33 ` [PATCH net-next 2/3] bnxt: use READ_ONCE/WRITE_ONCE for ring indexes Jakub Kicinski
@ 2023-04-11  1:33 ` Jakub Kicinski
  2023-04-11 15:52   ` Jesse Brandeburg
  2023-04-11 15:55 ` [PATCH net-next 0/3] net: use READ_ONCE/WRITE_ONCE for ring index accesses Eric Dumazet
  3 siblings, 1 reply; 9+ messages in thread
From: Jakub Kicinski @ 2023-04-11  1:33 UTC (permalink / raw)
  To: davem; +Cc: netdev, edumazet, pabeni, Jakub Kicinski, tariqt, linux-rdma

Eric points out that we should make sure that ring index updates
are wrapped in the appropriate READ_ONCE/WRITE_ONCE macros.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: tariqt@nvidia.com
CC: linux-rdma@vger.kernel.org
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 2f79378fbf6e..65cb63f6c465 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -228,7 +228,9 @@ void mlx4_en_deactivate_tx_ring(struct mlx4_en_priv *priv,
 
 static inline bool mlx4_en_is_tx_ring_full(struct mlx4_en_tx_ring *ring)
 {
-	return ring->prod - ring->cons > ring->full_size;
+	u32 used = READ_ONCE(ring->prod) - READ_ONCE(ring->cons);
+
+	return used > ring->full_size;
 }
 
 static void mlx4_en_stamp_wqe(struct mlx4_en_priv *priv,
@@ -1083,7 +1085,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 			op_own |= cpu_to_be32(MLX4_WQE_CTRL_IIP);
 	}
 
-	ring->prod += nr_txbb;
+	WRITE_ONCE(ring->prod, ring->prod + nr_txbb);
 
 	/* If we used a bounce buffer then copy descriptor back into place */
 	if (unlikely(bounce))
@@ -1214,7 +1216,7 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 
 	rx_ring->xdp_tx++;
 
-	ring->prod += MLX4_EN_XDP_TX_NRTXBB;
+	WRITE_ONCE(ring->prod, ring->prod + MLX4_EN_XDP_TX_NRTXBB);
 
 	/* Ensure new descriptor hits memory
 	 * before setting ownership of this descriptor to HW
-- 
2.39.2
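
Unlike the masked indices in the bnxt patch, the mlx4 check subtracts indices that are only ever incremented, so it depends on 32-bit unsigned wraparound. The userspace sketch below illustrates that with C11 relaxed atomics as a rough portable analogue of READ_ONCE()/WRITE_ONCE(). The `toy_*` names and the struct are hypothetical, and relaxed atomics are an assumption of this sketch rather than what the driver uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring with free-running (never masked) indices. */
struct toy_ring {
	_Atomic uint32_t prod; /* written only by the producer */
	_Atomic uint32_t cons; /* written only by the consumer */
	uint32_t full_size;    /* in-flight count above which the ring is full */
};

static void toy_ring_init(struct toy_ring *ring, uint32_t prod,
			  uint32_t cons, uint32_t full_size)
{
	assert(full_size > 0);
	atomic_init(&ring->prod, prod);
	atomic_init(&ring->cons, cons);
	ring->full_size = full_size;
}

/* One marked load per index; the subtraction is correct modulo 2^32. */
static bool toy_ring_full(struct toy_ring *ring)
{
	uint32_t used = atomic_load_explicit(&ring->prod, memory_order_relaxed) -
			atomic_load_explicit(&ring->cons, memory_order_relaxed);

	return used > ring->full_size;
}

/* Single writer per index, so a relaxed load followed by a store suffices. */
static void toy_ring_produce(struct toy_ring *ring, uint32_t n)
{
	uint32_t prod = atomic_load_explicit(&ring->prod, memory_order_relaxed);

	atomic_store_explicit(&ring->prod, prod + n, memory_order_relaxed);
}

static void toy_ring_complete(struct toy_ring *ring, uint32_t n)
{
	uint32_t cons = atomic_load_explicit(&ring->cons, memory_order_relaxed);

	atomic_store_explicit(&ring->cons, cons + n, memory_order_relaxed);
}
```

Starting both indices just below UINT32_MAX shows that the in-flight count stays correct across the wrap, as long as it never exceeds 2^32 - 1.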



* Re: [PATCH net-next 1/3] net: docs: update the sample code in driver.rst
  2023-04-11  1:33 ` [PATCH net-next 1/3] net: docs: update the sample code in driver.rst Jakub Kicinski
@ 2023-04-11 15:52   ` Jesse Brandeburg
  0 siblings, 0 replies; 9+ messages in thread
From: Jesse Brandeburg @ 2023-04-11 15:52 UTC (permalink / raw)
  To: Jakub Kicinski, davem; +Cc: netdev, edumazet, pabeni

On 4/10/2023 6:33 PM, Jakub Kicinski wrote:
> The sample code talks about single-queue devices and uses locks.
> Update it to something resembling more modern code.
> Make sure we mention use of READ_ONCE() / WRITE_ONCE().
> 
> Change the comment which talked about consumer on the xmit side.
> AFAIU xmit is the producer and completions are a consumer.
> 
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Much better docs, and I'm really happy to see more documentation about
using (READ|WRITE)_ONCE().

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>




* Re: [PATCH net-next 2/3] bnxt: use READ_ONCE/WRITE_ONCE for ring indexes
  2023-04-11  1:33 ` [PATCH net-next 2/3] bnxt: use READ_ONCE/WRITE_ONCE for ring indexes Jakub Kicinski
@ 2023-04-11 15:52   ` Jesse Brandeburg
  2023-04-11 23:09   ` Michael Chan
  1 sibling, 0 replies; 9+ messages in thread
From: Jesse Brandeburg @ 2023-04-11 15:52 UTC (permalink / raw)
  To: Jakub Kicinski, davem; +Cc: netdev, edumazet, pabeni, michael.chan

On 4/10/2023 6:33 PM, Jakub Kicinski wrote:
> Eric points out that we should make sure that ring index updates
> are wrapped in the appropriate READ_ONCE/WRITE_ONCE macros.
> 
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: michael.chan@broadcom.com

Looks good, thanks!

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next 3/3] mlx4: use READ_ONCE/WRITE_ONCE for ring indexes
  2023-04-11  1:33 ` [PATCH net-next 3/3] mlx4: " Jakub Kicinski
@ 2023-04-11 15:52   ` Jesse Brandeburg
  0 siblings, 0 replies; 9+ messages in thread
From: Jesse Brandeburg @ 2023-04-11 15:52 UTC (permalink / raw)
  To: Jakub Kicinski, davem; +Cc: netdev, edumazet, pabeni, tariqt, linux-rdma

On 4/10/2023 6:33 PM, Jakub Kicinski wrote:
> Eric points out that we should make sure that ring index updates
> are wrapped in the appropriate READ_ONCE/WRITE_ONCE macros.
> 
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: tariqt@nvidia.com
> CC: linux-rdma@vger.kernel.org

There are a bunch of these issues in the intel directory too. KCSAN was
able to start pointing out some of them, but I haven't gotten to fixing
them all. :-(

For this one, looks great!

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>





* Re: [PATCH net-next 0/3] net: use READ_ONCE/WRITE_ONCE for ring index accesses
  2023-04-11  1:33 [PATCH net-next 0/3] net: use READ_ONCE/WRITE_ONCE for ring index accesses Jakub Kicinski
                   ` (2 preceding siblings ...)
  2023-04-11  1:33 ` [PATCH net-next 3/3] mlx4: " Jakub Kicinski
@ 2023-04-11 15:55 ` Eric Dumazet
  3 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2023-04-11 15:55 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, pabeni

On Tue, Apr 11, 2023 at 3:33 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Small follow-up to the lockless ring stop/start macros.
> Update the docs and the drivers as suggested by Eric:
> https://lore.kernel.org/all/CANn89iJrBGSybMX1FqrhCEMWT3Nnz2=2+aStsbbwpWzKHjk51g@mail.gmail.com/
>
> Jakub Kicinski (3):
>   net: docs: update the sample code in driver.rst
>   bnxt: use READ_ONCE/WRITE_ONCE for ring indexes
>   mlx4: use READ_ONCE/WRITE_ONCE for ring indexes
>

Reviewed-by: Eric Dumazet <edumazet@google.com>

Thanks.


* Re: [PATCH net-next 2/3] bnxt: use READ_ONCE/WRITE_ONCE for ring indexes
  2023-04-11  1:33 ` [PATCH net-next 2/3] bnxt: use READ_ONCE/WRITE_ONCE for ring indexes Jakub Kicinski
  2023-04-11 15:52   ` Jesse Brandeburg
@ 2023-04-11 23:09   ` Michael Chan
  1 sibling, 0 replies; 9+ messages in thread
From: Michael Chan @ 2023-04-11 23:09 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, edumazet, pabeni


On Mon, Apr 10, 2023 at 6:33 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> Eric points out that we should make sure that ring index updates
> are wrapped in the appropriate READ_ONCE/WRITE_ONCE macros.
>
> Suggested-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Looks good.  But there is similar logic to update and check the TX ring
index in bnxt_xdp.c.  I think we should make the same change there for
consistency.

Thanks.


