* [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues
@ 2024-06-04 13:21 Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 13:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
	larysa.zaremba, jacob.e.keller, Maciej Fijalkowski

Hi,

The changes included in this patchset address an issue that a customer
has been facing when AF_XDP ZC Tx sockets were used in combination with
flow control and regular Tx traffic.

After executing:
ethtool --set-priv-flags $dev link-down-on-close on
ethtool -A $dev rx on tx on

launching multiple ZC Tx sockets on $dev and pinging the remote
interface (so that regular Tx traffic is present), and then taking $dev
through a down/up cycle, a Tx timeout occurred and most of the time the
ice driver was unable to recover from that state.

These patches combined solve the issue described above on the customer
side. The main focus here is to forbid producing Tx descriptors when
either the carrier is not yet up or the process of bringing the
interface down has already started.
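
In code terms, the core of the fix (see patch 1 below) is a readiness
guard on the ZC Tx path. A minimal sketch of the idea, with
ice_netif_ready() being a hypothetical helper (the series open-codes
the check):

static bool ice_netif_ready(struct net_device *netdev)
{
	/* produce Tx descriptors only when the netdev is usable */
	return netif_carrier_ok(netdev) && netif_running(netdev);
}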

Thanks,
Maciej

v2->v3:
- drop redundant braces in patch 3 [Shannon]
- fix naming of ice_xsk_pool() in patch 6 [Shannon]
- add review and test tags

v1->v2:
- fix kdoc issues in patch 6 and 8
- drop Larysa's patches for now

Maciej Fijalkowski (7):
  ice: don't busy wait for Rx queue disable in ice_qp_dis()
  ice: replace synchronize_rcu with synchronize_net
  ice: modify error handling when setting XSK pool in ndo_bpf
  ice: toggle netif_carrier when setting up XSK pool
  ice: improve updating ice_{t,r}x_ring::xsk_pool
  ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
  ice: xsk: fix txq interrupt mapping

Michal Kubiak (1):
  ice: respect netif readiness in AF_XDP ZC related ndo's

 drivers/net/ethernet/intel/ice/ice.h      |  11 +-
 drivers/net/ethernet/intel/ice/ice_base.c |   4 +-
 drivers/net/ethernet/intel/ice/ice_main.c |   2 +-
 drivers/net/ethernet/intel/ice/ice_txrx.c |   6 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c  | 153 +++++++++++++---------
 drivers/net/ethernet/intel/ice/ice_xsk.h  |   4 +-
 6 files changed, 104 insertions(+), 76 deletions(-)

-- 
2.34.1



* [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
  2024-06-04 13:21 [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
@ 2024-06-04 13:21 ` Maciej Fijalkowski
  2024-06-11 11:59   ` [Intel-wired-lan] " Alexander Lobakin
  2024-06-04 13:21 ` [PATCH v3 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis() Maciej Fijalkowski
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 13:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
	larysa.zaremba, jacob.e.keller, Shannon Nelson,
	Chandan Kumar Rout, Maciej Fijalkowski

From: Michal Kubiak <michal.kubiak@intel.com>

Address a scenario in which XSK ZC Tx produces descriptors to the XDP Tx
ring when the link is either not yet fully initialized or the process of
stopping the netdev has already started. To avoid this, add checks for
carrier readiness in ice_xsk_wakeup() and in ice_xmit_zc(). One could
argue that bailing out early in ice_xsk_wakeup() would be sufficient,
but given that we produce Tx descriptors on behalf of NAPI triggered by
Rx traffic, the latter is also needed.

Bringing the link up is an asynchronous event executed within
ice_service_task, so even after the interface has been brought up there
is still a time frame in which the link is not yet ready.

Without this patch, when AF_XDP ZC Tx is used simultaneously with stack
Tx, Tx timeouts occur after a link flap (admin brings the interface down
and then up again). HW seems unable to transmit descriptors to the wire
after the HW tail register bump, which in turn causes the
__QUEUE_STATE_STACK_XOFF bit to be set forever, as
netdev_tx_completed_queue() sees no cleaned bytes on the input.
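
For reference, the BQL bookkeeping behind that last sentence,
paraphrased from the netdev_tx_*_queue() helpers in
include/linux/netdevice.h (ice_txq_clean() is a hypothetical cleanup
routine):

/* the xmit side charged bytes earlier via netdev_tx_sent_queue(),
 * which may set __QUEUE_STATE_STACK_XOFF once enough bytes are in
 * flight */
static void ice_txq_clean(struct netdev_queue *txq,
			  unsigned int pkts, unsigned int bytes)
{
	/* credits completed work back and clears the XOFF bit; if it
	 * keeps seeing pkts == bytes == 0, the queue stays stopped */
	netdev_tx_completed_queue(txq, pkts, bytes);
}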

Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 2015f66b0cf9..1bd4b054dd80 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -1048,6 +1048,10 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
 
 	ice_clean_xdp_irq_zc(xdp_ring);
 
+	if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
+	    !netif_running(xdp_ring->vsi->netdev))
+		return true;
+
 	budget = ICE_DESC_UNUSED(xdp_ring);
 	budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
 
@@ -1091,7 +1095,7 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
 	struct ice_vsi *vsi = np->vsi;
 	struct ice_tx_ring *ring;
 
-	if (test_bit(ICE_VSI_DOWN, vsi->state))
+	if (test_bit(ICE_VSI_DOWN, vsi->state) || !netif_carrier_ok(netdev))
 		return -ENETDOWN;
 
 	if (!ice_is_xdp_ena_vsi(vsi))
-- 
2.34.1



* [PATCH v3 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis()
  2024-06-04 13:21 [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
@ 2024-06-04 13:21 ` Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net Maciej Fijalkowski
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 13:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
	larysa.zaremba, jacob.e.keller, Maciej Fijalkowski,
	Shannon Nelson, Chandan Kumar Rout

When the ice driver is spammed with multiple xdpsock instances and flow
control is enabled, there are cases where the Rx queue gets stuck and is
unable to reflect the disabled state in the QRX_CTRL register. A similar
issue has previously been addressed in commit 13a6233b033f ("ice: Add
support to enable/disable all Rx queues before waiting").

To work around this, simply do not wait for the disabled state, as a
later patch will make sure that, regardless of any error encountered
while disabling a queue pair, the Rx queue will be enabled.
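
For context, the wait being dropped boils down to polling the
queue-enable status bit in QRX_CTRL; a simplified sketch (register
names as in the ice register layout, retry count approximate):

static int ice_rxq_wait_disabled(struct ice_hw *hw, int pf_q)
{
	int i;

	for (i = 0; i < 5; i++) {
		if (!(rd32(hw, QRX_CTRL(pf_q)) & QRX_CTRL_QENA_STAT_M))
			return 0;	/* HW confirmed queue disabled */
		usleep_range(20, 40);
	}
	return -ETIMEDOUT;	/* the "stuck" case tolerated from now on */
}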

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 1bd4b054dd80..4f606a1055b0 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -199,10 +199,8 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
 		if (err)
 			return err;
 	}
-	err = ice_vsi_ctrl_one_rx_ring(vsi, false, q_idx, true);
-	if (err)
-		return err;
 
+	ice_vsi_ctrl_one_rx_ring(vsi, false, q_idx, false);
 	ice_qp_clean_rings(vsi, q_idx);
 	ice_qp_reset_stats(vsi, q_idx);
 
-- 
2.34.1



* [PATCH v3 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net
  2024-06-04 13:21 [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis() Maciej Fijalkowski
@ 2024-06-04 13:21 ` Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf Maciej Fijalkowski
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 13:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
	larysa.zaremba, jacob.e.keller, Maciej Fijalkowski,
	Shannon Nelson, Chandan Kumar Rout

Given that ice_qp_dis() is called under rtnl_lock, synchronize_net() can
be called instead of synchronize_rcu() so that XDP rings can finish
their job faster. Also, do this earlier in the XSK queue disable flow.

Additionally, stop the regular Tx queue before disabling IRQs and NAPI.
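
The speedup comes from the expedited RCU path that synchronize_net()
takes under rtnl; paraphrased from net/core/dev.c:

void synchronize_net(void)
{
	might_sleep();
	if (rtnl_is_locked())
		synchronize_rcu_expedited();	/* our case: under rtnl */
	else
		synchronize_rcu();
}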

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 4f606a1055b0..b6f4ddb744d4 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -52,10 +52,8 @@ static void ice_qp_reset_stats(struct ice_vsi *vsi, u16 q_idx)
 static void ice_qp_clean_rings(struct ice_vsi *vsi, u16 q_idx)
 {
 	ice_clean_tx_ring(vsi->tx_rings[q_idx]);
-	if (ice_is_xdp_ena_vsi(vsi)) {
-		synchronize_rcu();
+	if (ice_is_xdp_ena_vsi(vsi))
 		ice_clean_tx_ring(vsi->xdp_rings[q_idx]);
-	}
 	ice_clean_rx_ring(vsi->rx_rings[q_idx]);
 }
 
@@ -180,11 +178,12 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
 		usleep_range(1000, 2000);
 	}
 
+	synchronize_net();
+	netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
+
 	ice_qvec_dis_irq(vsi, rx_ring, q_vector);
 	ice_qvec_toggle_napi(vsi, q_vector, false);
 
-	netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
-
 	ice_fill_txq_meta(vsi, tx_ring, &txq_meta);
 	err = ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, 0, tx_ring, &txq_meta);
 	if (err)
-- 
2.34.1



* [PATCH v3 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf
  2024-06-04 13:21 [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
                   ` (2 preceding siblings ...)
  2024-06-04 13:21 ` [PATCH v3 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net Maciej Fijalkowski
@ 2024-06-04 13:21 ` Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool Maciej Fijalkowski
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 13:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
	larysa.zaremba, jacob.e.keller, Maciej Fijalkowski,
	Shannon Nelson, Chandan Kumar Rout

Don't bail out immediately when spotting an error within
ice_qp_{dis,ena}(); instead, track the error and go through the whole
flow of disabling and enabling the queue pair.
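
The resulting pattern, sketched with hypothetical step_one()/step_two()
standing in for the real ice_vsi_* helpers:

static int step_one(void);	/* hypothetical disable/enable steps */
static int step_two(void);

static int qp_toggle(void)
{
	int fail = 0;
	int err;

	err = step_one();
	if (!fail)		/* remember only the first error */
		fail = err;

	err = step_two();	/* still runs even if step_one() failed */
	if (!fail)
		fail = err;

	return fail;		/* first error seen, or 0 */
}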

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 30 +++++++++++++-----------
 1 file changed, 16 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index b6f4ddb744d4..8ca7ff1a7e7f 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -162,6 +162,7 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
 	struct ice_tx_ring *tx_ring;
 	struct ice_rx_ring *rx_ring;
 	int timeout = 50;
+	int fail = 0;
 	int err;
 
 	if (q_idx >= vsi->num_rxq || q_idx >= vsi->num_txq)
@@ -186,8 +187,8 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
 
 	ice_fill_txq_meta(vsi, tx_ring, &txq_meta);
 	err = ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, 0, tx_ring, &txq_meta);
-	if (err)
-		return err;
+	if (!fail)
+		fail = err;
 	if (ice_is_xdp_ena_vsi(vsi)) {
 		struct ice_tx_ring *xdp_ring = vsi->xdp_rings[q_idx];
 
@@ -195,15 +196,15 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
 		ice_fill_txq_meta(vsi, xdp_ring, &txq_meta);
 		err = ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, 0, xdp_ring,
 					   &txq_meta);
-		if (err)
-			return err;
+		if (!fail)
+			fail = err;
 	}
 
 	ice_vsi_ctrl_one_rx_ring(vsi, false, q_idx, false);
 	ice_qp_clean_rings(vsi, q_idx);
 	ice_qp_reset_stats(vsi, q_idx);
 
-	return 0;
+	return fail;
 }
 
 /**
@@ -216,32 +217,33 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
 static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
 {
 	struct ice_q_vector *q_vector;
+	int fail = 0;
 	int err;
 
 	err = ice_vsi_cfg_single_txq(vsi, vsi->tx_rings, q_idx);
-	if (err)
-		return err;
+	if (!fail)
+		fail = err;
 
 	if (ice_is_xdp_ena_vsi(vsi)) {
 		struct ice_tx_ring *xdp_ring = vsi->xdp_rings[q_idx];
 
 		err = ice_vsi_cfg_single_txq(vsi, vsi->xdp_rings, q_idx);
-		if (err)
-			return err;
+		if (!fail)
+			fail = err;
 		ice_set_ring_xdp(xdp_ring);
 		ice_tx_xsk_pool(vsi, q_idx);
 	}
 
 	err = ice_vsi_cfg_single_rxq(vsi, q_idx);
-	if (err)
-		return err;
+	if (!fail)
+		fail = err;
 
 	q_vector = vsi->rx_rings[q_idx]->q_vector;
 	ice_qvec_cfg_msix(vsi, q_vector);
 
 	err = ice_vsi_ctrl_one_rx_ring(vsi, true, q_idx, true);
-	if (err)
-		return err;
+	if (!fail)
+		fail = err;
 
 	ice_qvec_toggle_napi(vsi, q_vector, true);
 	ice_qvec_ena_irq(vsi, q_vector);
@@ -249,7 +251,7 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
 	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
 	clear_bit(ICE_CFG_BUSY, vsi->state);
 
-	return 0;
+	return fail;
 }
 
 /**
-- 
2.34.1



* [PATCH v3 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool
  2024-06-04 13:21 [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
                   ` (3 preceding siblings ...)
  2024-06-04 13:21 ` [PATCH v3 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf Maciej Fijalkowski
@ 2024-06-04 13:21 ` Maciej Fijalkowski
  2024-06-26 12:21   ` Michal Kubiak
  2024-06-04 13:21 ` [PATCH v3 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool Maciej Fijalkowski
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 13:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
	larysa.zaremba, jacob.e.keller, Maciej Fijalkowski,
	Shannon Nelson, Chandan Kumar Rout

This prevents Tx timeout issues. One of the conditions checked by
dev_watchdog(), which runs in the background, is netif_carrier_ok(), so
turn the carrier off when disabling the queues that belong to a
q_vector on which an XSK pool is being configured.
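
For reference, the guard in dev_watchdog() (net/sched/sch_generic.c,
condensed into a hypothetical helper) only lets a Tx timeout fire while
the carrier is on, which is what makes netif_carrier_off() effective
here:

static bool watchdog_may_fire(const struct net_device *dev)
{
	/* only then does the watchdog scan queues for stale ones */
	return netif_device_present(dev) &&
	       netif_running(dev) &&
	       netif_carrier_ok(dev);
}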

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 8ca7ff1a7e7f..21df6888961b 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -180,6 +180,7 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
 	}
 
 	synchronize_net();
+	netif_carrier_off(vsi->netdev);
 	netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
 
 	ice_qvec_dis_irq(vsi, rx_ring, q_vector);
@@ -249,6 +250,7 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
 	ice_qvec_ena_irq(vsi, q_vector);
 
 	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
+	netif_carrier_on(vsi->netdev);
 	clear_bit(ICE_CFG_BUSY, vsi->state);
 
 	return fail;
-- 
2.34.1



* [PATCH v3 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool
  2024-06-04 13:21 [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
                   ` (4 preceding siblings ...)
  2024-06-04 13:21 ` [PATCH v3 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool Maciej Fijalkowski
@ 2024-06-04 13:21 ` Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 8/8] ice: xsk: fix txq interrupt mapping Maciej Fijalkowski
  7 siblings, 0 replies; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 13:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
	larysa.zaremba, jacob.e.keller, Maciej Fijalkowski,
	Shannon Nelson, Chandan Kumar Rout

xsk_buff_pool pointers that the ice ring structs hold are updated via
ndo_bpf, which executes in process context, while they can be read at
the same time by a remote CPU within NAPI poll. Use synchronize_net()
after a pointer update and {READ,WRITE}_ONCE() when working with the
pointer.
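
The protocol, sketched with a hypothetical my_ring type and
clean_zc()/clean_copy() helpers (the writer runs in process context
under rtnl, readers are NAPI polls possibly running on remote CPUs):

struct my_ring {
	struct xsk_buff_pool *xsk_pool;	/* hypothetical, pool only */
};

static bool clean_zc(struct my_ring *r, struct xsk_buff_pool *pool);
static bool clean_copy(struct my_ring *r);

/* writer side, ndo_bpf */
static void ring_set_pool(struct my_ring *r, struct xsk_buff_pool *pool)
{
	WRITE_ONCE(r->xsk_pool, pool);	/* publish new pool (or NULL) */
	synchronize_net();		/* wait out in-flight NAPI polls */
}

/* reader side, NAPI: snapshot once per poll, never re-read mid-run */
static bool ring_poll(struct my_ring *r)
{
	struct xsk_buff_pool *pool = READ_ONCE(r->xsk_pool);

	return pool ? clean_zc(r, pool) : clean_copy(r);
}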

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h      | 11 ++--
 drivers/net/ethernet/intel/ice/ice_base.c |  4 +-
 drivers/net/ethernet/intel/ice/ice_main.c |  2 +-
 drivers/net/ethernet/intel/ice/ice_txrx.c |  4 +-
 drivers/net/ethernet/intel/ice/ice_xsk.c  | 78 ++++++++++++++---------
 drivers/net/ethernet/intel/ice/ice_xsk.h  |  4 +-
 6 files changed, 61 insertions(+), 42 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index da8c8afebc93..9f275f398802 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -765,18 +765,17 @@ static inline struct xsk_buff_pool *ice_get_xp_from_qid(struct ice_vsi *vsi,
 }
 
 /**
- * ice_xsk_pool - get XSK buffer pool bound to a ring
+ * ice_rx_xsk_pool - assign XSK buff pool to Rx ring
  * @ring: Rx ring to use
  *
- * Returns a pointer to xsk_buff_pool structure if there is a buffer pool
- * present, NULL otherwise.
+ * Sets XSK buff pool pointer on Rx ring.
  */
-static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_rx_ring *ring)
+static inline void ice_rx_xsk_pool(struct ice_rx_ring *ring)
 {
 	struct ice_vsi *vsi = ring->vsi;
 	u16 qid = ring->q_index;
 
-	return ice_get_xp_from_qid(vsi, qid);
+	WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
 }
 
 /**
@@ -801,7 +800,7 @@ static inline void ice_tx_xsk_pool(struct ice_vsi *vsi, u16 qid)
 	if (!ring)
 		return;
 
-	ring->xsk_pool = ice_get_xp_from_qid(vsi, qid);
+	WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
 }
 
 /**
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
index 5d396c1a7731..1facf179a96f 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -536,7 +536,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 				return err;
 		}
 
-		ring->xsk_pool = ice_xsk_pool(ring);
+		ice_rx_xsk_pool(ring);
 		if (ring->xsk_pool) {
 			xdp_rxq_info_unreg(&ring->xdp_rxq);
 
@@ -597,7 +597,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
 			return 0;
 		}
 
-		ok = ice_alloc_rx_bufs_zc(ring, num_bufs);
+		ok = ice_alloc_rx_bufs_zc(ring, ring->xsk_pool, num_bufs);
 		if (!ok) {
 			u16 pf_q = ring->vsi->rxq_map[ring->q_index];
 
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 1b61ca3a6eb6..15a6805ac2a1 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2946,7 +2946,7 @@ static void ice_vsi_rx_napi_schedule(struct ice_vsi *vsi)
 	ice_for_each_rxq(vsi, i) {
 		struct ice_rx_ring *rx_ring = vsi->rx_rings[i];
 
-		if (rx_ring->xsk_pool)
+		if (READ_ONCE(rx_ring->xsk_pool))
 			napi_schedule(&rx_ring->q_vector->napi);
 	}
 }
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 8bb743f78fcb..f4b2b1bca234 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1523,7 +1523,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
 	ice_for_each_tx_ring(tx_ring, q_vector->tx) {
 		bool wd;
 
-		if (tx_ring->xsk_pool)
+		if (READ_ONCE(tx_ring->xsk_pool))
 			wd = ice_xmit_zc(tx_ring);
 		else if (ice_ring_is_xdp(tx_ring))
 			wd = true;
@@ -1556,7 +1556,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
 		 * comparison in the irq context instead of many inside the
 		 * ice_clean_rx_irq function and makes the codebase cleaner.
 		 */
-		cleaned = rx_ring->xsk_pool ?
+		cleaned = READ_ONCE(rx_ring->xsk_pool) ?
 			  ice_clean_rx_irq_zc(rx_ring, budget_per_ring) :
 			  ice_clean_rx_irq(rx_ring, budget_per_ring);
 		work_done += cleaned;
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 21df6888961b..b7c291fce53c 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -249,6 +249,8 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
 	ice_qvec_toggle_napi(vsi, q_vector, true);
 	ice_qvec_ena_irq(vsi, q_vector);
 
+	/* make sure NAPI sees updated ice_{t,x}_ring::xsk_pool */
+	synchronize_net();
 	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
 	netif_carrier_on(vsi->netdev);
 	clear_bit(ICE_CFG_BUSY, vsi->state);
@@ -460,6 +462,7 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
 /**
  * __ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
  * @rx_ring: Rx ring
+ * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
  * @count: The number of buffers to allocate
  *
  * Place the @count of descriptors onto Rx ring. Handle the ring wrap
@@ -468,7 +471,8 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
  *
  * Returns true if all allocations were successful, false if any fail.
  */
-static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
+static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
+				   struct xsk_buff_pool *xsk_pool, u16 count)
 {
 	u32 nb_buffs_extra = 0, nb_buffs = 0;
 	union ice_32b_rx_flex_desc *rx_desc;
@@ -480,8 +484,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
 	xdp = ice_xdp_buf(rx_ring, ntu);
 
 	if (ntu + count >= rx_ring->count) {
-		nb_buffs_extra = ice_fill_rx_descs(rx_ring->xsk_pool, xdp,
-						   rx_desc,
+		nb_buffs_extra = ice_fill_rx_descs(xsk_pool, xdp, rx_desc,
 						   rx_ring->count - ntu);
 		if (nb_buffs_extra != rx_ring->count - ntu) {
 			ntu += nb_buffs_extra;
@@ -494,7 +497,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
 		ice_release_rx_desc(rx_ring, 0);
 	}
 
-	nb_buffs = ice_fill_rx_descs(rx_ring->xsk_pool, xdp, rx_desc, count);
+	nb_buffs = ice_fill_rx_descs(xsk_pool, xdp, rx_desc, count);
 
 	ntu += nb_buffs;
 	if (ntu == rx_ring->count)
@@ -510,6 +513,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
 /**
  * ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
  * @rx_ring: Rx ring
+ * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
  * @count: The number of buffers to allocate
  *
  * Wrapper for internal allocation routine; figure out how many tail
@@ -517,7 +521,8 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
  *
  * Returns true if all calls to internal alloc routine succeeded
  */
-bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
+bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
+			  struct xsk_buff_pool *xsk_pool, u16 count)
 {
 	u16 rx_thresh = ICE_RING_QUARTER(rx_ring);
 	u16 leftover, i, tail_bumps;
@@ -526,9 +531,9 @@ bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
 	leftover = count - (tail_bumps * rx_thresh);
 
 	for (i = 0; i < tail_bumps; i++)
-		if (!__ice_alloc_rx_bufs_zc(rx_ring, rx_thresh))
+		if (!__ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, rx_thresh))
 			return false;
-	return __ice_alloc_rx_bufs_zc(rx_ring, leftover);
+	return __ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, leftover);
 }
 
 /**
@@ -649,7 +654,7 @@ static u32 ice_clean_xdp_irq_zc(struct ice_tx_ring *xdp_ring)
 	if (xdp_ring->next_to_clean >= cnt)
 		xdp_ring->next_to_clean -= cnt;
 	if (xsk_frames)
-		xsk_tx_completed(xdp_ring->xsk_pool, xsk_frames);
+		xsk_tx_completed(READ_ONCE(xdp_ring->xsk_pool), xsk_frames);
 
 	return completed_frames;
 }
@@ -701,7 +706,8 @@ static int ice_xmit_xdp_tx_zc(struct xdp_buff *xdp,
 		dma_addr_t dma;
 
 		dma = xsk_buff_xdp_get_dma(xdp);
-		xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, size);
+		xsk_buff_raw_dma_sync_for_device(READ_ONCE(xdp_ring->xsk_pool),
+						 dma, size);
 
 		tx_buf->xdp = xdp;
 		tx_buf->type = ICE_TX_BUF_XSK_TX;
@@ -759,7 +765,8 @@ ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
 		err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
 		if (!err)
 			return ICE_XDP_REDIR;
-		if (xsk_uses_need_wakeup(rx_ring->xsk_pool) && err == -ENOBUFS)
+		if (xsk_uses_need_wakeup(READ_ONCE(rx_ring->xsk_pool)) &&
+		    err == -ENOBUFS)
 			result = ICE_XDP_EXIT;
 		else
 			result = ICE_XDP_CONSUMED;
@@ -828,8 +835,8 @@ ice_add_xsk_frag(struct ice_rx_ring *rx_ring, struct xdp_buff *first,
  */
 int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 {
+	struct xsk_buff_pool *xsk_pool = READ_ONCE(rx_ring->xsk_pool);
 	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
-	struct xsk_buff_pool *xsk_pool = rx_ring->xsk_pool;
 	u32 ntc = rx_ring->next_to_clean;
 	u32 ntu = rx_ring->next_to_use;
 	struct xdp_buff *first = NULL;
@@ -941,7 +948,8 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 	rx_ring->next_to_clean = ntc;
 	entries_to_alloc = ICE_RX_DESC_UNUSED(rx_ring);
 	if (entries_to_alloc > ICE_RING_QUARTER(rx_ring))
-		failure |= !ice_alloc_rx_bufs_zc(rx_ring, entries_to_alloc);
+		failure |= !ice_alloc_rx_bufs_zc(rx_ring, xsk_pool,
+						 entries_to_alloc);
 
 	ice_finalize_xdp_rx(xdp_ring, xdp_xmit, 0);
 	ice_update_rx_ring_stats(rx_ring, total_rx_packets, total_rx_bytes);
@@ -964,17 +972,19 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
 /**
  * ice_xmit_pkt - produce a single HW Tx descriptor out of AF_XDP descriptor
  * @xdp_ring: XDP ring to produce the HW Tx descriptor on
+ * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
  * @desc: AF_XDP descriptor to pull the DMA address and length from
  * @total_bytes: bytes accumulator that will be used for stats update
  */
-static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
+static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring,
+			 struct xsk_buff_pool *xsk_pool, struct xdp_desc *desc,
 			 unsigned int *total_bytes)
 {
 	struct ice_tx_desc *tx_desc;
 	dma_addr_t dma;
 
-	dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, desc->addr);
-	xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, desc->len);
+	dma = xsk_buff_raw_get_dma(xsk_pool, desc->addr);
+	xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, desc->len);
 
 	tx_desc = ICE_TX_DESC(xdp_ring, xdp_ring->next_to_use++);
 	tx_desc->buf_addr = cpu_to_le64(dma);
@@ -987,10 +997,13 @@ static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
 /**
  * ice_xmit_pkt_batch - produce a batch of HW Tx descriptors out of AF_XDP descriptors
  * @xdp_ring: XDP ring to produce the HW Tx descriptors on
+ * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
  * @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
  * @total_bytes: bytes accumulator that will be used for stats update
  */
-static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
+static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring,
+			       struct xsk_buff_pool *xsk_pool,
+			       struct xdp_desc *descs,
 			       unsigned int *total_bytes)
 {
 	u16 ntu = xdp_ring->next_to_use;
@@ -1000,8 +1013,8 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
 	loop_unrolled_for(i = 0; i < PKTS_PER_BATCH; i++) {
 		dma_addr_t dma;
 
-		dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, descs[i].addr);
-		xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, descs[i].len);
+		dma = xsk_buff_raw_get_dma(xsk_pool, descs[i].addr);
+		xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, descs[i].len);
 
 		tx_desc = ICE_TX_DESC(xdp_ring, ntu++);
 		tx_desc->buf_addr = cpu_to_le64(dma);
@@ -1017,21 +1030,24 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
 /**
  * ice_fill_tx_hw_ring - produce the number of Tx descriptors onto ring
  * @xdp_ring: XDP ring to produce the HW Tx descriptors on
+ * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
  * @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
  * @nb_pkts: count of packets to be send
  * @total_bytes: bytes accumulator that will be used for stats update
  */
-static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
-				u32 nb_pkts, unsigned int *total_bytes)
+static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring,
+				struct xsk_buff_pool *xsk_pool,
+				struct xdp_desc *descs, u32 nb_pkts,
+				unsigned int *total_bytes)
 {
 	u32 batched, leftover, i;
 
 	batched = ALIGN_DOWN(nb_pkts, PKTS_PER_BATCH);
 	leftover = nb_pkts & (PKTS_PER_BATCH - 1);
 	for (i = 0; i < batched; i += PKTS_PER_BATCH)
-		ice_xmit_pkt_batch(xdp_ring, &descs[i], total_bytes);
+		ice_xmit_pkt_batch(xdp_ring, xsk_pool, &descs[i], total_bytes);
 	for (; i < batched + leftover; i++)
-		ice_xmit_pkt(xdp_ring, &descs[i], total_bytes);
+		ice_xmit_pkt(xdp_ring, xsk_pool, &descs[i], total_bytes);
 }
 
 /**
@@ -1042,7 +1058,8 @@ static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *d
  */
 bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
 {
-	struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
+	struct xsk_buff_pool *xsk_pool = READ_ONCE(xdp_ring->xsk_pool);
+	struct xdp_desc *descs = xsk_pool->tx_descs;
 	u32 nb_pkts, nb_processed = 0;
 	unsigned int total_bytes = 0;
 	int budget;
@@ -1056,25 +1073,26 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
 	budget = ICE_DESC_UNUSED(xdp_ring);
 	budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
 
-	nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
+	nb_pkts = xsk_tx_peek_release_desc_batch(xsk_pool, budget);
 	if (!nb_pkts)
 		return true;
 
 	if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
 		nb_processed = xdp_ring->count - xdp_ring->next_to_use;
-		ice_fill_tx_hw_ring(xdp_ring, descs, nb_processed, &total_bytes);
+		ice_fill_tx_hw_ring(xdp_ring, xsk_pool, descs, nb_processed,
+				    &total_bytes);
 		xdp_ring->next_to_use = 0;
 	}
 
-	ice_fill_tx_hw_ring(xdp_ring, &descs[nb_processed], nb_pkts - nb_processed,
-			    &total_bytes);
+	ice_fill_tx_hw_ring(xdp_ring, xsk_pool, &descs[nb_processed],
+			    nb_pkts - nb_processed, &total_bytes);
 
 	ice_set_rs_bit(xdp_ring);
 	ice_xdp_ring_update_tail(xdp_ring);
 	ice_update_tx_ring_stats(xdp_ring, nb_pkts, total_bytes);
 
-	if (xsk_uses_need_wakeup(xdp_ring->xsk_pool))
-		xsk_set_tx_need_wakeup(xdp_ring->xsk_pool);
+	if (xsk_uses_need_wakeup(xsk_pool))
+		xsk_set_tx_need_wakeup(xsk_pool);
 
 	return nb_pkts < budget;
 }
@@ -1107,7 +1125,7 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
 
 	ring = vsi->rx_rings[queue_id]->xdp_ring;
 
-	if (!ring->xsk_pool)
+	if (!READ_ONCE(ring->xsk_pool))
 		return -EINVAL;
 
 	/* The idea here is that if NAPI is running, mark a miss, so
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.h b/drivers/net/ethernet/intel/ice/ice_xsk.h
index 6fa181f080ef..4cd2d62a0836 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.h
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.h
@@ -22,7 +22,8 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool,
 		       u16 qid);
 int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget);
 int ice_xsk_wakeup(struct net_device *netdev, u32 queue_id, u32 flags);
-bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count);
+bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
+			  struct xsk_buff_pool *xsk_pool, u16 count);
 bool ice_xsk_any_rx_ring_ena(struct ice_vsi *vsi);
 void ice_xsk_clean_rx_ring(struct ice_rx_ring *rx_ring);
 void ice_xsk_clean_xdp_ring(struct ice_tx_ring *xdp_ring);
@@ -51,6 +52,7 @@ ice_clean_rx_irq_zc(struct ice_rx_ring __always_unused *rx_ring,
 
 static inline bool
 ice_alloc_rx_bufs_zc(struct ice_rx_ring __always_unused *rx_ring,
+		     struct xsk_buff_pool __always_unused *xsk_pool,
 		     u16 __always_unused count)
 {
 	return false;
-- 
2.34.1



* [PATCH v3 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
  2024-06-04 13:21 [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
                   ` (5 preceding siblings ...)
  2024-06-04 13:21 ` [PATCH v3 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool Maciej Fijalkowski
@ 2024-06-04 13:21 ` Maciej Fijalkowski
  2024-06-04 13:21 ` [PATCH v3 iwl-net 8/8] ice: xsk: fix txq interrupt mapping Maciej Fijalkowski
  7 siblings, 0 replies; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 13:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
	larysa.zaremba, jacob.e.keller, Maciej Fijalkowski,
	Shannon Nelson, Chandan Kumar Rout

The pointer is read by the data path and modified from process context
on a remote CPU, so WRITE_ONCE() is needed when clearing it.
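
The paired reader lives on the hot path, roughly:

	struct bpf_prog *xdp_prog = READ_ONCE(rx_ring->xdp_prog);

so the writer has to publish (and clear) with WRITE_ONCE() to keep the
pair balanced.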

Fixes: efc2214b6047 ("ice: Add support for XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index f4b2b1bca234..4c115531beba 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -456,7 +456,7 @@ void ice_free_rx_ring(struct ice_rx_ring *rx_ring)
 	if (rx_ring->vsi->type == ICE_VSI_PF)
 		if (xdp_rxq_info_is_reg(&rx_ring->xdp_rxq))
 			xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
-	rx_ring->xdp_prog = NULL;
+	WRITE_ONCE(rx_ring->xdp_prog, NULL);
 	if (rx_ring->xsk_pool) {
 		kfree(rx_ring->xdp_buf);
 		rx_ring->xdp_buf = NULL;
-- 
2.34.1



* [PATCH v3 iwl-net 8/8] ice: xsk: fix txq interrupt mapping
  2024-06-04 13:21 [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
                   ` (6 preceding siblings ...)
  2024-06-04 13:21 ` [PATCH v3 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog Maciej Fijalkowski
@ 2024-06-04 13:21 ` Maciej Fijalkowski
  7 siblings, 0 replies; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 13:21 UTC (permalink / raw)
  To: intel-wired-lan
  Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
	larysa.zaremba, jacob.e.keller, Maciej Fijalkowski,
	Shannon Nelson, Chandan Kumar Rout

ice_cfg_txq_interrupt() internally handles the XDP Tx ring. Do not use
ice_for_each_tx_ring() in ice_qvec_cfg_msix(), as this causes the XDP
ring that belongs to the queue vector to be treated as a Tx ring and
therefore misconfigures the interrupts.

Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index b7c291fce53c..c064634b932f 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -110,25 +110,29 @@ ice_qvec_dis_irq(struct ice_vsi *vsi, struct ice_rx_ring *rx_ring,
  * ice_qvec_cfg_msix - Enable IRQ for given queue vector
  * @vsi: the VSI that contains queue vector
  * @q_vector: queue vector
+ * @qid: queue index
  */
 static void
-ice_qvec_cfg_msix(struct ice_vsi *vsi, struct ice_q_vector *q_vector)
+ice_qvec_cfg_msix(struct ice_vsi *vsi, struct ice_q_vector *q_vector, u16 qid)
 {
 	u16 reg_idx = q_vector->reg_idx;
 	struct ice_pf *pf = vsi->back;
 	struct ice_hw *hw = &pf->hw;
-	struct ice_tx_ring *tx_ring;
-	struct ice_rx_ring *rx_ring;
+	int q, _qid = qid;
 
 	ice_cfg_itr(hw, q_vector);
 
-	ice_for_each_tx_ring(tx_ring, q_vector->tx)
-		ice_cfg_txq_interrupt(vsi, tx_ring->reg_idx, reg_idx,
-				      q_vector->tx.itr_idx);
+	for (q = 0; q < q_vector->num_ring_tx; q++) {
+		ice_cfg_txq_interrupt(vsi, _qid, reg_idx, q_vector->tx.itr_idx);
+		_qid++;
+	}
 
-	ice_for_each_rx_ring(rx_ring, q_vector->rx)
-		ice_cfg_rxq_interrupt(vsi, rx_ring->reg_idx, reg_idx,
-				      q_vector->rx.itr_idx);
+	_qid = qid;
+
+	for (q = 0; q < q_vector->num_ring_rx; q++) {
+		ice_cfg_rxq_interrupt(vsi, _qid, reg_idx, q_vector->rx.itr_idx);
+		_qid++;
+	}
 
 	ice_flush(hw);
 }
@@ -240,7 +244,7 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
 		fail = err;
 
 	q_vector = vsi->rx_rings[q_idx]->q_vector;
-	ice_qvec_cfg_msix(vsi, q_vector);
+	ice_qvec_cfg_msix(vsi, q_vector, q_idx);
 
 	err = ice_vsi_ctrl_one_rx_ring(vsi, true, q_idx, true);
 	if (!fail)
-- 
2.34.1



* Re: [Intel-wired-lan] [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
  2024-06-04 13:21 ` [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
@ 2024-06-11 11:59   ` Alexander Lobakin
  2024-06-11 14:21     ` Maciej Fijalkowski
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Lobakin @ 2024-06-11 11:59 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: intel-wired-lan, larysa.zaremba, netdev, michal.kubiak,
	anthony.l.nguyen, jacob.e.keller, Chandan Kumar Rout,
	magnus.karlsson, Shannon Nelson

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Tue,  4 Jun 2024 15:21:48 +0200

> From: Michal Kubiak <michal.kubiak@intel.com>
> 
> Address a scenario in which XSK ZC Tx produces descriptors to the XDP Tx
> ring when the link is either not yet fully initialized or the process of
> stopping the netdev has already started. To avoid this, add checks for
> carrier readiness in ice_xsk_wakeup() and in ice_xmit_zc(). One could
> argue that bailing out early in ice_xsk_wakeup() would be sufficient,
> but given that we produce Tx descriptors on behalf of NAPI triggered by
> Rx traffic, the latter is also needed.
> 
> Bringing the link up is an asynchronous event executed within
> ice_service_task, so even after the interface has been brought up there
> is still a time frame in which the link is not yet ready.
> 
> Without this patch, when AF_XDP ZC Tx is used simultaneously with stack
> Tx, Tx timeouts occur after a link flap (admin brings the interface down
> and then up again). HW seems unable to transmit descriptors to the wire
> after the HW tail register bump, which in turn causes the
> __QUEUE_STATE_STACK_XOFF bit to be set forever, as
> netdev_tx_completed_queue() sees no cleaned bytes on the input.
> 
> Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
> Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index 2015f66b0cf9..1bd4b054dd80 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -1048,6 +1048,10 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
>  
>  	ice_clean_xdp_irq_zc(xdp_ring);
>  
> +	if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
> +	    !netif_running(xdp_ring->vsi->netdev))
> +		return true;

Why is it checked after clean_xdp_irq_zc()?
Also, unlikely()?

> +
>  	budget = ICE_DESC_UNUSED(xdp_ring);
>  	budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
>  
> @@ -1091,7 +1095,7 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
>  	struct ice_vsi *vsi = np->vsi;
>  	struct ice_tx_ring *ring;
>  
> -	if (test_bit(ICE_VSI_DOWN, vsi->state))
> +	if (test_bit(ICE_VSI_DOWN, vsi->state) || !netif_carrier_ok(netdev))

Same for unlikely()?

>  		return -ENETDOWN;
>  
>  	if (!ice_is_xdp_ena_vsi(vsi))

Thanks,
Olek


* Re: [Intel-wired-lan] [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
  2024-06-11 11:59   ` [Intel-wired-lan] " Alexander Lobakin
@ 2024-06-11 14:21     ` Maciej Fijalkowski
  2024-06-11 20:13       ` Tony Nguyen
  2024-06-12  9:09       ` Alexander Lobakin
  0 siblings, 2 replies; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-11 14:21 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: intel-wired-lan, larysa.zaremba, netdev, michal.kubiak,
	anthony.l.nguyen, jacob.e.keller, Chandan Kumar Rout,
	magnus.karlsson, Shannon Nelson

On Tue, Jun 11, 2024 at 01:59:37PM +0200, Alexander Lobakin wrote:
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Date: Tue,  4 Jun 2024 15:21:48 +0200
> 
> > From: Michal Kubiak <michal.kubiak@intel.com>
> > 
> > Address a scenario in which XSK ZC Tx produces descriptors to the XDP Tx
> > ring when the link is either not yet fully initialized or the process of
> > stopping the netdev has already started. To avoid this, add checks for
> > carrier readiness in ice_xsk_wakeup() and in ice_xmit_zc(). One could
> > argue that bailing out early in ice_xsk_wakeup() would be sufficient,
> > but given that we produce Tx descriptors on behalf of NAPI triggered by
> > Rx traffic, the latter is also needed.
> > 
> > Bringing the link up is an asynchronous event executed within
> > ice_service_task, so even after the interface has been brought up there
> > is still a time frame in which the link is not yet ready.
> > 
> > Without this patch, when AF_XDP ZC Tx is used simultaneously with stack
> > Tx, Tx timeouts occur after a link flap (admin brings the interface down
> > and then up again). HW seems unable to transmit descriptors to the wire
> > after the HW tail register bump, which in turn causes the
> > __QUEUE_STATE_STACK_XOFF bit to be set forever, as
> > netdev_tx_completed_queue() sees no cleaned bytes on the input.
> > 
> > Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
> > Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> > Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
> > Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
> > Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > index 2015f66b0cf9..1bd4b054dd80 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > @@ -1048,6 +1048,10 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
> >  
> >  	ice_clean_xdp_irq_zc(xdp_ring);
> >  
> > +	if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
> > +	    !netif_running(xdp_ring->vsi->netdev))
> > +		return true;
> 
> Why is it checked after clean_xdp_irq_zc()?

There's nothing wrong with cleaning descriptors that have been sent
previously. We don't touch anything HW or netstack related there, just
bump ntc and produce CQ descriptors; both ops are pure SW things.

> Also, unlikely()?

Thought about that as well but we played it safe first. I wouldn't like to
resubmit the whole series due to this, so maybe Tony/Jake could address it
when sending to netdev, or am I asking for too much? :)

> 
> > +
> >  	budget = ICE_DESC_UNUSED(xdp_ring);
> >  	budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
> >  
> > @@ -1091,7 +1095,7 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
> >  	struct ice_vsi *vsi = np->vsi;
> >  	struct ice_tx_ring *ring;
> >  
> > -	if (test_bit(ICE_VSI_DOWN, vsi->state))
> > +	if (test_bit(ICE_VSI_DOWN, vsi->state) || !netif_carrier_ok(netdev))
> 
> Same for unlikely()?
> 
> >  		return -ENETDOWN;
> >  
> >  	if (!ice_is_xdp_ena_vsi(vsi))
> 
> Thanks,
> Olek


* Re: [Intel-wired-lan] [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
  2024-06-11 14:21     ` Maciej Fijalkowski
@ 2024-06-11 20:13       ` Tony Nguyen
  2024-06-12  9:09       ` Alexander Lobakin
  1 sibling, 0 replies; 19+ messages in thread
From: Tony Nguyen @ 2024-06-11 20:13 UTC (permalink / raw)
  To: Maciej Fijalkowski, Alexander Lobakin
  Cc: intel-wired-lan, larysa.zaremba, netdev, michal.kubiak,
	jacob.e.keller, Chandan Kumar Rout, magnus.karlsson,
	Shannon Nelson



On 6/11/2024 10:21 AM, Maciej Fijalkowski wrote:
> On Tue, Jun 11, 2024 at 01:59:37PM +0200, Alexander Lobakin wrote:

>>>   	ice_clean_xdp_irq_zc(xdp_ring);
>>>   
>>> +	if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
>>> +	    !netif_running(xdp_ring->vsi->netdev))
>>> +		return true;

...

>> Also, unlikely()?
> 
> Thought about that as well but we played it safe first. I wouldn't like to
> resubmit the whole series due to this, so maybe Tony/Jake could address it
> when sending to netdev, or am I asking for too much? :)

I can adjust when sending it on to netdev.

Thanks,
Tony


* Re: [Intel-wired-lan] [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
  2024-06-11 14:21     ` Maciej Fijalkowski
  2024-06-11 20:13       ` Tony Nguyen
@ 2024-06-12  9:09       ` Alexander Lobakin
  2024-06-12  9:15         ` Alexander Lobakin
  1 sibling, 1 reply; 19+ messages in thread
From: Alexander Lobakin @ 2024-06-12  9:09 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: intel-wired-lan, larysa.zaremba, netdev, michal.kubiak,
	anthony.l.nguyen, jacob.e.keller, Chandan Kumar Rout,
	magnus.karlsson, Shannon Nelson

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Tue, 11 Jun 2024 16:21:27 +0200

> On Tue, Jun 11, 2024 at 01:59:37PM +0200, Alexander Lobakin wrote:
>> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>> Date: Tue,  4 Jun 2024 15:21:48 +0200
>>
>>> From: Michal Kubiak <michal.kubiak@intel.com>
>>>
>>> Address a scenario in which XSK ZC Tx produces descriptors to the XDP Tx
>>> ring when the link is either not yet fully initialized or the process of
>>> stopping the netdev has already started. To avoid this, add checks for
>>> carrier readiness in ice_xsk_wakeup() and in ice_xmit_zc(). One could
>>> argue that bailing out early in ice_xsk_wakeup() would be sufficient,
>>> but given that we produce Tx descriptors on behalf of NAPI triggered by
>>> Rx traffic, the latter is also needed.
>>>
>>> Bringing the link up is an asynchronous event executed within
>>> ice_service_task, so even after the interface has been brought up there
>>> is still a time frame in which the link is not yet ready.
>>>
>>> Without this patch, when AF_XDP ZC Tx is used simultaneously with stack
>>> Tx, Tx timeouts occur after a link flap (admin brings the interface down
>>> and then up again). HW seems unable to transmit descriptors to the wire
>>> after the HW tail register bump, which in turn causes the
>>> __QUEUE_STATE_STACK_XOFF bit to be set forever, as
>>> netdev_tx_completed_queue() sees no cleaned bytes on the input.
>>>
>>> Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
>>> Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>>> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
>>> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
>>> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
>>> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>>> ---
>>>  drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++++-
>>>  1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
>>> index 2015f66b0cf9..1bd4b054dd80 100644
>>> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
>>> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
>>> @@ -1048,6 +1048,10 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
>>>  
>>>  	ice_clean_xdp_irq_zc(xdp_ring);
>>>  
>>> +	if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
>>> +	    !netif_running(xdp_ring->vsi->netdev))
>>> +		return true;
>>
>> Why is it checked after clean_xdp_irq_zc()?
> 
> There's nothing wrong with cleaning descriptors that have been sent
> previously. We don't touch anything HW or netstack related there, just
> bump ntc and produce CQ descriptors; both ops are pure SW things.

Sure, but do we need to do that if we don't send anything this time?
Lazy cleaning and all that :p

Thanks,
Olek


* Re: [Intel-wired-lan] [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
  2024-06-12  9:09       ` Alexander Lobakin
@ 2024-06-12  9:15         ` Alexander Lobakin
  2024-06-13 15:51           ` Maciej Fijalkowski
  0 siblings, 1 reply; 19+ messages in thread
From: Alexander Lobakin @ 2024-06-12  9:15 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: intel-wired-lan, larysa.zaremba, netdev, michal.kubiak,
	anthony.l.nguyen, jacob.e.keller, Chandan Kumar Rout,
	magnus.karlsson, Shannon Nelson

From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Wed, 12 Jun 2024 11:09:10 +0200

> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Date: Tue, 11 Jun 2024 16:21:27 +0200

[...]

>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
>>>> index 2015f66b0cf9..1bd4b054dd80 100644
>>>> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
>>>> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
>>>> @@ -1048,6 +1048,10 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
>>>>  
>>>>  	ice_clean_xdp_irq_zc(xdp_ring);
>>>>  
>>>> +	if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
>>>> +	    !netif_running(xdp_ring->vsi->netdev))

Oh BTW, I noticed some time ago that netif_running() is less precise
than checking for %IFF_UP.
For example, in this piece (main netdev ifup function)[0],
netif_running() will start returning true *before* driver's .ndo_open()
is called, but %IFF_UP will be set only after .ndo_open() is done (with
no issues).
That means, I'd check for %IFF_UP honestly (maybe even before checking
the carrier).

[0] https://elixir.bootlin.com/linux/v6.10-rc3/source/net/core/dev.c#L1466
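
Condensed, the ordering in __dev_open() is:

	set_bit(__LINK_STATE_START, &dev->state); /* netif_running() true */
	ret = ops->ndo_open(dev);                 /* driver bring-up */
	if (ret)
		clear_bit(__LINK_STATE_START, &dev->state);
	else
		dev->flags |= IFF_UP;             /* set only on success */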

Thanks,
Olek


* Re: [Intel-wired-lan] [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
  2024-06-12  9:15         ` Alexander Lobakin
@ 2024-06-13 15:51           ` Maciej Fijalkowski
  2024-06-13 16:07             ` Maciej Fijalkowski
  0 siblings, 1 reply; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-13 15:51 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: intel-wired-lan, larysa.zaremba, netdev, michal.kubiak,
	anthony.l.nguyen, jacob.e.keller, Chandan Kumar Rout,
	magnus.karlsson, Shannon Nelson

On Wed, Jun 12, 2024 at 11:15:31AM +0200, Alexander Lobakin wrote:
> From: Alexander Lobakin <aleksander.lobakin@intel.com>
> Date: Wed, 12 Jun 2024 11:09:10 +0200
> 
> > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > Date: Tue, 11 Jun 2024 16:21:27 +0200
> 
> [...]
> 
> >>>> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> >>>> index 2015f66b0cf9..1bd4b054dd80 100644
> >>>> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> >>>> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> >>>> @@ -1048,6 +1048,10 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
> >>>>  
> >>>>  	ice_clean_xdp_irq_zc(xdp_ring);
> >>>>  
> >>>> +	if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
> >>>> +	    !netif_running(xdp_ring->vsi->netdev))
> 
> Oh BTW, I noticed some time ago that netif_running() is less precise
> than checking for %IFF_UP.
> For example, in this piece (main netdev ifup function)[0],
> netif_running() will start returning true *before* driver's .ndo_open()
> is called, but %IFF_UP will be set only after .ndo_open() is done (with
> no issues).

I see, thanks for bringing this up! I'd like to try this out. Tony sorry
for the noise, but it seems I'll go with v4 and will decorate the
mentioned statements with unlikely().

> That means, I'd check for %IFF_UP honestly (maybe even before checking
> the carrier).

I wonder whether it is the ultimate check and two existing ones (that we
are adding in this patch) could be dropped?

> 
> [0] https://elixir.bootlin.com/linux/v6.10-rc3/source/net/core/dev.c#L1466
> 
> Thanks,
> Olek


* Re: [Intel-wired-lan] [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
  2024-06-13 15:51           ` Maciej Fijalkowski
@ 2024-06-13 16:07             ` Maciej Fijalkowski
  2024-06-14  9:34               ` Alexander Lobakin
  0 siblings, 1 reply; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-13 16:07 UTC (permalink / raw)
  To: Alexander Lobakin
  Cc: intel-wired-lan, larysa.zaremba, netdev, michal.kubiak,
	anthony.l.nguyen, jacob.e.keller, Chandan Kumar Rout,
	magnus.karlsson, Shannon Nelson

On Thu, Jun 13, 2024 at 05:51:25PM +0200, Maciej Fijalkowski wrote:
> On Wed, Jun 12, 2024 at 11:15:31AM +0200, Alexander Lobakin wrote:
> > From: Alexander Lobakin <aleksander.lobakin@intel.com>
> > Date: Wed, 12 Jun 2024 11:09:10 +0200
> > 
> > > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > Date: Tue, 11 Jun 2024 16:21:27 +0200
> > 
> > [...]
> > 
> > >>>> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > >>>> index 2015f66b0cf9..1bd4b054dd80 100644
> > >>>> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > >>>> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > >>>> @@ -1048,6 +1048,10 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
> > >>>>  
> > >>>>  	ice_clean_xdp_irq_zc(xdp_ring);
> > >>>>  
> > >>>> +	if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
> > >>>> +	    !netif_running(xdp_ring->vsi->netdev))
> > 
> > Oh BTW, I noticed some time ago that netif_running() is less precise
> > than checking for %IFF_UP.
> > For example, in this piece (main netdev ifup function)[0],
> > netif_running() will start returning true *before* driver's .ndo_open()
> > is called, but %IFF_UP will be set only after .ndo_open() is done (with
> > no issues).
> 
> I see, thanks for bringing this up! I'd like to try this out. Tony, sorry
> for the noise, but it seems I'll go with v4 and will decorate the
> mentioned statements with unlikely().
> 
> > That means, I'd check for %IFF_UP honestly (maybe even before checking
> > the carrier).
> 
> I wonder whether it is the ultimate check and whether the two existing
> ones (that we are adding in this patch) could be dropped?

In the netdev closing path [1], __LINK_STATE_START is cleared before IFF_UP.
What we were initially experiencing when the netif_running() check wasn't
in place was that xsk was producing a bunch of Tx descs while the device
was being brought down. So let me keep what we have here and add an
IFF_UP check on top. Better safe than sorry, as the bug we were dealing
with was pretty nasty.
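
For reference, the relevant ordering in __dev_close_many() at [1] is
roughly the following (a trimmed paraphrase of net/core/dev.c, not a
verbatim quote):

	/* pass 1: netif_running() starts returning false here */
	clear_bit(__LINK_STATE_START, &dev->state);
	smp_mb__after_atomic(); /* commit netif_running() */

	dev_deactivate_many(head);

	/* pass 2: ndo_stop() runs while IFF_UP is still set */
	ops->ndo_stop(dev);
	dev->flags &= ~IFF_UP;

So on the down path netif_running() flips first, which is exactly the
window we care about.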

> 
> > 
> > [0] https://elixir.bootlin.com/linux/v6.10-rc3/source/net/core/dev.c#L1466

[1]: https://elixir.bootlin.com/linux/v6.10-rc3/source/net/core/dev.c#L1532

> > 
> > Thanks,
> > Olek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Intel-wired-lan] [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
  2024-06-13 16:07             ` Maciej Fijalkowski
@ 2024-06-14  9:34               ` Alexander Lobakin
  0 siblings, 0 replies; 19+ messages in thread
From: Alexander Lobakin @ 2024-06-14  9:34 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: intel-wired-lan, larysa.zaremba, netdev, michal.kubiak,
	anthony.l.nguyen, jacob.e.keller, Chandan Kumar Rout,
	magnus.karlsson, Shannon Nelson

From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Date: Thu, 13 Jun 2024 18:07:53 +0200

> On Thu, Jun 13, 2024 at 05:51:25PM +0200, Maciej Fijalkowski wrote:
>> On Wed, Jun 12, 2024 at 11:15:31AM +0200, Alexander Lobakin wrote:
>>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>>> Date: Wed, 12 Jun 2024 11:09:10 +0200
>>>
>>>> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>>>> Date: Tue, 11 Jun 2024 16:21:27 +0200
>>>
>>> [...]
>>>
>>>>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
>>>>>>> index 2015f66b0cf9..1bd4b054dd80 100644
>>>>>>> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
>>>>>>> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
>>>>>>> @@ -1048,6 +1048,10 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
>>>>>>>  
>>>>>>>  	ice_clean_xdp_irq_zc(xdp_ring);
>>>>>>>  
>>>>>>> +	if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
>>>>>>> +	    !netif_running(xdp_ring->vsi->netdev))
>>>
>>> Oh BTW, I noticed some time ago that netif_running() is less precise
>>> than checking for %IFF_UP.
>>> For example, in this piece (main netdev ifup function)[0],
>>> netif_running() will start returning true *before* driver's .ndo_open()
>>> is called, but %IFF_UP will be set only after .ndo_open() is done (with
>>> no issues).
>>
>> I see, thanks for bringing this up! I'd like to try this out. Tony, sorry
>> for the noise, but it seems I'll go with v4 and will decorate the
>> mentioned statements with unlikely().
>>
>>> That means, I'd check for %IFF_UP honestly (maybe even before checking
>>> the carrier).
>>
>> I wonder whether it is the ultimate check and whether the two existing
>> ones (that we are adding in this patch) could be dropped?
> 
> In the netdev closing path [1], __LINK_STATE_START is cleared before IFF_UP.

Oh man, inconsistency at its best :D

> What we were initially experiencing when the netif_running() check wasn't
> in place was that xsk was producing a bunch of Tx descs while the device
> was being brought down. So let me keep what we have here and add an
> IFF_UP check on top. Better safe than sorry, as the bug we were dealing
> with was pretty nasty.

Sure, I didn't know %IFF_UP's not that reliable :s

Thanks,
Olek

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool
  2024-06-04 13:21 ` [PATCH v3 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool Maciej Fijalkowski
@ 2024-06-26 12:21   ` Michal Kubiak
  2024-06-26 13:52     ` Maciej Fijalkowski
  0 siblings, 1 reply; 19+ messages in thread
From: Michal Kubiak @ 2024-06-26 12:21 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: intel-wired-lan, netdev, anthony.l.nguyen, magnus.karlsson,
	larysa.zaremba, jacob.e.keller, Shannon Nelson,
	Chandan Kumar Rout

On Tue, Jun 04, 2024 at 03:21:52PM +0200, Maciej Fijalkowski wrote:
> This is done to prevent Tx timeout issues. One of the conditions checked
> by dev_watchdog(), which runs in the background, is netif_carrier_ok(),
> so let us turn the carrier off when we disable the queues that belong to
> a q_vector where an XSK pool is being configured.
> 
> Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
>  drivers/net/ethernet/intel/ice/ice_xsk.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index 8ca7ff1a7e7f..21df6888961b 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -180,6 +180,7 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
>  	}
>  
>  	synchronize_net();
> +	netif_carrier_off(vsi->netdev);
>  	netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
>  
>  	ice_qvec_dis_irq(vsi, rx_ring, q_vector);
> @@ -249,6 +250,7 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
>  	ice_qvec_ena_irq(vsi, q_vector);
>  
>  	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> +	netif_carrier_on(vsi->netdev);

I would add a check of the physical link state before calling
'netif_carrier_on()'. Please consider a scenario where the link goes
physically *down* during the XSK pool configuration.
In such a case we could wrongly set "carrier_on" unconditionally.
Please take a look at the following suggestion:

	bool link_up;

	ice_get_link_status(vsi->port_info, &link_up);
	if (link_up)
		netif_carrier_on(vsi->netdev);

Thanks,
Michal

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool
  2024-06-26 12:21   ` Michal Kubiak
@ 2024-06-26 13:52     ` Maciej Fijalkowski
  0 siblings, 0 replies; 19+ messages in thread
From: Maciej Fijalkowski @ 2024-06-26 13:52 UTC (permalink / raw)
  To: Michal Kubiak
  Cc: intel-wired-lan, netdev, anthony.l.nguyen, magnus.karlsson,
	larysa.zaremba, jacob.e.keller, Shannon Nelson,
	Chandan Kumar Rout

On Wed, Jun 26, 2024 at 02:21:39PM +0200, Michal Kubiak wrote:
> On Tue, Jun 04, 2024 at 03:21:52PM +0200, Maciej Fijalkowski wrote:
> > This is done to prevent Tx timeout issues. One of the conditions checked
> > by dev_watchdog(), which runs in the background, is netif_carrier_ok(),
> > so let us turn the carrier off when we disable the queues that belong to
> > a q_vector where an XSK pool is being configured.
> > 
> > Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> > Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
> > Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_xsk.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > index 8ca7ff1a7e7f..21df6888961b 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > @@ -180,6 +180,7 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
> >  	}
> >  
> >  	synchronize_net();
> > +	netif_carrier_off(vsi->netdev);
> >  	netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> >  
> >  	ice_qvec_dis_irq(vsi, rx_ring, q_vector);
> > @@ -249,6 +250,7 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
> >  	ice_qvec_ena_irq(vsi, q_vector);
> >  
> >  	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> > +	netif_carrier_on(vsi->netdev);
> 
> I would add a check of the physical link state before calling
> 'netif_carrier_on()'. Please consider a scenario where the link goes
> physically *down* during the XSK pool configuration.
> In such a case we could wrongly set "carrier_on" unconditionally.
> Please take a look at the following suggestion:
> 
> 	bool link_up;
> 
> 	ice_get_link_status(vsi->port_info, &link_up);
> 	if (link_up)
> 		netif_carrier_on(vsi->netdev);

Will include this in v4. Worth mentioning that it is safe to do so, as
there is always a service task that will turn the carrier on for us.
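
With that folded in, the tail of ice_qp_ena() would look roughly like the
below (a sketch of the planned v4 change, untested; link_up is initialized
to false so that an error from ice_get_link_status() is treated as link
down):

	bool link_up = false;

	netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
	ice_get_link_status(vsi->port_info, &link_up);
	if (link_up)
		netif_carrier_on(vsi->netdev);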

> 
> Thanks,
> Michal

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2024-06-26 13:53 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-04 13:21 [PATCH v3 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
2024-06-04 13:21 ` [PATCH v3 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
2024-06-11 11:59   ` [Intel-wired-lan] " Alexander Lobakin
2024-06-11 14:21     ` Maciej Fijalkowski
2024-06-11 20:13       ` Tony Nguyen
2024-06-12  9:09       ` Alexander Lobakin
2024-06-12  9:15         ` Alexander Lobakin
2024-06-13 15:51           ` Maciej Fijalkowski
2024-06-13 16:07             ` Maciej Fijalkowski
2024-06-14  9:34               ` Alexander Lobakin
2024-06-04 13:21 ` [PATCH v3 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis() Maciej Fijalkowski
2024-06-04 13:21 ` [PATCH v3 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net Maciej Fijalkowski
2024-06-04 13:21 ` [PATCH v3 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf Maciej Fijalkowski
2024-06-04 13:21 ` [PATCH v3 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool Maciej Fijalkowski
2024-06-26 12:21   ` Michal Kubiak
2024-06-26 13:52     ` Maciej Fijalkowski
2024-06-04 13:21 ` [PATCH v3 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool Maciej Fijalkowski
2024-06-04 13:21 ` [PATCH v3 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog Maciej Fijalkowski
2024-06-04 13:21 ` [PATCH v3 iwl-net 8/8] ice: xsk: fix txq interrupt mapping Maciej Fijalkowski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).