* [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues
@ 2024-05-29 11:23 Maciej Fijalkowski
2024-05-29 11:23 ` [PATCH v2 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
` (8 more replies)
0 siblings, 9 replies; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-05-29 11:23 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba, Maciej Fijalkowski
Hi,
changes included in this patchset address an issue that a customer has
been facing when AF_XDP ZC Tx sockets were used in combination with flow
control and regular Tx traffic.
After executing:
ethtool --set-priv-flags $dev link-down-on-close on
ethtool -A $dev rx on tx on
launching multiple ZC Tx sockets on $dev + pinging the remote interface
(so that regular Tx traffic is present) and then going through down/up
of $dev, a Tx timeout occurred and most of the time the ice driver was
unable to recover from that state.
These patches combined solve the issue described above on the customer
side. The main focus here is to forbid producing Tx descriptors when
either the carrier is not yet initialized or the process of bringing
the interface down has already started.
Thanks,
Maciej
v1->v2:
- fix kdoc issues in patches 6 and 8
- drop Larysa's patches for now
Maciej Fijalkowski (7):
ice: don't busy wait for Rx queue disable in ice_qp_dis()
ice: replace synchronize_rcu with synchronize_net
ice: modify error handling when setting XSK pool in ndo_bpf
ice: toggle netif_carrier when setting up XSK pool
ice: improve updating ice_{t,r}x_ring::xsk_pool
ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
ice: xsk: fix txq interrupt mapping
Michal Kubiak (1):
ice: respect netif readiness in AF_XDP ZC related ndo's
drivers/net/ethernet/intel/ice/ice.h | 6 +-
drivers/net/ethernet/intel/ice/ice_base.c | 4 +-
drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
drivers/net/ethernet/intel/ice/ice_txrx.c | 6 +-
drivers/net/ethernet/intel/ice/ice_xsk.c | 150 +++++++++++++---------
drivers/net/ethernet/intel/ice/ice_xsk.h | 4 +-
6 files changed, 101 insertions(+), 71 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
@ 2024-05-29 11:23 ` Maciej Fijalkowski
2024-06-04 6:47 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis() Maciej Fijalkowski
` (7 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-05-29 11:23 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba, Maciej Fijalkowski
From: Michal Kubiak <michal.kubiak@intel.com>
Address a scenario in which XSK ZC Tx produces descriptors to the XDP
Tx ring when the link is either not yet fully initialized or the
process of stopping the netdev has already started. To avoid this, add
checks against carrier readiness in ice_xsk_wakeup() and in
ice_xmit_zc().
One could argue that bailing out early in ice_xsk_wakeup() would be
sufficient, but given that we produce Tx descriptors on behalf of NAPI
that is triggered for Rx traffic, the latter is also needed.
Bringing the link up is an asynchronous event executed within
ice_service_task, so even though the interface has been brought up
there is still a time frame where the link is not yet ok.
Without this patch, when AF_XDP ZC Tx is used simultaneously with stack
Tx, Tx timeouts occur after going through a link flap (admin brings the
interface down then up again). HW seems to be unable to transmit
descriptors to the wire after the HW tail register bump, which in turn
causes bit __QUEUE_STATE_STACK_XOFF to be set forever as
netdev_tx_completed_queue() sees no cleaned bytes on the input.
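For context, the reason zero cleaned bytes keeps the queue stopped is
the early return in the BQL completion helper; roughly (paraphrased
from include/linux/netdevice.h, exact code may differ between kernel
versions):
static inline void netdev_tx_completed_queue(struct netdev_queue *dev_queue,
					     unsigned int pkts, unsigned int bytes)
{
#ifdef CONFIG_BQL
	if (unlikely(!bytes))
		return;	/* nothing completed: BQL state and
			 * __QUEUE_STATE_STACK_XOFF are left as they were
			 */
	dql_completed(&dev_queue->dql, bytes);
	/* ... only past this point can the stack queue be woken again ... */
#endif
}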
Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 2015f66b0cf9..1bd4b054dd80 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -1048,6 +1048,10 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
ice_clean_xdp_irq_zc(xdp_ring);
+ if (!netif_carrier_ok(xdp_ring->vsi->netdev) ||
+ !netif_running(xdp_ring->vsi->netdev))
+ return true;
+
budget = ICE_DESC_UNUSED(xdp_ring);
budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
@@ -1091,7 +1095,7 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
struct ice_vsi *vsi = np->vsi;
struct ice_tx_ring *ring;
- if (test_bit(ICE_VSI_DOWN, vsi->state))
+ if (test_bit(ICE_VSI_DOWN, vsi->state) || !netif_carrier_ok(netdev))
return -ENETDOWN;
if (!ice_is_xdp_ena_vsi(vsi))
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis()
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
2024-05-29 11:23 ` [PATCH v2 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
@ 2024-05-29 11:23 ` Maciej Fijalkowski
2024-06-04 6:53 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net Maciej Fijalkowski
` (6 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-05-29 11:23 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba, Maciej Fijalkowski
When the ice driver is spammed with multiple xdpsock instances and flow
control is enabled, there are cases when the Rx queue gets stuck and is
unable to reflect the disable state in the QRX_CTRL register. A similar
issue has previously been addressed in commit 13a6233b033f ("ice: Add
support to enable/disable all Rx queues before waiting").
To work around this, simply do not wait for a disabled state, as a
later patch will make sure that regardless of the error encountered in
the process of disabling a queue pair, the Rx queue will be enabled.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 1bd4b054dd80..4f606a1055b0 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -199,10 +199,8 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
if (err)
return err;
}
- err = ice_vsi_ctrl_one_rx_ring(vsi, false, q_idx, true);
- if (err)
- return err;
+ ice_vsi_ctrl_one_rx_ring(vsi, false, q_idx, false);
ice_qp_clean_rings(vsi, q_idx);
ice_qp_reset_stats(vsi, q_idx);
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
2024-05-29 11:23 ` [PATCH v2 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
2024-05-29 11:23 ` [PATCH v2 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis() Maciej Fijalkowski
@ 2024-05-29 11:23 ` Maciej Fijalkowski
2024-05-31 23:49 ` Nelson, Shannon
2024-06-04 6:49 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf Maciej Fijalkowski
` (5 subsequent siblings)
8 siblings, 2 replies; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-05-29 11:23 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba, Maciej Fijalkowski
Given that ice_qp_dis() is called under rtnl_lock, synchronize_net()
can be called instead of synchronize_rcu() so that XDP rings can finish
their job in a faster way. Also, do this earlier in the XSK queue
disable flow.
Additionally, turn off the regular Tx queue before disabling IRQs and
NAPI.
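For reference, the speedup under rtnl comes from synchronize_net()
picking the expedited RCU variant; roughly (paraphrased from
net/core/dev.c, exact code may differ between kernel versions):
void synchronize_net(void)
{
	might_sleep();
	if (rtnl_is_locked())
		synchronize_rcu_expedited();	/* much shorter grace period */
	else
		synchronize_rcu();
}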
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 4f606a1055b0..e93cb0ca4106 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -53,7 +53,6 @@ static void ice_qp_clean_rings(struct ice_vsi *vsi, u16 q_idx)
{
ice_clean_tx_ring(vsi->tx_rings[q_idx]);
if (ice_is_xdp_ena_vsi(vsi)) {
- synchronize_rcu();
ice_clean_tx_ring(vsi->xdp_rings[q_idx]);
}
ice_clean_rx_ring(vsi->rx_rings[q_idx]);
@@ -180,11 +179,12 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
usleep_range(1000, 2000);
}
+ synchronize_net();
+ netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
+
ice_qvec_dis_irq(vsi, rx_ring, q_vector);
ice_qvec_toggle_napi(vsi, q_vector, false);
- netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
-
ice_fill_txq_meta(vsi, tx_ring, &txq_meta);
err = ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, 0, tx_ring, &txq_meta);
if (err)
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
` (2 preceding siblings ...)
2024-05-29 11:23 ` [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net Maciej Fijalkowski
@ 2024-05-29 11:23 ` Maciej Fijalkowski
2024-06-04 6:54 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool Maciej Fijalkowski
` (4 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-05-29 11:23 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba, Maciej Fijalkowski
Don't bail out right away when spotting an error within
ice_qp_{dis,ena}() but rather track the error and go through the whole
flow of disabling and enabling the queue pair.
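The intent is the usual "record the first error but keep going"
pattern; an illustrative sketch (the step names are made up and not the
actual patch contents):
	int fail = 0, err;
	err = disable_tx_ring();
	if (!fail)
		fail = err;
	err = disable_rx_ring();	/* still executed even if the Tx step failed */
	if (!fail)
		fail = err;
	return fail;	/* first error encountered, 0 on full success */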
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_xsk.c | 30 +++++++++++++-----------
1 file changed, 16 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index e93cb0ca4106..3dcab89be256 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -163,6 +163,7 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
struct ice_tx_ring *tx_ring;
struct ice_rx_ring *rx_ring;
int timeout = 50;
+ int fail = 0;
int err;
if (q_idx >= vsi->num_rxq || q_idx >= vsi->num_txq)
@@ -187,8 +188,8 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
ice_fill_txq_meta(vsi, tx_ring, &txq_meta);
err = ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, 0, tx_ring, &txq_meta);
- if (err)
- return err;
+ if (!fail)
+ fail = err;
if (ice_is_xdp_ena_vsi(vsi)) {
struct ice_tx_ring *xdp_ring = vsi->xdp_rings[q_idx];
@@ -196,15 +197,15 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
ice_fill_txq_meta(vsi, xdp_ring, &txq_meta);
err = ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, 0, xdp_ring,
&txq_meta);
- if (err)
- return err;
+ if (!fail)
+ fail = err;
}
ice_vsi_ctrl_one_rx_ring(vsi, false, q_idx, false);
ice_qp_clean_rings(vsi, q_idx);
ice_qp_reset_stats(vsi, q_idx);
- return 0;
+ return fail;
}
/**
@@ -217,32 +218,33 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
{
struct ice_q_vector *q_vector;
+ int fail = 0;
int err;
err = ice_vsi_cfg_single_txq(vsi, vsi->tx_rings, q_idx);
- if (err)
- return err;
+ if (!fail)
+ fail = err;
if (ice_is_xdp_ena_vsi(vsi)) {
struct ice_tx_ring *xdp_ring = vsi->xdp_rings[q_idx];
err = ice_vsi_cfg_single_txq(vsi, vsi->xdp_rings, q_idx);
- if (err)
- return err;
+ if (!fail)
+ fail = err;
ice_set_ring_xdp(xdp_ring);
ice_tx_xsk_pool(vsi, q_idx);
}
err = ice_vsi_cfg_single_rxq(vsi, q_idx);
- if (err)
- return err;
+ if (!fail)
+ fail = err;
q_vector = vsi->rx_rings[q_idx]->q_vector;
ice_qvec_cfg_msix(vsi, q_vector);
err = ice_vsi_ctrl_one_rx_ring(vsi, true, q_idx, true);
- if (err)
- return err;
+ if (!fail)
+ fail = err;
ice_qvec_toggle_napi(vsi, q_vector, true);
ice_qvec_ena_irq(vsi, q_vector);
@@ -250,7 +252,7 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
clear_bit(ICE_CFG_BUSY, vsi->state);
- return 0;
+ return fail;
}
/**
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
` (3 preceding siblings ...)
2024-05-29 11:23 ` [PATCH v2 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf Maciej Fijalkowski
@ 2024-05-29 11:23 ` Maciej Fijalkowski
2024-06-04 6:45 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool Maciej Fijalkowski
` (3 subsequent siblings)
8 siblings, 1 reply; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-05-29 11:23 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba, Maciej Fijalkowski
Do this so we prevent Tx timeout issues. One of the conditions checked
by dev_watchdog(), which runs in the background, is netif_carrier_ok(),
so turn the carrier off when we disable the queues that belong to a
q_vector where the XSK pool is being configured.
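This works because dev_watchdog() only flags Tx timeouts while the
carrier is up; roughly (paraphrased from net/sched/sch_generic.c, exact
code may differ between kernel versions):
	/* inside dev_watchdog() */
	if (netif_device_present(dev) &&
	    netif_running(dev) &&
	    netif_carrier_ok(dev)) {
		/* ... walk the Tx queues and report the ones stalled
		 * past their watchdog_timeo as Tx timeouts ...
		 */
	}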
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_xsk.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 3dcab89be256..8c5006f37310 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -181,6 +181,7 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
}
synchronize_net();
+ netif_carrier_off(vsi->netdev);
netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
ice_qvec_dis_irq(vsi, rx_ring, q_vector);
@@ -250,6 +251,7 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
ice_qvec_ena_irq(vsi, q_vector);
netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
+ netif_carrier_on(vsi->netdev);
clear_bit(ICE_CFG_BUSY, vsi->state);
return fail;
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
` (4 preceding siblings ...)
2024-05-29 11:23 ` [PATCH v2 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool Maciej Fijalkowski
@ 2024-05-29 11:23 ` Maciej Fijalkowski
2024-05-31 23:49 ` Nelson, Shannon
2024-06-04 6:50 ` [Intel-wired-lan] [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t, r}x_ring::xsk_pool Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog Maciej Fijalkowski
` (2 subsequent siblings)
8 siblings, 2 replies; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-05-29 11:23 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba, Maciej Fijalkowski
The xsk_buff_pool pointers that ice ring structs hold are updated via
ndo_bpf, which is executed in process context, while they can be read
by a remote CPU at the same time within NAPI poll. Use
synchronize_net() after the pointer update and {READ,WRITE}_ONCE() when
working with the mentioned pointer.
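The resulting pairing is a simple publish/consume scheme; a simplified
sketch (not the exact patch contents):
	/* writer: ndo_bpf, process context, under rtnl */
	WRITE_ONCE(ring->xsk_pool, pool);	/* or NULL on teardown */
	synchronize_net();	/* wait out NAPI polls that saw the old value */
	/* reader: NAPI poll, possibly on a remote CPU */
	struct xsk_buff_pool *xsk_pool = READ_ONCE(ring->xsk_pool);
	if (!xsk_pool)
		return 0;
	/* use the local xsk_pool snapshot for the rest of the poll */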
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice.h | 6 +-
drivers/net/ethernet/intel/ice/ice_base.c | 4 +-
drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
drivers/net/ethernet/intel/ice/ice_txrx.c | 4 +-
drivers/net/ethernet/intel/ice/ice_xsk.c | 78 ++++++++++++++---------
drivers/net/ethernet/intel/ice/ice_xsk.h | 4 +-
6 files changed, 59 insertions(+), 39 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
index da8c8afebc93..701a61d791dd 100644
--- a/drivers/net/ethernet/intel/ice/ice.h
+++ b/drivers/net/ethernet/intel/ice/ice.h
@@ -771,12 +771,12 @@ static inline struct xsk_buff_pool *ice_get_xp_from_qid(struct ice_vsi *vsi,
* Returns a pointer to xsk_buff_pool structure if there is a buffer pool
* present, NULL otherwise.
*/
-static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_rx_ring *ring)
+static inline void ice_xsk_pool(struct ice_rx_ring *ring)
{
struct ice_vsi *vsi = ring->vsi;
u16 qid = ring->q_index;
- return ice_get_xp_from_qid(vsi, qid);
+ WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
}
/**
@@ -801,7 +801,7 @@ static inline void ice_tx_xsk_pool(struct ice_vsi *vsi, u16 qid)
if (!ring)
return;
- ring->xsk_pool = ice_get_xp_from_qid(vsi, qid);
+ WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
}
/**
diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
index 5d396c1a7731..f3dfce136106 100644
--- a/drivers/net/ethernet/intel/ice/ice_base.c
+++ b/drivers/net/ethernet/intel/ice/ice_base.c
@@ -536,7 +536,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
return err;
}
- ring->xsk_pool = ice_xsk_pool(ring);
+ ice_xsk_pool(ring);
if (ring->xsk_pool) {
xdp_rxq_info_unreg(&ring->xdp_rxq);
@@ -597,7 +597,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
return 0;
}
- ok = ice_alloc_rx_bufs_zc(ring, num_bufs);
+ ok = ice_alloc_rx_bufs_zc(ring, ring->xsk_pool, num_bufs);
if (!ok) {
u16 pf_q = ring->vsi->rxq_map[ring->q_index];
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index 1b61ca3a6eb6..15a6805ac2a1 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -2946,7 +2946,7 @@ static void ice_vsi_rx_napi_schedule(struct ice_vsi *vsi)
ice_for_each_rxq(vsi, i) {
struct ice_rx_ring *rx_ring = vsi->rx_rings[i];
- if (rx_ring->xsk_pool)
+ if (READ_ONCE(rx_ring->xsk_pool))
napi_schedule(&rx_ring->q_vector->napi);
}
}
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 8bb743f78fcb..f4b2b1bca234 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1523,7 +1523,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
ice_for_each_tx_ring(tx_ring, q_vector->tx) {
bool wd;
- if (tx_ring->xsk_pool)
+ if (READ_ONCE(tx_ring->xsk_pool))
wd = ice_xmit_zc(tx_ring);
else if (ice_ring_is_xdp(tx_ring))
wd = true;
@@ -1556,7 +1556,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
* comparison in the irq context instead of many inside the
* ice_clean_rx_irq function and makes the codebase cleaner.
*/
- cleaned = rx_ring->xsk_pool ?
+ cleaned = READ_ONCE(rx_ring->xsk_pool) ?
ice_clean_rx_irq_zc(rx_ring, budget_per_ring) :
ice_clean_rx_irq(rx_ring, budget_per_ring);
work_done += cleaned;
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 8c5006f37310..693f0e3a37ce 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -250,6 +250,8 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
ice_qvec_toggle_napi(vsi, q_vector, true);
ice_qvec_ena_irq(vsi, q_vector);
+ /* make sure NAPI sees updated ice_{t,x}_ring::xsk_pool */
+ synchronize_net();
netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
netif_carrier_on(vsi->netdev);
clear_bit(ICE_CFG_BUSY, vsi->state);
@@ -461,6 +463,7 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
/**
* __ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
* @rx_ring: Rx ring
+ * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
* @count: The number of buffers to allocate
*
* Place the @count of descriptors onto Rx ring. Handle the ring wrap
@@ -469,7 +472,8 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
*
* Returns true if all allocations were successful, false if any fail.
*/
-static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
+static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
+ struct xsk_buff_pool *xsk_pool, u16 count)
{
u32 nb_buffs_extra = 0, nb_buffs = 0;
union ice_32b_rx_flex_desc *rx_desc;
@@ -481,8 +485,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
xdp = ice_xdp_buf(rx_ring, ntu);
if (ntu + count >= rx_ring->count) {
- nb_buffs_extra = ice_fill_rx_descs(rx_ring->xsk_pool, xdp,
- rx_desc,
+ nb_buffs_extra = ice_fill_rx_descs(xsk_pool, xdp, rx_desc,
rx_ring->count - ntu);
if (nb_buffs_extra != rx_ring->count - ntu) {
ntu += nb_buffs_extra;
@@ -495,7 +498,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
ice_release_rx_desc(rx_ring, 0);
}
- nb_buffs = ice_fill_rx_descs(rx_ring->xsk_pool, xdp, rx_desc, count);
+ nb_buffs = ice_fill_rx_descs(xsk_pool, xdp, rx_desc, count);
ntu += nb_buffs;
if (ntu == rx_ring->count)
@@ -511,6 +514,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
/**
* ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
* @rx_ring: Rx ring
+ * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
* @count: The number of buffers to allocate
*
* Wrapper for internal allocation routine; figure out how many tail
@@ -518,7 +522,8 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
*
* Returns true if all calls to internal alloc routine succeeded
*/
-bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
+bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
+ struct xsk_buff_pool *xsk_pool, u16 count)
{
u16 rx_thresh = ICE_RING_QUARTER(rx_ring);
u16 leftover, i, tail_bumps;
@@ -527,9 +532,9 @@ bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
leftover = count - (tail_bumps * rx_thresh);
for (i = 0; i < tail_bumps; i++)
- if (!__ice_alloc_rx_bufs_zc(rx_ring, rx_thresh))
+ if (!__ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, rx_thresh))
return false;
- return __ice_alloc_rx_bufs_zc(rx_ring, leftover);
+ return __ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, leftover);
}
/**
@@ -650,7 +655,7 @@ static u32 ice_clean_xdp_irq_zc(struct ice_tx_ring *xdp_ring)
if (xdp_ring->next_to_clean >= cnt)
xdp_ring->next_to_clean -= cnt;
if (xsk_frames)
- xsk_tx_completed(xdp_ring->xsk_pool, xsk_frames);
+ xsk_tx_completed(READ_ONCE(xdp_ring->xsk_pool), xsk_frames);
return completed_frames;
}
@@ -702,7 +707,8 @@ static int ice_xmit_xdp_tx_zc(struct xdp_buff *xdp,
dma_addr_t dma;
dma = xsk_buff_xdp_get_dma(xdp);
- xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, size);
+ xsk_buff_raw_dma_sync_for_device(READ_ONCE(xdp_ring->xsk_pool),
+ dma, size);
tx_buf->xdp = xdp;
tx_buf->type = ICE_TX_BUF_XSK_TX;
@@ -760,7 +766,8 @@ ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
if (!err)
return ICE_XDP_REDIR;
- if (xsk_uses_need_wakeup(rx_ring->xsk_pool) && err == -ENOBUFS)
+ if (xsk_uses_need_wakeup(READ_ONCE(rx_ring->xsk_pool)) &&
+ err == -ENOBUFS)
result = ICE_XDP_EXIT;
else
result = ICE_XDP_CONSUMED;
@@ -829,8 +836,8 @@ ice_add_xsk_frag(struct ice_rx_ring *rx_ring, struct xdp_buff *first,
*/
int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
{
+ struct xsk_buff_pool *xsk_pool = READ_ONCE(rx_ring->xsk_pool);
unsigned int total_rx_bytes = 0, total_rx_packets = 0;
- struct xsk_buff_pool *xsk_pool = rx_ring->xsk_pool;
u32 ntc = rx_ring->next_to_clean;
u32 ntu = rx_ring->next_to_use;
struct xdp_buff *first = NULL;
@@ -942,7 +949,8 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
rx_ring->next_to_clean = ntc;
entries_to_alloc = ICE_RX_DESC_UNUSED(rx_ring);
if (entries_to_alloc > ICE_RING_QUARTER(rx_ring))
- failure |= !ice_alloc_rx_bufs_zc(rx_ring, entries_to_alloc);
+ failure |= !ice_alloc_rx_bufs_zc(rx_ring, xsk_pool,
+ entries_to_alloc);
ice_finalize_xdp_rx(xdp_ring, xdp_xmit, 0);
ice_update_rx_ring_stats(rx_ring, total_rx_packets, total_rx_bytes);
@@ -965,17 +973,19 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
/**
* ice_xmit_pkt - produce a single HW Tx descriptor out of AF_XDP descriptor
* @xdp_ring: XDP ring to produce the HW Tx descriptor on
+ * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
* @desc: AF_XDP descriptor to pull the DMA address and length from
* @total_bytes: bytes accumulator that will be used for stats update
*/
-static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
+static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring,
+ struct xsk_buff_pool *xsk_pool, struct xdp_desc *desc,
unsigned int *total_bytes)
{
struct ice_tx_desc *tx_desc;
dma_addr_t dma;
- dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, desc->addr);
- xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, desc->len);
+ dma = xsk_buff_raw_get_dma(xsk_pool, desc->addr);
+ xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, desc->len);
tx_desc = ICE_TX_DESC(xdp_ring, xdp_ring->next_to_use++);
tx_desc->buf_addr = cpu_to_le64(dma);
@@ -988,10 +998,13 @@ static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
/**
* ice_xmit_pkt_batch - produce a batch of HW Tx descriptors out of AF_XDP descriptors
* @xdp_ring: XDP ring to produce the HW Tx descriptors on
+ * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
* @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
* @total_bytes: bytes accumulator that will be used for stats update
*/
-static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
+static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring,
+ struct xsk_buff_pool *xsk_pool,
+ struct xdp_desc *descs,
unsigned int *total_bytes)
{
u16 ntu = xdp_ring->next_to_use;
@@ -1001,8 +1014,8 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
loop_unrolled_for(i = 0; i < PKTS_PER_BATCH; i++) {
dma_addr_t dma;
- dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, descs[i].addr);
- xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, descs[i].len);
+ dma = xsk_buff_raw_get_dma(xsk_pool, descs[i].addr);
+ xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, descs[i].len);
tx_desc = ICE_TX_DESC(xdp_ring, ntu++);
tx_desc->buf_addr = cpu_to_le64(dma);
@@ -1018,21 +1031,24 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
/**
* ice_fill_tx_hw_ring - produce the number of Tx descriptors onto ring
* @xdp_ring: XDP ring to produce the HW Tx descriptors on
+ * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
* @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
* @nb_pkts: count of packets to be send
* @total_bytes: bytes accumulator that will be used for stats update
*/
-static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
- u32 nb_pkts, unsigned int *total_bytes)
+static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring,
+ struct xsk_buff_pool *xsk_pool,
+ struct xdp_desc *descs, u32 nb_pkts,
+ unsigned int *total_bytes)
{
u32 batched, leftover, i;
batched = ALIGN_DOWN(nb_pkts, PKTS_PER_BATCH);
leftover = nb_pkts & (PKTS_PER_BATCH - 1);
for (i = 0; i < batched; i += PKTS_PER_BATCH)
- ice_xmit_pkt_batch(xdp_ring, &descs[i], total_bytes);
+ ice_xmit_pkt_batch(xdp_ring, xsk_pool, &descs[i], total_bytes);
for (; i < batched + leftover; i++)
- ice_xmit_pkt(xdp_ring, &descs[i], total_bytes);
+ ice_xmit_pkt(xdp_ring, xsk_pool, &descs[i], total_bytes);
}
/**
@@ -1043,7 +1059,8 @@ static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *d
*/
bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
{
- struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
+ struct xsk_buff_pool *xsk_pool = READ_ONCE(xdp_ring->xsk_pool);
+ struct xdp_desc *descs = xsk_pool->tx_descs;
u32 nb_pkts, nb_processed = 0;
unsigned int total_bytes = 0;
int budget;
@@ -1057,25 +1074,26 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
budget = ICE_DESC_UNUSED(xdp_ring);
budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
- nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
+ nb_pkts = xsk_tx_peek_release_desc_batch(xsk_pool, budget);
if (!nb_pkts)
return true;
if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
nb_processed = xdp_ring->count - xdp_ring->next_to_use;
- ice_fill_tx_hw_ring(xdp_ring, descs, nb_processed, &total_bytes);
+ ice_fill_tx_hw_ring(xdp_ring, xsk_pool, descs, nb_processed,
+ &total_bytes);
xdp_ring->next_to_use = 0;
}
- ice_fill_tx_hw_ring(xdp_ring, &descs[nb_processed], nb_pkts - nb_processed,
- &total_bytes);
+ ice_fill_tx_hw_ring(xdp_ring, xsk_pool, &descs[nb_processed],
+ nb_pkts - nb_processed, &total_bytes);
ice_set_rs_bit(xdp_ring);
ice_xdp_ring_update_tail(xdp_ring);
ice_update_tx_ring_stats(xdp_ring, nb_pkts, total_bytes);
- if (xsk_uses_need_wakeup(xdp_ring->xsk_pool))
- xsk_set_tx_need_wakeup(xdp_ring->xsk_pool);
+ if (xsk_uses_need_wakeup(xsk_pool))
+ xsk_set_tx_need_wakeup(xsk_pool);
return nb_pkts < budget;
}
@@ -1108,7 +1126,7 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
ring = vsi->rx_rings[queue_id]->xdp_ring;
- if (!ring->xsk_pool)
+ if (!READ_ONCE(ring->xsk_pool))
return -EINVAL;
/* The idea here is that if NAPI is running, mark a miss, so
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.h b/drivers/net/ethernet/intel/ice/ice_xsk.h
index 6fa181f080ef..4cd2d62a0836 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.h
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.h
@@ -22,7 +22,8 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool,
u16 qid);
int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget);
int ice_xsk_wakeup(struct net_device *netdev, u32 queue_id, u32 flags);
-bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count);
+bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
+ struct xsk_buff_pool *xsk_pool, u16 count);
bool ice_xsk_any_rx_ring_ena(struct ice_vsi *vsi);
void ice_xsk_clean_rx_ring(struct ice_rx_ring *rx_ring);
void ice_xsk_clean_xdp_ring(struct ice_tx_ring *xdp_ring);
@@ -51,6 +52,7 @@ ice_clean_rx_irq_zc(struct ice_rx_ring __always_unused *rx_ring,
static inline bool
ice_alloc_rx_bufs_zc(struct ice_rx_ring __always_unused *rx_ring,
+ struct xsk_buff_pool __always_unused *xsk_pool,
u16 __always_unused count)
{
return false;
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
` (5 preceding siblings ...)
2024-05-29 11:23 ` [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool Maciej Fijalkowski
@ 2024-05-29 11:23 ` Maciej Fijalkowski
2024-06-04 6:52 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 8/8] ice: xsk: fix txq interrupt mapping Maciej Fijalkowski
2024-05-31 23:52 ` [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Nelson, Shannon
8 siblings, 1 reply; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-05-29 11:23 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba, Maciej Fijalkowski
It is read by the data path and modified from process context on a
remote CPU, so WRITE_ONCE is needed to clear the pointer.
Fixes: efc2214b6047 ("ice: Add support for XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index f4b2b1bca234..4c115531beba 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -456,7 +456,7 @@ void ice_free_rx_ring(struct ice_rx_ring *rx_ring)
if (rx_ring->vsi->type == ICE_VSI_PF)
if (xdp_rxq_info_is_reg(&rx_ring->xdp_rxq))
xdp_rxq_info_unreg(&rx_ring->xdp_rxq);
- rx_ring->xdp_prog = NULL;
+ WRITE_ONCE(rx_ring->xdp_prog, NULL);
if (rx_ring->xsk_pool) {
kfree(rx_ring->xdp_buf);
rx_ring->xdp_buf = NULL;
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 iwl-net 8/8] ice: xsk: fix txq interrupt mapping
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
` (6 preceding siblings ...)
2024-05-29 11:23 ` [PATCH v2 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog Maciej Fijalkowski
@ 2024-05-29 11:23 ` Maciej Fijalkowski
2024-06-04 6:46 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-31 23:52 ` [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Nelson, Shannon
8 siblings, 1 reply; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-05-29 11:23 UTC (permalink / raw)
To: intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba, Maciej Fijalkowski
ice_cfg_txq_interrupt() internally handles the XDP Tx ring. Do not use
ice_for_each_tx_ring() in ice_qvec_cfg_msix(), as this causes us to
treat the XDP ring that belongs to the queue vector as a Tx ring and
therefore misconfigure the interrupts.
Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
drivers/net/ethernet/intel/ice/ice_xsk.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
index 693f0e3a37ce..85aa841a16bb 100644
--- a/drivers/net/ethernet/intel/ice/ice_xsk.c
+++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
@@ -111,25 +111,29 @@ ice_qvec_dis_irq(struct ice_vsi *vsi, struct ice_rx_ring *rx_ring,
* ice_qvec_cfg_msix - Enable IRQ for given queue vector
* @vsi: the VSI that contains queue vector
* @q_vector: queue vector
+ * @qid: queue index
*/
static void
-ice_qvec_cfg_msix(struct ice_vsi *vsi, struct ice_q_vector *q_vector)
+ice_qvec_cfg_msix(struct ice_vsi *vsi, struct ice_q_vector *q_vector, u16 qid)
{
u16 reg_idx = q_vector->reg_idx;
struct ice_pf *pf = vsi->back;
struct ice_hw *hw = &pf->hw;
- struct ice_tx_ring *tx_ring;
- struct ice_rx_ring *rx_ring;
+ int q, _qid = qid;
ice_cfg_itr(hw, q_vector);
- ice_for_each_tx_ring(tx_ring, q_vector->tx)
- ice_cfg_txq_interrupt(vsi, tx_ring->reg_idx, reg_idx,
- q_vector->tx.itr_idx);
+ for (q = 0; q < q_vector->num_ring_tx; q++) {
+ ice_cfg_txq_interrupt(vsi, _qid, reg_idx, q_vector->tx.itr_idx);
+ _qid++;
+ }
- ice_for_each_rx_ring(rx_ring, q_vector->rx)
- ice_cfg_rxq_interrupt(vsi, rx_ring->reg_idx, reg_idx,
- q_vector->rx.itr_idx);
+ _qid = qid;
+
+ for (q = 0; q < q_vector->num_ring_rx; q++) {
+ ice_cfg_rxq_interrupt(vsi, _qid, reg_idx, q_vector->rx.itr_idx);
+ _qid++;
+ }
ice_flush(hw);
}
@@ -241,7 +245,7 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
fail = err;
q_vector = vsi->rx_rings[q_idx]->q_vector;
- ice_qvec_cfg_msix(vsi, q_vector);
+ ice_qvec_cfg_msix(vsi, q_vector, q_idx);
err = ice_vsi_ctrl_one_rx_ring(vsi, true, q_idx, true);
if (!fail)
--
2.34.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net
2024-05-29 11:23 ` [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net Maciej Fijalkowski
@ 2024-05-31 23:49 ` Nelson, Shannon
2024-06-04 10:49 ` Maciej Fijalkowski
2024-06-04 6:49 ` [Intel-wired-lan] " Rout, ChandanX
1 sibling, 1 reply; 23+ messages in thread
From: Nelson, Shannon @ 2024-05-31 23:49 UTC (permalink / raw)
To: Maciej Fijalkowski, intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba
On 5/29/2024 4:23 AM, Maciej Fijalkowski wrote:
>
> Given that ice_qp_dis() is called under rtnl_lock, synchronize_net() can
> be called instead of synchronize_rcu() so that XDP rings can finish its
> job in a faster way. Also let us do this as earlier in XSK queue disable
> flow.
>
> Additionally, turn off regular Tx queue before disabling irqs and NAPI.
>
> Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
> drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index 4f606a1055b0..e93cb0ca4106 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -53,7 +53,6 @@ static void ice_qp_clean_rings(struct ice_vsi *vsi, u16 q_idx)
> {
> ice_clean_tx_ring(vsi->tx_rings[q_idx]);
> if (ice_is_xdp_ena_vsi(vsi)) {
> - synchronize_rcu();
> ice_clean_tx_ring(vsi->xdp_rings[q_idx]);
> }
With only one statement left in the block you'll probably want to remove
the {}'s
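i.e. the block would collapse to just:
	if (ice_is_xdp_ena_vsi(vsi))
		ice_clean_tx_ring(vsi->xdp_rings[q_idx]);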
sln
> ice_clean_rx_ring(vsi->rx_rings[q_idx]);
> @@ -180,11 +179,12 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
> usleep_range(1000, 2000);
> }
>
> + synchronize_net();
> + netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> +
> ice_qvec_dis_irq(vsi, rx_ring, q_vector);
> ice_qvec_toggle_napi(vsi, q_vector, false);
>
> - netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> -
> ice_fill_txq_meta(vsi, tx_ring, &txq_meta);
> err = ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, 0, tx_ring, &txq_meta);
> if (err)
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool
2024-05-29 11:23 ` [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool Maciej Fijalkowski
@ 2024-05-31 23:49 ` Nelson, Shannon
2024-06-04 10:52 ` Maciej Fijalkowski
2024-06-04 6:50 ` [Intel-wired-lan] [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t, r}x_ring::xsk_pool Rout, ChandanX
1 sibling, 1 reply; 23+ messages in thread
From: Nelson, Shannon @ 2024-05-31 23:49 UTC (permalink / raw)
To: Maciej Fijalkowski, intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba
On 5/29/2024 4:23 AM, Maciej Fijalkowski wrote:
>
> xsk_buff_pool pointers that ice ring structs hold are updated via
> ndo_bpf that is executed in process context while it can be read by
> remote CPU at the same time within NAPI poll. Use synchronize_net()
> after pointer update and {READ,WRITE}_ONCE() when working with mentioned
> pointer.
>
> Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> ---
> drivers/net/ethernet/intel/ice/ice.h | 6 +-
> drivers/net/ethernet/intel/ice/ice_base.c | 4 +-
> drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
> drivers/net/ethernet/intel/ice/ice_txrx.c | 4 +-
> drivers/net/ethernet/intel/ice/ice_xsk.c | 78 ++++++++++++++---------
> drivers/net/ethernet/intel/ice/ice_xsk.h | 4 +-
> 6 files changed, 59 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> index da8c8afebc93..701a61d791dd 100644
> --- a/drivers/net/ethernet/intel/ice/ice.h
> +++ b/drivers/net/ethernet/intel/ice/ice.h
> @@ -771,12 +771,12 @@ static inline struct xsk_buff_pool *ice_get_xp_from_qid(struct ice_vsi *vsi,
> * Returns a pointer to xsk_buff_pool structure if there is a buffer pool
> * present, NULL otherwise.
> */
> -static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_rx_ring *ring)
> +static inline void ice_xsk_pool(struct ice_rx_ring *ring)
Since this patch is changing the logic here, you might want to change
the name of the function. Instead of getting a pointer with no side
effects it is now doing something, so a better name might be
ice_set_xsk_pool() to reflect the resulting side effect.
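Something along these lines (the name is just a suggestion):
static inline void ice_set_xsk_pool(struct ice_rx_ring *ring)
{
	struct ice_vsi *vsi = ring->vsi;
	u16 qid = ring->q_index;
	WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
}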
sln
> {
> struct ice_vsi *vsi = ring->vsi;
> u16 qid = ring->q_index;
>
> - return ice_get_xp_from_qid(vsi, qid);
> + WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
> }
>
> /**
> @@ -801,7 +801,7 @@ static inline void ice_tx_xsk_pool(struct ice_vsi *vsi, u16 qid)
> if (!ring)
> return;
>
> - ring->xsk_pool = ice_get_xp_from_qid(vsi, qid);
> + WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
> }
>
> /**
> diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
> index 5d396c1a7731..f3dfce136106 100644
> --- a/drivers/net/ethernet/intel/ice/ice_base.c
> +++ b/drivers/net/ethernet/intel/ice/ice_base.c
> @@ -536,7 +536,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
> return err;
> }
>
> - ring->xsk_pool = ice_xsk_pool(ring);
> + ice_xsk_pool(ring);
> if (ring->xsk_pool) {
> xdp_rxq_info_unreg(&ring->xdp_rxq);
>
> @@ -597,7 +597,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
> return 0;
> }
>
> - ok = ice_alloc_rx_bufs_zc(ring, num_bufs);
> + ok = ice_alloc_rx_bufs_zc(ring, ring->xsk_pool, num_bufs);
> if (!ok) {
> u16 pf_q = ring->vsi->rxq_map[ring->q_index];
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> index 1b61ca3a6eb6..15a6805ac2a1 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -2946,7 +2946,7 @@ static void ice_vsi_rx_napi_schedule(struct ice_vsi *vsi)
> ice_for_each_rxq(vsi, i) {
> struct ice_rx_ring *rx_ring = vsi->rx_rings[i];
>
> - if (rx_ring->xsk_pool)
> + if (READ_ONCE(rx_ring->xsk_pool))
> napi_schedule(&rx_ring->q_vector->napi);
> }
> }
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 8bb743f78fcb..f4b2b1bca234 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -1523,7 +1523,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
> ice_for_each_tx_ring(tx_ring, q_vector->tx) {
> bool wd;
>
> - if (tx_ring->xsk_pool)
> + if (READ_ONCE(tx_ring->xsk_pool))
> wd = ice_xmit_zc(tx_ring);
> else if (ice_ring_is_xdp(tx_ring))
> wd = true;
> @@ -1556,7 +1556,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
> * comparison in the irq context instead of many inside the
> * ice_clean_rx_irq function and makes the codebase cleaner.
> */
> - cleaned = rx_ring->xsk_pool ?
> + cleaned = READ_ONCE(rx_ring->xsk_pool) ?
> ice_clean_rx_irq_zc(rx_ring, budget_per_ring) :
> ice_clean_rx_irq(rx_ring, budget_per_ring);
> work_done += cleaned;
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> index 8c5006f37310..693f0e3a37ce 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> @@ -250,6 +250,8 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
> ice_qvec_toggle_napi(vsi, q_vector, true);
> ice_qvec_ena_irq(vsi, q_vector);
>
> + /* make sure NAPI sees updated ice_{t,x}_ring::xsk_pool */
> + synchronize_net();
> netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> netif_carrier_on(vsi->netdev);
> clear_bit(ICE_CFG_BUSY, vsi->state);
> @@ -461,6 +463,7 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
> /**
> * __ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
> * @rx_ring: Rx ring
> + * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
> * @count: The number of buffers to allocate
> *
> * Place the @count of descriptors onto Rx ring. Handle the ring wrap
> @@ -469,7 +472,8 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
> *
> * Returns true if all allocations were successful, false if any fail.
> */
> -static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> +static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
> + struct xsk_buff_pool *xsk_pool, u16 count)
> {
> u32 nb_buffs_extra = 0, nb_buffs = 0;
> union ice_32b_rx_flex_desc *rx_desc;
> @@ -481,8 +485,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> xdp = ice_xdp_buf(rx_ring, ntu);
>
> if (ntu + count >= rx_ring->count) {
> - nb_buffs_extra = ice_fill_rx_descs(rx_ring->xsk_pool, xdp,
> - rx_desc,
> + nb_buffs_extra = ice_fill_rx_descs(xsk_pool, xdp, rx_desc,
> rx_ring->count - ntu);
> if (nb_buffs_extra != rx_ring->count - ntu) {
> ntu += nb_buffs_extra;
> @@ -495,7 +498,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> ice_release_rx_desc(rx_ring, 0);
> }
>
> - nb_buffs = ice_fill_rx_descs(rx_ring->xsk_pool, xdp, rx_desc, count);
> + nb_buffs = ice_fill_rx_descs(xsk_pool, xdp, rx_desc, count);
>
> ntu += nb_buffs;
> if (ntu == rx_ring->count)
> @@ -511,6 +514,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> /**
> * ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
> * @rx_ring: Rx ring
> + * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
> * @count: The number of buffers to allocate
> *
> * Wrapper for internal allocation routine; figure out how many tail
> @@ -518,7 +522,8 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> *
> * Returns true if all calls to internal alloc routine succeeded
> */
> -bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> +bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
> + struct xsk_buff_pool *xsk_pool, u16 count)
> {
> u16 rx_thresh = ICE_RING_QUARTER(rx_ring);
> u16 leftover, i, tail_bumps;
> @@ -527,9 +532,9 @@ bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> leftover = count - (tail_bumps * rx_thresh);
>
> for (i = 0; i < tail_bumps; i++)
> - if (!__ice_alloc_rx_bufs_zc(rx_ring, rx_thresh))
> + if (!__ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, rx_thresh))
> return false;
> - return __ice_alloc_rx_bufs_zc(rx_ring, leftover);
> + return __ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, leftover);
> }
>
> /**
> @@ -650,7 +655,7 @@ static u32 ice_clean_xdp_irq_zc(struct ice_tx_ring *xdp_ring)
> if (xdp_ring->next_to_clean >= cnt)
> xdp_ring->next_to_clean -= cnt;
> if (xsk_frames)
> - xsk_tx_completed(xdp_ring->xsk_pool, xsk_frames);
> + xsk_tx_completed(READ_ONCE(xdp_ring->xsk_pool), xsk_frames);
>
> return completed_frames;
> }
> @@ -702,7 +707,8 @@ static int ice_xmit_xdp_tx_zc(struct xdp_buff *xdp,
> dma_addr_t dma;
>
> dma = xsk_buff_xdp_get_dma(xdp);
> - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, size);
> + xsk_buff_raw_dma_sync_for_device(READ_ONCE(xdp_ring->xsk_pool),
> + dma, size);
>
> tx_buf->xdp = xdp;
> tx_buf->type = ICE_TX_BUF_XSK_TX;
> @@ -760,7 +766,8 @@ ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
> if (!err)
> return ICE_XDP_REDIR;
> - if (xsk_uses_need_wakeup(rx_ring->xsk_pool) && err == -ENOBUFS)
> + if (xsk_uses_need_wakeup(READ_ONCE(rx_ring->xsk_pool)) &&
> + err == -ENOBUFS)
> result = ICE_XDP_EXIT;
> else
> result = ICE_XDP_CONSUMED;
> @@ -829,8 +836,8 @@ ice_add_xsk_frag(struct ice_rx_ring *rx_ring, struct xdp_buff *first,
> */
> int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> {
> + struct xsk_buff_pool *xsk_pool = READ_ONCE(rx_ring->xsk_pool);
> unsigned int total_rx_bytes = 0, total_rx_packets = 0;
> - struct xsk_buff_pool *xsk_pool = rx_ring->xsk_pool;
> u32 ntc = rx_ring->next_to_clean;
> u32 ntu = rx_ring->next_to_use;
> struct xdp_buff *first = NULL;
> @@ -942,7 +949,8 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> rx_ring->next_to_clean = ntc;
> entries_to_alloc = ICE_RX_DESC_UNUSED(rx_ring);
> if (entries_to_alloc > ICE_RING_QUARTER(rx_ring))
> - failure |= !ice_alloc_rx_bufs_zc(rx_ring, entries_to_alloc);
> + failure |= !ice_alloc_rx_bufs_zc(rx_ring, xsk_pool,
> + entries_to_alloc);
>
> ice_finalize_xdp_rx(xdp_ring, xdp_xmit, 0);
> ice_update_rx_ring_stats(rx_ring, total_rx_packets, total_rx_bytes);
> @@ -965,17 +973,19 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> /**
> * ice_xmit_pkt - produce a single HW Tx descriptor out of AF_XDP descriptor
> * @xdp_ring: XDP ring to produce the HW Tx descriptor on
> + * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
> * @desc: AF_XDP descriptor to pull the DMA address and length from
> * @total_bytes: bytes accumulator that will be used for stats update
> */
> -static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
> +static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring,
> + struct xsk_buff_pool *xsk_pool, struct xdp_desc *desc,
> unsigned int *total_bytes)
> {
> struct ice_tx_desc *tx_desc;
> dma_addr_t dma;
>
> - dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, desc->addr);
> - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, desc->len);
> + dma = xsk_buff_raw_get_dma(xsk_pool, desc->addr);
> + xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, desc->len);
>
> tx_desc = ICE_TX_DESC(xdp_ring, xdp_ring->next_to_use++);
> tx_desc->buf_addr = cpu_to_le64(dma);
> @@ -988,10 +998,13 @@ static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
> /**
> * ice_xmit_pkt_batch - produce a batch of HW Tx descriptors out of AF_XDP descriptors
> * @xdp_ring: XDP ring to produce the HW Tx descriptors on
> + * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
> * @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
> * @total_bytes: bytes accumulator that will be used for stats update
> */
> -static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
> +static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring,
> + struct xsk_buff_pool *xsk_pool,
> + struct xdp_desc *descs,
> unsigned int *total_bytes)
> {
> u16 ntu = xdp_ring->next_to_use;
> @@ -1001,8 +1014,8 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
> loop_unrolled_for(i = 0; i < PKTS_PER_BATCH; i++) {
> dma_addr_t dma;
>
> - dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, descs[i].addr);
> - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, descs[i].len);
> + dma = xsk_buff_raw_get_dma(xsk_pool, descs[i].addr);
> + xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, descs[i].len);
>
> tx_desc = ICE_TX_DESC(xdp_ring, ntu++);
> tx_desc->buf_addr = cpu_to_le64(dma);
> @@ -1018,21 +1031,24 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
> /**
> * ice_fill_tx_hw_ring - produce the number of Tx descriptors onto ring
> * @xdp_ring: XDP ring to produce the HW Tx descriptors on
> + * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
> * @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
> * @nb_pkts: count of packets to be send
> * @total_bytes: bytes accumulator that will be used for stats update
> */
> -static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
> - u32 nb_pkts, unsigned int *total_bytes)
> +static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring,
> + struct xsk_buff_pool *xsk_pool,
> + struct xdp_desc *descs, u32 nb_pkts,
> + unsigned int *total_bytes)
> {
> u32 batched, leftover, i;
>
> batched = ALIGN_DOWN(nb_pkts, PKTS_PER_BATCH);
> leftover = nb_pkts & (PKTS_PER_BATCH - 1);
> for (i = 0; i < batched; i += PKTS_PER_BATCH)
> - ice_xmit_pkt_batch(xdp_ring, &descs[i], total_bytes);
> + ice_xmit_pkt_batch(xdp_ring, xsk_pool, &descs[i], total_bytes);
> for (; i < batched + leftover; i++)
> - ice_xmit_pkt(xdp_ring, &descs[i], total_bytes);
> + ice_xmit_pkt(xdp_ring, xsk_pool, &descs[i], total_bytes);
> }
>
> /**
> @@ -1043,7 +1059,8 @@ static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *d
> */
> bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
> {
> - struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> + struct xsk_buff_pool *xsk_pool = READ_ONCE(xdp_ring->xsk_pool);
> + struct xdp_desc *descs = xsk_pool->tx_descs;
> u32 nb_pkts, nb_processed = 0;
> unsigned int total_bytes = 0;
> int budget;
> @@ -1057,25 +1074,26 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
> budget = ICE_DESC_UNUSED(xdp_ring);
> budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
>
> - nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> + nb_pkts = xsk_tx_peek_release_desc_batch(xsk_pool, budget);
> if (!nb_pkts)
> return true;
>
> if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> - ice_fill_tx_hw_ring(xdp_ring, descs, nb_processed, &total_bytes);
> + ice_fill_tx_hw_ring(xdp_ring, xsk_pool, descs, nb_processed,
> + &total_bytes);
> xdp_ring->next_to_use = 0;
> }
>
> - ice_fill_tx_hw_ring(xdp_ring, &descs[nb_processed], nb_pkts - nb_processed,
> - &total_bytes);
> + ice_fill_tx_hw_ring(xdp_ring, xsk_pool, &descs[nb_processed],
> + nb_pkts - nb_processed, &total_bytes);
>
> ice_set_rs_bit(xdp_ring);
> ice_xdp_ring_update_tail(xdp_ring);
> ice_update_tx_ring_stats(xdp_ring, nb_pkts, total_bytes);
>
> - if (xsk_uses_need_wakeup(xdp_ring->xsk_pool))
> - xsk_set_tx_need_wakeup(xdp_ring->xsk_pool);
> + if (xsk_uses_need_wakeup(xsk_pool))
> + xsk_set_tx_need_wakeup(xsk_pool);
>
> return nb_pkts < budget;
> }
> @@ -1108,7 +1126,7 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
>
> ring = vsi->rx_rings[queue_id]->xdp_ring;
>
> - if (!ring->xsk_pool)
> + if (!READ_ONCE(ring->xsk_pool))
> return -EINVAL;
>
> /* The idea here is that if NAPI is running, mark a miss, so
> diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.h b/drivers/net/ethernet/intel/ice/ice_xsk.h
> index 6fa181f080ef..4cd2d62a0836 100644
> --- a/drivers/net/ethernet/intel/ice/ice_xsk.h
> +++ b/drivers/net/ethernet/intel/ice/ice_xsk.h
> @@ -22,7 +22,8 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool,
> u16 qid);
> int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget);
> int ice_xsk_wakeup(struct net_device *netdev, u32 queue_id, u32 flags);
> -bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count);
> +bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
> + struct xsk_buff_pool *xsk_pool, u16 count);
> bool ice_xsk_any_rx_ring_ena(struct ice_vsi *vsi);
> void ice_xsk_clean_rx_ring(struct ice_rx_ring *rx_ring);
> void ice_xsk_clean_xdp_ring(struct ice_tx_ring *xdp_ring);
> @@ -51,6 +52,7 @@ ice_clean_rx_irq_zc(struct ice_rx_ring __always_unused *rx_ring,
>
> static inline bool
> ice_alloc_rx_bufs_zc(struct ice_rx_ring __always_unused *rx_ring,
> + struct xsk_buff_pool __always_unused *xsk_pool,
> u16 __always_unused count)
> {
> return false;
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
` (7 preceding siblings ...)
2024-05-29 11:23 ` [PATCH v2 iwl-net 8/8] ice: xsk: fix txq interrupt mapping Maciej Fijalkowski
@ 2024-05-31 23:52 ` Nelson, Shannon
8 siblings, 0 replies; 23+ messages in thread
From: Nelson, Shannon @ 2024-05-31 23:52 UTC (permalink / raw)
To: Maciej Fijalkowski, intel-wired-lan
Cc: netdev, anthony.l.nguyen, magnus.karlsson, michal.kubiak,
larysa.zaremba
On 5/29/2024 4:23 AM, Maciej Fijalkowski wrote:
>
> Hi,
>
> changes included in this patchset address an issue that customer has
> been facing when AF_XDP ZC Tx sockets were used in combination with flow
> control and regular Tx traffic.
>
> After executing:
> ethtool --set-priv-flags $dev link-down-on-close on
> ethtool -A $dev rx on tx on
>
> launching multiple ZC Tx sockets on $dev + pinging remote interface (so
> that regular Tx traffic is present) and then going through down/up of
> $dev, Tx timeout occured and then most of the time ice driver was unable
> to recover from that state.
>
> These patches combined together solve the described above issue on
> customer side. Main focus here is to forbid producing Tx descriptors
> when either carrier is not yet initialized or process of bringing
> interface down has already started.
>
> Thanks,
> Maciej
>
> v1->v2:
> - fix kdoc issues in patch 6 and 8
> - drop Larysa's patches for now
>
>
> Maciej Fijalkowski (7):
> ice: don't busy wait for Rx queue disable in ice_qp_dis()
> ice: replace synchronize_rcu with synchronize_net
> ice: modify error handling when setting XSK pool in ndo_bpf
> ice: toggle netif_carrier when setting up XSK pool
> ice: improve updating ice_{t,r}x_ring::xsk_pool
> ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
> ice: xsk: fix txq interrupt mapping
>
> Michal Kubiak (1):
> ice: respect netif readiness in AF_XDP ZC related ndo's
Aside from a couple minor notes, these seem reasonable.
For the set:
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
>
> drivers/net/ethernet/intel/ice/ice.h | 6 +-
> drivers/net/ethernet/intel/ice/ice_base.c | 4 +-
> drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
> drivers/net/ethernet/intel/ice/ice_txrx.c | 6 +-
> drivers/net/ethernet/intel/ice/ice_xsk.c | 150 +++++++++++++---------
> drivers/net/ethernet/intel/ice/ice_xsk.h | 4 +-
> 6 files changed, 101 insertions(+), 71 deletions(-)
>
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [Intel-wired-lan] [PATCH v2 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool
2024-05-29 11:23 ` [PATCH v2 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool Maciej Fijalkowski
@ 2024-06-04 6:45 ` Rout, ChandanX
0 siblings, 0 replies; 23+ messages in thread
From: Rout, ChandanX @ 2024-06-04 6:45 UTC (permalink / raw)
To: Fijalkowski, Maciej, intel-wired-lan@lists.osuosl.org
Cc: Zaremba, Larysa, netdev@vger.kernel.org, Kubiak, Michal,
Nguyen, Anthony L, Karlsson, Magnus, Kuruvinakunnel, George,
Nagaraju, Shwetha, Nagraj, Shravan, Pandey, Atul
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: Wednesday, May 29, 2024 4:54 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Zaremba, Larysa
><larysa.zaremba@intel.com>; netdev@vger.kernel.org; Kubiak, Michal
><michal.kubiak@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Karlsson, Magnus
><magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH v2 iwl-net 5/8] ice: toggle netif_carrier when
>setting up XSK pool
>
>This is done to prevent Tx timeout issues. One of the conditions checked by
>dev_watchdog(), which runs in the background, is netif_carrier_ok(), so let us
>turn the carrier off when we disable the queues that belong to a q_vector where
>an XSK pool is being configured.
>
>Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_xsk.c | 2 ++
> 1 file changed, 2 insertions(+)
>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [Intel-wired-lan] [PATCH v2 iwl-net 8/8] ice: xsk: fix txq interrupt mapping
2024-05-29 11:23 ` [PATCH v2 iwl-net 8/8] ice: xsk: fix txq interrupt mapping Maciej Fijalkowski
@ 2024-06-04 6:46 ` Rout, ChandanX
0 siblings, 0 replies; 23+ messages in thread
From: Rout, ChandanX @ 2024-06-04 6:46 UTC (permalink / raw)
To: Fijalkowski, Maciej, intel-wired-lan@lists.osuosl.org
Cc: Zaremba, Larysa, netdev@vger.kernel.org, Kubiak, Michal,
Nguyen, Anthony L, Karlsson, Magnus, Kuruvinakunnel, George,
Pandey, Atul, Nagaraju, Shwetha, Nagraj, Shravan
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: Wednesday, May 29, 2024 4:54 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Zaremba, Larysa
><larysa.zaremba@intel.com>; netdev@vger.kernel.org; Kubiak, Michal
><michal.kubiak@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Karlsson, Magnus
><magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH v2 iwl-net 8/8] ice: xsk: fix txq interrupt
>mapping
>
>ice_cfg_txq_interrupt() internally handles the XDP Tx ring. Do not use
>ice_for_each_tx_ring() in ice_qvec_cfg_msix(), as this causes us to treat the XDP
>ring that belongs to the queue vector as a Tx ring and therefore misconfigure the
>interrupts.
>
>Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_xsk.c | 24 ++++++++++++++----------
> 1 file changed, 14 insertions(+), 10 deletions(-)
>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [Intel-wired-lan] [PATCH v2 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's
2024-05-29 11:23 ` [PATCH v2 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
@ 2024-06-04 6:47 ` Rout, ChandanX
0 siblings, 0 replies; 23+ messages in thread
From: Rout, ChandanX @ 2024-06-04 6:47 UTC (permalink / raw)
To: Fijalkowski, Maciej, intel-wired-lan@lists.osuosl.org
Cc: Zaremba, Larysa, netdev@vger.kernel.org, Kubiak, Michal,
Nguyen, Anthony L, Karlsson, Magnus, Kuruvinakunnel, George,
Nagaraju, Shwetha, Nagraj, Shravan, Pandey, Atul
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: Wednesday, May 29, 2024 4:54 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Zaremba, Larysa
><larysa.zaremba@intel.com>; netdev@vger.kernel.org; Kubiak, Michal
><michal.kubiak@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Karlsson, Magnus
><magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH v2 iwl-net 1/8] ice: respect netif readiness in
>AF_XDP ZC related ndo's
>
>From: Michal Kubiak <michal.kubiak@intel.com>
>
>Address a scenario in which XSK ZC Tx produces descriptors to XDP Tx ring
>when link is either not yet fully initialized or process of stopping the netdev has
>already started. To avoid this, add checks against carrier readiness in
>ice_xsk_wakeup() and in ice_xmit_zc().
>One could argue that bailing out early in ice_xsk_wakeup() would be sufficient
>but given the fact that we produce Tx descriptors on behalf of NAPI that is
>triggered for Rx traffic, the latter is also needed.
>
>Bringing link up is an asynchronous event executed within ice_service_task so
>even though interface has been brought up there is still a time frame where link
>is not yet ok.
>
>Without this patch, when AF_XDP ZC Tx is used simultaneously with stack Tx, Tx
>timeouts occur after going through a link flap (admin brings the interface down,
>then up again). HW seems to be unable to transmit descriptors to the wire after
>the HW tail register bump, which in turn causes the __QUEUE_STATE_STACK_XOFF bit
>to be set forever, as netdev_tx_completed_queue() sees no cleaned bytes on input.
>
>Fixes: 126cdfe1007a ("ice: xsk: Improve AF_XDP ZC Tx and use batching API")
>Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [Intel-wired-lan] [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net
2024-05-29 11:23 ` [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net Maciej Fijalkowski
2024-05-31 23:49 ` Nelson, Shannon
@ 2024-06-04 6:49 ` Rout, ChandanX
1 sibling, 0 replies; 23+ messages in thread
From: Rout, ChandanX @ 2024-06-04 6:49 UTC (permalink / raw)
To: Fijalkowski, Maciej, intel-wired-lan@lists.osuosl.org
Cc: Zaremba, Larysa, netdev@vger.kernel.org, Kubiak, Michal,
Nguyen, Anthony L, Karlsson, Magnus, Kuruvinakunnel, George,
Pandey, Atul, Nagraj, Shravan, Nagaraju, Shwetha
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: Wednesday, May 29, 2024 4:54 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Zaremba, Larysa
><larysa.zaremba@intel.com>; netdev@vger.kernel.org; Kubiak, Michal
><michal.kubiak@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Karlsson, Magnus
><magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu
>with synchronize_net
>
>Given that ice_qp_dis() is called under rtnl_lock, synchronize_net() can be called
>instead of synchronize_rcu() so that XDP rings can finish their job in a faster way.
>Also, let us do this earlier in the XSK queue disable flow.
>
>Additionally, turn off the regular Tx queue before disabling irqs and NAPI.
>
>Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [Intel-wired-lan] [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t, r}x_ring::xsk_pool
2024-05-29 11:23 ` [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool Maciej Fijalkowski
2024-05-31 23:49 ` Nelson, Shannon
@ 2024-06-04 6:50 ` Rout, ChandanX
1 sibling, 0 replies; 23+ messages in thread
From: Rout, ChandanX @ 2024-06-04 6:50 UTC (permalink / raw)
To: Fijalkowski, Maciej, intel-wired-lan@lists.osuosl.org
Cc: Zaremba, Larysa, netdev@vger.kernel.org, Kubiak, Michal,
Nguyen, Anthony L, Karlsson, Magnus, Kuruvinakunnel, George,
Pandey, Atul, Nagaraju, Shwetha, Nagraj, Shravan
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: Wednesday, May 29, 2024 4:54 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Zaremba, Larysa
><larysa.zaremba@intel.com>; netdev@vger.kernel.org; Kubiak, Michal
><michal.kubiak@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Karlsson, Magnus
><magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,
>r}x_ring::xsk_pool
>
>xsk_buff_pool pointers that ice ring structs hold are updated via ndo_bpf that is
>executed in process context while it can be read by remote CPU at the same
>time within NAPI poll. Use synchronize_net() after pointer update and
>{READ,WRITE}_ONCE() when working with mentioned pointer.
>
>Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice.h | 6 +-
> drivers/net/ethernet/intel/ice/ice_base.c | 4 +-
>drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
>drivers/net/ethernet/intel/ice/ice_txrx.c | 4 +-
>drivers/net/ethernet/intel/ice/ice_xsk.c | 78 ++++++++++++++---------
>drivers/net/ethernet/intel/ice/ice_xsk.h | 4 +-
> 6 files changed, 59 insertions(+), 39 deletions(-)
>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [Intel-wired-lan] [PATCH v2 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog
2024-05-29 11:23 ` [PATCH v2 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog Maciej Fijalkowski
@ 2024-06-04 6:52 ` Rout, ChandanX
0 siblings, 0 replies; 23+ messages in thread
From: Rout, ChandanX @ 2024-06-04 6:52 UTC (permalink / raw)
To: Fijalkowski, Maciej, intel-wired-lan@lists.osuosl.org
Cc: Zaremba, Larysa, netdev@vger.kernel.org, Kubiak, Michal,
Nguyen, Anthony L, Karlsson, Magnus, Kuruvinakunnel, George,
Nagraj, Shravan, Nagaraju, Shwetha, Pandey, Atul
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: Wednesday, May 29, 2024 4:54 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Zaremba, Larysa
><larysa.zaremba@intel.com>; netdev@vger.kernel.org; Kubiak, Michal
><michal.kubiak@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Karlsson, Magnus
><magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH v2 iwl-net 7/8] ice: add missing WRITE_ONCE
>when clearing ice_rx_ring::xdp_prog
>
>It is read by the data path and modified from process context on a remote CPU, so
>WRITE_ONCE is needed to clear the pointer.
>
>Fixes: efc2214b6047 ("ice: Add support for XDP")
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [Intel-wired-lan] [PATCH v2 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis()
2024-05-29 11:23 ` [PATCH v2 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis() Maciej Fijalkowski
@ 2024-06-04 6:53 ` Rout, ChandanX
0 siblings, 0 replies; 23+ messages in thread
From: Rout, ChandanX @ 2024-06-04 6:53 UTC (permalink / raw)
To: Fijalkowski, Maciej, intel-wired-lan@lists.osuosl.org
Cc: Zaremba, Larysa, netdev@vger.kernel.org, Kubiak, Michal,
Nguyen, Anthony L, Karlsson, Magnus, Kuruvinakunnel, George,
Pandey, Atul, Nagaraju, Shwetha
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: Wednesday, May 29, 2024 4:54 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Zaremba, Larysa
><larysa.zaremba@intel.com>; netdev@vger.kernel.org; Kubiak, Michal
><michal.kubiak@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Karlsson, Magnus
><magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH v2 iwl-net 2/8] ice: don't busy wait for Rx
>queue disable in ice_qp_dis()
>
>When the ice driver is spammed with multiple xdpsock instances and flow control is
>enabled, there are cases when the Rx queue gets stuck and is unable to reflect the
>disable state in the QRX_CTRL register. A similar issue has previously been addressed
>in commit 13a6233b033f ("ice: Add support to enable/disable all Rx queues
>before waiting").
>
>To work around this, let us simply not wait for a disabled state, as a later patch
>will make sure that regardless of the error encountered in the process of disabling a
>queue pair, the Rx queue will be enabled.
>
>Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_xsk.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [Intel-wired-lan] [PATCH v2 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf
2024-05-29 11:23 ` [PATCH v2 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf Maciej Fijalkowski
@ 2024-06-04 6:54 ` Rout, ChandanX
0 siblings, 0 replies; 23+ messages in thread
From: Rout, ChandanX @ 2024-06-04 6:54 UTC (permalink / raw)
To: Fijalkowski, Maciej, intel-wired-lan@lists.osuosl.org
Cc: Zaremba, Larysa, netdev@vger.kernel.org, Kubiak, Michal,
Nguyen, Anthony L, Karlsson, Magnus, Kuruvinakunnel, George,
Pandey, Atul, Nagraj, Shravan, Nagaraju, Shwetha
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
>Fijalkowski, Maciej
>Sent: Wednesday, May 29, 2024 4:54 PM
>To: intel-wired-lan@lists.osuosl.org
>Cc: Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; Zaremba, Larysa
><larysa.zaremba@intel.com>; netdev@vger.kernel.org; Kubiak, Michal
><michal.kubiak@intel.com>; Nguyen, Anthony L
><anthony.l.nguyen@intel.com>; Karlsson, Magnus
><magnus.karlsson@intel.com>
>Subject: [Intel-wired-lan] [PATCH v2 iwl-net 4/8] ice: modify error handling
>when setting XSK pool in ndo_bpf
>
>Don't bail out right away when spotting an error within ice_qp_{dis,ena}(), but rather
>track the error and go through the whole flow of disabling and enabling the queue pair.
>
>Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
>Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
>---
> drivers/net/ethernet/intel/ice/ice_xsk.c | 30 +++++++++++++-----------
> 1 file changed, 16 insertions(+), 14 deletions(-)
>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net
2024-05-31 23:49 ` Nelson, Shannon
@ 2024-06-04 10:49 ` Maciej Fijalkowski
0 siblings, 0 replies; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 10:49 UTC (permalink / raw)
To: Nelson, Shannon
Cc: intel-wired-lan, netdev, anthony.l.nguyen, magnus.karlsson,
michal.kubiak, larysa.zaremba
On Fri, May 31, 2024 at 04:49:01PM -0700, Nelson, Shannon wrote:
> On 5/29/2024 4:23 AM, Maciej Fijalkowski wrote:
> >
> > Given that ice_qp_dis() is called under rtnl_lock, synchronize_net() can
> > be called instead of synchronize_rcu() so that XDP rings can finish their
> > job in a faster way. Also, let us do this earlier in the XSK queue disable
> > flow.
> >
> > Additionally, turn off the regular Tx queue before disabling irqs and NAPI.
> >
> > Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> > drivers/net/ethernet/intel/ice/ice_xsk.c | 6 +++---
> > 1 file changed, 3 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > index 4f606a1055b0..e93cb0ca4106 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > @@ -53,7 +53,6 @@ static void ice_qp_clean_rings(struct ice_vsi *vsi, u16 q_idx)
> > {
> > ice_clean_tx_ring(vsi->tx_rings[q_idx]);
> > if (ice_is_xdp_ena_vsi(vsi)) {
> > - synchronize_rcu();
> > ice_clean_tx_ring(vsi->xdp_rings[q_idx]);
> > }
>
> With only one statement left in the block you'll probably want to remove the
> {}'s
Thanks for catching this. To explain myself here, I was getting rid of a
patch that removed the XDP prog presence check altogether, because it is a
false positive - we can't get into these queue pair disable/enable
functions if there is no XDP prog on the interface. The reason I was
tossing it out of the patch set was that it was not a -net candidate.
Thanks again, will fix in v3.
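With the braces dropped, the hunk would then end up roughly like this (a
sketch of the expected v3 shape based on the quoted context, not the final
patch):

static void ice_qp_clean_rings(struct ice_vsi *vsi, u16 q_idx)
{
        /* stack Tx ring is always cleaned */
        ice_clean_tx_ring(vsi->tx_rings[q_idx]);
        /* XDP Tx ring only when an XDP prog is attached to the VSI */
        if (ice_is_xdp_ena_vsi(vsi))
                ice_clean_tx_ring(vsi->xdp_rings[q_idx]);
        ice_clean_rx_ring(vsi->rx_rings[q_idx]);
}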
>
> sln
>
>
> > ice_clean_rx_ring(vsi->rx_rings[q_idx]);
> > @@ -180,11 +179,12 @@ static int ice_qp_dis(struct ice_vsi *vsi, u16 q_idx)
> > usleep_range(1000, 2000);
> > }
> >
> > + synchronize_net();
> > + netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> > +
> > ice_qvec_dis_irq(vsi, rx_ring, q_vector);
> > ice_qvec_toggle_napi(vsi, q_vector, false);
> >
> > - netif_tx_stop_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> > -
> > ice_fill_txq_meta(vsi, tx_ring, &txq_meta);
> > err = ice_vsi_stop_tx_ring(vsi, ICE_NO_RESET, 0, tx_ring, &txq_meta);
> > if (err)
> > --
> > 2.34.1
> >
> >
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool
2024-05-31 23:49 ` Nelson, Shannon
@ 2024-06-04 10:52 ` Maciej Fijalkowski
2024-06-04 11:12 ` Maciej Fijalkowski
0 siblings, 1 reply; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 10:52 UTC (permalink / raw)
To: Nelson, Shannon
Cc: intel-wired-lan, netdev, anthony.l.nguyen, magnus.karlsson,
michal.kubiak, larysa.zaremba
On Fri, May 31, 2024 at 04:49:05PM -0700, Nelson, Shannon wrote:
> On 5/29/2024 4:23 AM, Maciej Fijalkowski wrote:
> >
> > xsk_buff_pool pointers that ice ring structs hold are updated via
> > ndo_bpf that is executed in process context while it can be read by
> > remote CPU at the same time within NAPI poll. Use synchronize_net()
> > after pointer update and {READ,WRITE}_ONCE() when working with mentioned
> > pointer.
> >
> > Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > ---
> > drivers/net/ethernet/intel/ice/ice.h | 6 +-
> > drivers/net/ethernet/intel/ice/ice_base.c | 4 +-
> > drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
> > drivers/net/ethernet/intel/ice/ice_txrx.c | 4 +-
> > drivers/net/ethernet/intel/ice/ice_xsk.c | 78 ++++++++++++++---------
> > drivers/net/ethernet/intel/ice/ice_xsk.h | 4 +-
> > 6 files changed, 59 insertions(+), 39 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> > index da8c8afebc93..701a61d791dd 100644
> > --- a/drivers/net/ethernet/intel/ice/ice.h
> > +++ b/drivers/net/ethernet/intel/ice/ice.h
> > @@ -771,12 +771,12 @@ static inline struct xsk_buff_pool *ice_get_xp_from_qid(struct ice_vsi *vsi,
> > * Returns a pointer to xsk_buff_pool structure if there is a buffer pool
> > * present, NULL otherwise.
> > */
> > -static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_rx_ring *ring)
> > +static inline void ice_xsk_pool(struct ice_rx_ring *ring)
>
> Since this patch is changing the logic here you might want to change the
> name of the function. Instead of getting a pointer with no side effects it
> is now doing something, so a better name might be ice_set_xsk_pool() to
> reflect the resulting side effect.
Makes sense, plus the function description needs some adjustment. Will fix
in v3.
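For the record, the reason the helper now has a side effect at all is the
publish/consume pattern described in the commit message; roughly like below
(a generic sketch, not the exact ice code - set_pool()/poll_ring()/work_with()
are just placeholders):

        /* config path (ndo_bpf, under rtnl_lock) */
        static void set_pool(struct ice_rx_ring *ring, struct xsk_buff_pool *pool)
        {
                WRITE_ONCE(ring->xsk_pool, pool);       /* publish the new pointer */
                synchronize_net();                      /* wait for in-flight NAPI polls */
        }

        /* hot path (NAPI poll) */
        static void poll_ring(struct ice_rx_ring *ring)
        {
                struct xsk_buff_pool *xsk_pool = READ_ONCE(ring->xsk_pool);

                if (xsk_pool)
                        work_with(xsk_pool);            /* use the local snapshot from here on */
        }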
>
> sln
>
> > {
> > struct ice_vsi *vsi = ring->vsi;
> > u16 qid = ring->q_index;
> >
> > - return ice_get_xp_from_qid(vsi, qid);
> > + WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
> > }
> >
> > /**
> > @@ -801,7 +801,7 @@ static inline void ice_tx_xsk_pool(struct ice_vsi *vsi, u16 qid)
> > if (!ring)
> > return;
> >
> > - ring->xsk_pool = ice_get_xp_from_qid(vsi, qid);
> > + WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
> > }
> >
> > /**
> > diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
> > index 5d396c1a7731..f3dfce136106 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_base.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_base.c
> > @@ -536,7 +536,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
> > return err;
> > }
> >
> > - ring->xsk_pool = ice_xsk_pool(ring);
> > + ice_xsk_pool(ring);
> > if (ring->xsk_pool) {
> > xdp_rxq_info_unreg(&ring->xdp_rxq);
> >
> > @@ -597,7 +597,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
> > return 0;
> > }
> >
> > - ok = ice_alloc_rx_bufs_zc(ring, num_bufs);
> > + ok = ice_alloc_rx_bufs_zc(ring, ring->xsk_pool, num_bufs);
> > if (!ok) {
> > u16 pf_q = ring->vsi->rxq_map[ring->q_index];
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > index 1b61ca3a6eb6..15a6805ac2a1 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > @@ -2946,7 +2946,7 @@ static void ice_vsi_rx_napi_schedule(struct ice_vsi *vsi)
> > ice_for_each_rxq(vsi, i) {
> > struct ice_rx_ring *rx_ring = vsi->rx_rings[i];
> >
> > - if (rx_ring->xsk_pool)
> > + if (READ_ONCE(rx_ring->xsk_pool))
> > napi_schedule(&rx_ring->q_vector->napi);
> > }
> > }
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > index 8bb743f78fcb..f4b2b1bca234 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > @@ -1523,7 +1523,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
> > ice_for_each_tx_ring(tx_ring, q_vector->tx) {
> > bool wd;
> >
> > - if (tx_ring->xsk_pool)
> > + if (READ_ONCE(tx_ring->xsk_pool))
> > wd = ice_xmit_zc(tx_ring);
> > else if (ice_ring_is_xdp(tx_ring))
> > wd = true;
> > @@ -1556,7 +1556,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
> > * comparison in the irq context instead of many inside the
> > * ice_clean_rx_irq function and makes the codebase cleaner.
> > */
> > - cleaned = rx_ring->xsk_pool ?
> > + cleaned = READ_ONCE(rx_ring->xsk_pool) ?
> > ice_clean_rx_irq_zc(rx_ring, budget_per_ring) :
> > ice_clean_rx_irq(rx_ring, budget_per_ring);
> > work_done += cleaned;
> > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > index 8c5006f37310..693f0e3a37ce 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > @@ -250,6 +250,8 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
> > ice_qvec_toggle_napi(vsi, q_vector, true);
> > ice_qvec_ena_irq(vsi, q_vector);
> >
> > + /* make sure NAPI sees updated ice_{t,x}_ring::xsk_pool */
> > + synchronize_net();
> > netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> > netif_carrier_on(vsi->netdev);
> > clear_bit(ICE_CFG_BUSY, vsi->state);
> > @@ -461,6 +463,7 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
> > /**
> > * __ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
> > * @rx_ring: Rx ring
> > + * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
> > * @count: The number of buffers to allocate
> > *
> > * Place the @count of descriptors onto Rx ring. Handle the ring wrap
> > @@ -469,7 +472,8 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
> > *
> > * Returns true if all allocations were successful, false if any fail.
> > */
> > -static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > +static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
> > + struct xsk_buff_pool *xsk_pool, u16 count)
> > {
> > u32 nb_buffs_extra = 0, nb_buffs = 0;
> > union ice_32b_rx_flex_desc *rx_desc;
> > @@ -481,8 +485,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > xdp = ice_xdp_buf(rx_ring, ntu);
> >
> > if (ntu + count >= rx_ring->count) {
> > - nb_buffs_extra = ice_fill_rx_descs(rx_ring->xsk_pool, xdp,
> > - rx_desc,
> > + nb_buffs_extra = ice_fill_rx_descs(xsk_pool, xdp, rx_desc,
> > rx_ring->count - ntu);
> > if (nb_buffs_extra != rx_ring->count - ntu) {
> > ntu += nb_buffs_extra;
> > @@ -495,7 +498,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > ice_release_rx_desc(rx_ring, 0);
> > }
> >
> > - nb_buffs = ice_fill_rx_descs(rx_ring->xsk_pool, xdp, rx_desc, count);
> > + nb_buffs = ice_fill_rx_descs(xsk_pool, xdp, rx_desc, count);
> >
> > ntu += nb_buffs;
> > if (ntu == rx_ring->count)
> > @@ -511,6 +514,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > /**
> > * ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
> > * @rx_ring: Rx ring
> > + * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
> > * @count: The number of buffers to allocate
> > *
> > * Wrapper for internal allocation routine; figure out how many tail
> > @@ -518,7 +522,8 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > *
> > * Returns true if all calls to internal alloc routine succeeded
> > */
> > -bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > +bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
> > + struct xsk_buff_pool *xsk_pool, u16 count)
> > {
> > u16 rx_thresh = ICE_RING_QUARTER(rx_ring);
> > u16 leftover, i, tail_bumps;
> > @@ -527,9 +532,9 @@ bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > leftover = count - (tail_bumps * rx_thresh);
> >
> > for (i = 0; i < tail_bumps; i++)
> > - if (!__ice_alloc_rx_bufs_zc(rx_ring, rx_thresh))
> > + if (!__ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, rx_thresh))
> > return false;
> > - return __ice_alloc_rx_bufs_zc(rx_ring, leftover);
> > + return __ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, leftover);
> > }
> >
> > /**
> > @@ -650,7 +655,7 @@ static u32 ice_clean_xdp_irq_zc(struct ice_tx_ring *xdp_ring)
> > if (xdp_ring->next_to_clean >= cnt)
> > xdp_ring->next_to_clean -= cnt;
> > if (xsk_frames)
> > - xsk_tx_completed(xdp_ring->xsk_pool, xsk_frames);
> > + xsk_tx_completed(READ_ONCE(xdp_ring->xsk_pool), xsk_frames);
> >
> > return completed_frames;
> > }
> > @@ -702,7 +707,8 @@ static int ice_xmit_xdp_tx_zc(struct xdp_buff *xdp,
> > dma_addr_t dma;
> >
> > dma = xsk_buff_xdp_get_dma(xdp);
> > - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, size);
> > + xsk_buff_raw_dma_sync_for_device(READ_ONCE(xdp_ring->xsk_pool),
> > + dma, size);
> >
> > tx_buf->xdp = xdp;
> > tx_buf->type = ICE_TX_BUF_XSK_TX;
> > @@ -760,7 +766,8 @@ ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
> > if (!err)
> > return ICE_XDP_REDIR;
> > - if (xsk_uses_need_wakeup(rx_ring->xsk_pool) && err == -ENOBUFS)
> > + if (xsk_uses_need_wakeup(READ_ONCE(rx_ring->xsk_pool)) &&
> > + err == -ENOBUFS)
> > result = ICE_XDP_EXIT;
> > else
> > result = ICE_XDP_CONSUMED;
> > @@ -829,8 +836,8 @@ ice_add_xsk_frag(struct ice_rx_ring *rx_ring, struct xdp_buff *first,
> > */
> > int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> > {
> > + struct xsk_buff_pool *xsk_pool = READ_ONCE(rx_ring->xsk_pool);
> > unsigned int total_rx_bytes = 0, total_rx_packets = 0;
> > - struct xsk_buff_pool *xsk_pool = rx_ring->xsk_pool;
> > u32 ntc = rx_ring->next_to_clean;
> > u32 ntu = rx_ring->next_to_use;
> > struct xdp_buff *first = NULL;
> > @@ -942,7 +949,8 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> > rx_ring->next_to_clean = ntc;
> > entries_to_alloc = ICE_RX_DESC_UNUSED(rx_ring);
> > if (entries_to_alloc > ICE_RING_QUARTER(rx_ring))
> > - failure |= !ice_alloc_rx_bufs_zc(rx_ring, entries_to_alloc);
> > + failure |= !ice_alloc_rx_bufs_zc(rx_ring, xsk_pool,
> > + entries_to_alloc);
> >
> > ice_finalize_xdp_rx(xdp_ring, xdp_xmit, 0);
> > ice_update_rx_ring_stats(rx_ring, total_rx_packets, total_rx_bytes);
> > @@ -965,17 +973,19 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> > /**
> > * ice_xmit_pkt - produce a single HW Tx descriptor out of AF_XDP descriptor
> > * @xdp_ring: XDP ring to produce the HW Tx descriptor on
> > + * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
> > * @desc: AF_XDP descriptor to pull the DMA address and length from
> > * @total_bytes: bytes accumulator that will be used for stats update
> > */
> > -static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
> > +static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring,
> > + struct xsk_buff_pool *xsk_pool, struct xdp_desc *desc,
> > unsigned int *total_bytes)
> > {
> > struct ice_tx_desc *tx_desc;
> > dma_addr_t dma;
> >
> > - dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, desc->addr);
> > - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, desc->len);
> > + dma = xsk_buff_raw_get_dma(xsk_pool, desc->addr);
> > + xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, desc->len);
> >
> > tx_desc = ICE_TX_DESC(xdp_ring, xdp_ring->next_to_use++);
> > tx_desc->buf_addr = cpu_to_le64(dma);
> > @@ -988,10 +998,13 @@ static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
> > /**
> > * ice_xmit_pkt_batch - produce a batch of HW Tx descriptors out of AF_XDP descriptors
> > * @xdp_ring: XDP ring to produce the HW Tx descriptors on
> > + * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
> > * @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
> > * @total_bytes: bytes accumulator that will be used for stats update
> > */
> > -static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
> > +static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring,
> > + struct xsk_buff_pool *xsk_pool,
> > + struct xdp_desc *descs,
> > unsigned int *total_bytes)
> > {
> > u16 ntu = xdp_ring->next_to_use;
> > @@ -1001,8 +1014,8 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
> > loop_unrolled_for(i = 0; i < PKTS_PER_BATCH; i++) {
> > dma_addr_t dma;
> >
> > - dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, descs[i].addr);
> > - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, descs[i].len);
> > + dma = xsk_buff_raw_get_dma(xsk_pool, descs[i].addr);
> > + xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, descs[i].len);
> >
> > tx_desc = ICE_TX_DESC(xdp_ring, ntu++);
> > tx_desc->buf_addr = cpu_to_le64(dma);
> > @@ -1018,21 +1031,24 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
> > /**
> > * ice_fill_tx_hw_ring - produce the number of Tx descriptors onto ring
> > * @xdp_ring: XDP ring to produce the HW Tx descriptors on
> > + * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
> > * @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
> > * @nb_pkts: count of packets to be send
> > * @total_bytes: bytes accumulator that will be used for stats update
> > */
> > -static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
> > - u32 nb_pkts, unsigned int *total_bytes)
> > +static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring,
> > + struct xsk_buff_pool *xsk_pool,
> > + struct xdp_desc *descs, u32 nb_pkts,
> > + unsigned int *total_bytes)
> > {
> > u32 batched, leftover, i;
> >
> > batched = ALIGN_DOWN(nb_pkts, PKTS_PER_BATCH);
> > leftover = nb_pkts & (PKTS_PER_BATCH - 1);
> > for (i = 0; i < batched; i += PKTS_PER_BATCH)
> > - ice_xmit_pkt_batch(xdp_ring, &descs[i], total_bytes);
> > + ice_xmit_pkt_batch(xdp_ring, xsk_pool, &descs[i], total_bytes);
> > for (; i < batched + leftover; i++)
> > - ice_xmit_pkt(xdp_ring, &descs[i], total_bytes);
> > + ice_xmit_pkt(xdp_ring, xsk_pool, &descs[i], total_bytes);
> > }
> >
> > /**
> > @@ -1043,7 +1059,8 @@ static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *d
> > */
> > bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
> > {
> > - struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> > + struct xsk_buff_pool *xsk_pool = READ_ONCE(xdp_ring->xsk_pool);
> > + struct xdp_desc *descs = xsk_pool->tx_descs;
> > u32 nb_pkts, nb_processed = 0;
> > unsigned int total_bytes = 0;
> > int budget;
> > @@ -1057,25 +1074,26 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
> > budget = ICE_DESC_UNUSED(xdp_ring);
> > budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
> >
> > - nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> > + nb_pkts = xsk_tx_peek_release_desc_batch(xsk_pool, budget);
> > if (!nb_pkts)
> > return true;
> >
> > if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> > nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> > - ice_fill_tx_hw_ring(xdp_ring, descs, nb_processed, &total_bytes);
> > + ice_fill_tx_hw_ring(xdp_ring, xsk_pool, descs, nb_processed,
> > + &total_bytes);
> > xdp_ring->next_to_use = 0;
> > }
> >
> > - ice_fill_tx_hw_ring(xdp_ring, &descs[nb_processed], nb_pkts - nb_processed,
> > - &total_bytes);
> > + ice_fill_tx_hw_ring(xdp_ring, xsk_pool, &descs[nb_processed],
> > + nb_pkts - nb_processed, &total_bytes);
> >
> > ice_set_rs_bit(xdp_ring);
> > ice_xdp_ring_update_tail(xdp_ring);
> > ice_update_tx_ring_stats(xdp_ring, nb_pkts, total_bytes);
> >
> > - if (xsk_uses_need_wakeup(xdp_ring->xsk_pool))
> > - xsk_set_tx_need_wakeup(xdp_ring->xsk_pool);
> > + if (xsk_uses_need_wakeup(xsk_pool))
> > + xsk_set_tx_need_wakeup(xsk_pool);
> >
> > return nb_pkts < budget;
> > }
> > @@ -1108,7 +1126,7 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
> >
> > ring = vsi->rx_rings[queue_id]->xdp_ring;
> >
> > - if (!ring->xsk_pool)
> > + if (!READ_ONCE(ring->xsk_pool))
> > return -EINVAL;
> >
> > /* The idea here is that if NAPI is running, mark a miss, so
> > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.h b/drivers/net/ethernet/intel/ice/ice_xsk.h
> > index 6fa181f080ef..4cd2d62a0836 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_xsk.h
> > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.h
> > @@ -22,7 +22,8 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool,
> > u16 qid);
> > int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget);
> > int ice_xsk_wakeup(struct net_device *netdev, u32 queue_id, u32 flags);
> > -bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count);
> > +bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
> > + struct xsk_buff_pool *xsk_pool, u16 count);
> > bool ice_xsk_any_rx_ring_ena(struct ice_vsi *vsi);
> > void ice_xsk_clean_rx_ring(struct ice_rx_ring *rx_ring);
> > void ice_xsk_clean_xdp_ring(struct ice_tx_ring *xdp_ring);
> > @@ -51,6 +52,7 @@ ice_clean_rx_irq_zc(struct ice_rx_ring __always_unused *rx_ring,
> >
> > static inline bool
> > ice_alloc_rx_bufs_zc(struct ice_rx_ring __always_unused *rx_ring,
> > + struct xsk_buff_pool __always_unused *xsk_pool,
> > u16 __always_unused count)
> > {
> > return false;
> > --
> > 2.34.1
> >
> >
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool
2024-06-04 10:52 ` Maciej Fijalkowski
@ 2024-06-04 11:12 ` Maciej Fijalkowski
0 siblings, 0 replies; 23+ messages in thread
From: Maciej Fijalkowski @ 2024-06-04 11:12 UTC (permalink / raw)
To: Nelson, Shannon
Cc: intel-wired-lan, netdev, anthony.l.nguyen, magnus.karlsson,
michal.kubiak, larysa.zaremba
On Tue, Jun 04, 2024 at 12:52:16PM +0200, Maciej Fijalkowski wrote:
> On Fri, May 31, 2024 at 04:49:05PM -0700, Nelson, Shannon wrote:
> > On 5/29/2024 4:23 AM, Maciej Fijalkowski wrote:
> > >
> > > xsk_buff_pool pointers that ice ring structs hold are updated via
> > > ndo_bpf that is executed in process context while it can be read by
> > > remote CPU at the same time within NAPI poll. Use synchronize_net()
> > > after pointer update and {READ,WRITE}_ONCE() when working with mentioned
> > > pointer.
> > >
> > > Fixes: 2d4238f55697 ("ice: Add support for AF_XDP")
> > > Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > ---
> > > drivers/net/ethernet/intel/ice/ice.h | 6 +-
> > > drivers/net/ethernet/intel/ice/ice_base.c | 4 +-
> > > drivers/net/ethernet/intel/ice/ice_main.c | 2 +-
> > > drivers/net/ethernet/intel/ice/ice_txrx.c | 4 +-
> > > drivers/net/ethernet/intel/ice/ice_xsk.c | 78 ++++++++++++++---------
> > > drivers/net/ethernet/intel/ice/ice_xsk.h | 4 +-
> > > 6 files changed, 59 insertions(+), 39 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h
> > > index da8c8afebc93..701a61d791dd 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice.h
> > > +++ b/drivers/net/ethernet/intel/ice/ice.h
> > > @@ -771,12 +771,12 @@ static inline struct xsk_buff_pool *ice_get_xp_from_qid(struct ice_vsi *vsi,
> > > * Returns a pointer to xsk_buff_pool structure if there is a buffer pool
> > > * present, NULL otherwise.
> > > */
> > > -static inline struct xsk_buff_pool *ice_xsk_pool(struct ice_rx_ring *ring)
> > > +static inline void ice_xsk_pool(struct ice_rx_ring *ring)
> >
> > Since this patch is changing the logic here you might want to change the
> > name of the function. Instead of getting a pointer with no side effects it
> > is now doing something, so a better name might be ice_set_xsk_pool() to
> > reflect the resulting side effect.
>
> Makes sense, plus the function description needs some adjustment. Will fix
> on v3.
Having second thoughts on this, I'll go only with
s/ice_xsk_pool/ice_rx_xsk_pool/ to align with the Tx side. The Tx side also
lacks the 'set' keyword in its naming, but I don't want to touch that here.
I currently don't see a reason for these functions to be inlined, as we are
not setting xsk_pool pointers in the hot path at all, so I can follow up
later with a patch dedicated to -next that fixes the naming and drops the
inlining.
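So for v3 the Rx helper would look more or less like below (name as per the
rename above, body unchanged from this patch - just a sketch, not the final
version):

static inline void ice_rx_xsk_pool(struct ice_rx_ring *ring)
{
        struct ice_vsi *vsi = ring->vsi;
        u16 qid = ring->q_index;

        WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
}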
>
> >
> > sln
> >
> > > {
> > > struct ice_vsi *vsi = ring->vsi;
> > > u16 qid = ring->q_index;
> > >
> > > - return ice_get_xp_from_qid(vsi, qid);
> > > + WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
> > > }
> > >
> > > /**
> > > @@ -801,7 +801,7 @@ static inline void ice_tx_xsk_pool(struct ice_vsi *vsi, u16 qid)
> > > if (!ring)
> > > return;
> > >
> > > - ring->xsk_pool = ice_get_xp_from_qid(vsi, qid);
> > > + WRITE_ONCE(ring->xsk_pool, ice_get_xp_from_qid(vsi, qid));
> > > }
> > >
> > > /**
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c
> > > index 5d396c1a7731..f3dfce136106 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_base.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_base.c
> > > @@ -536,7 +536,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
> > > return err;
> > > }
> > >
> > > - ring->xsk_pool = ice_xsk_pool(ring);
> > > + ice_xsk_pool(ring);
> > > if (ring->xsk_pool) {
> > > xdp_rxq_info_unreg(&ring->xdp_rxq);
> > >
> > > @@ -597,7 +597,7 @@ static int ice_vsi_cfg_rxq(struct ice_rx_ring *ring)
> > > return 0;
> > > }
> > >
> > > - ok = ice_alloc_rx_bufs_zc(ring, num_bufs);
> > > + ok = ice_alloc_rx_bufs_zc(ring, ring->xsk_pool, num_bufs);
> > > if (!ok) {
> > > u16 pf_q = ring->vsi->rxq_map[ring->q_index];
> > >
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
> > > index 1b61ca3a6eb6..15a6805ac2a1 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_main.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> > > @@ -2946,7 +2946,7 @@ static void ice_vsi_rx_napi_schedule(struct ice_vsi *vsi)
> > > ice_for_each_rxq(vsi, i) {
> > > struct ice_rx_ring *rx_ring = vsi->rx_rings[i];
> > >
> > > - if (rx_ring->xsk_pool)
> > > + if (READ_ONCE(rx_ring->xsk_pool))
> > > napi_schedule(&rx_ring->q_vector->napi);
> > > }
> > > }
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > index 8bb743f78fcb..f4b2b1bca234 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > > @@ -1523,7 +1523,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
> > > ice_for_each_tx_ring(tx_ring, q_vector->tx) {
> > > bool wd;
> > >
> > > - if (tx_ring->xsk_pool)
> > > + if (READ_ONCE(tx_ring->xsk_pool))
> > > wd = ice_xmit_zc(tx_ring);
> > > else if (ice_ring_is_xdp(tx_ring))
> > > wd = true;
> > > @@ -1556,7 +1556,7 @@ int ice_napi_poll(struct napi_struct *napi, int budget)
> > > * comparison in the irq context instead of many inside the
> > > * ice_clean_rx_irq function and makes the codebase cleaner.
> > > */
> > > - cleaned = rx_ring->xsk_pool ?
> > > + cleaned = READ_ONCE(rx_ring->xsk_pool) ?
> > > ice_clean_rx_irq_zc(rx_ring, budget_per_ring) :
> > > ice_clean_rx_irq(rx_ring, budget_per_ring);
> > > work_done += cleaned;
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.c b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > > index 8c5006f37310..693f0e3a37ce 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_xsk.c
> > > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.c
> > > @@ -250,6 +250,8 @@ static int ice_qp_ena(struct ice_vsi *vsi, u16 q_idx)
> > > ice_qvec_toggle_napi(vsi, q_vector, true);
> > > ice_qvec_ena_irq(vsi, q_vector);
> > >
> > > + /* make sure NAPI sees updated ice_{t,x}_ring::xsk_pool */
> > > + synchronize_net();
> > > netif_tx_start_queue(netdev_get_tx_queue(vsi->netdev, q_idx));
> > > netif_carrier_on(vsi->netdev);
> > > clear_bit(ICE_CFG_BUSY, vsi->state);
> > > @@ -461,6 +463,7 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
> > > /**
> > > * __ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
> > > * @rx_ring: Rx ring
> > > + * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
> > > * @count: The number of buffers to allocate
> > > *
> > > * Place the @count of descriptors onto Rx ring. Handle the ring wrap
> > > @@ -469,7 +472,8 @@ static u16 ice_fill_rx_descs(struct xsk_buff_pool *pool, struct xdp_buff **xdp,
> > > *
> > > * Returns true if all allocations were successful, false if any fail.
> > > */
> > > -static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > > +static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
> > > + struct xsk_buff_pool *xsk_pool, u16 count)
> > > {
> > > u32 nb_buffs_extra = 0, nb_buffs = 0;
> > > union ice_32b_rx_flex_desc *rx_desc;
> > > @@ -481,8 +485,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > > xdp = ice_xdp_buf(rx_ring, ntu);
> > >
> > > if (ntu + count >= rx_ring->count) {
> > > - nb_buffs_extra = ice_fill_rx_descs(rx_ring->xsk_pool, xdp,
> > > - rx_desc,
> > > + nb_buffs_extra = ice_fill_rx_descs(xsk_pool, xdp, rx_desc,
> > > rx_ring->count - ntu);
> > > if (nb_buffs_extra != rx_ring->count - ntu) {
> > > ntu += nb_buffs_extra;
> > > @@ -495,7 +498,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > > ice_release_rx_desc(rx_ring, 0);
> > > }
> > >
> > > - nb_buffs = ice_fill_rx_descs(rx_ring->xsk_pool, xdp, rx_desc, count);
> > > + nb_buffs = ice_fill_rx_descs(xsk_pool, xdp, rx_desc, count);
> > >
> > > ntu += nb_buffs;
> > > if (ntu == rx_ring->count)
> > > @@ -511,6 +514,7 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > > /**
> > > * ice_alloc_rx_bufs_zc - allocate a number of Rx buffers
> > > * @rx_ring: Rx ring
> > > + * @xsk_pool: XSK buffer pool to pick buffers to be filled by HW
> > > * @count: The number of buffers to allocate
> > > *
> > > * Wrapper for internal allocation routine; figure out how many tail
> > > @@ -518,7 +522,8 @@ static bool __ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > > *
> > > * Returns true if all calls to internal alloc routine succeeded
> > > */
> > > -bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > > +bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
> > > + struct xsk_buff_pool *xsk_pool, u16 count)
> > > {
> > > u16 rx_thresh = ICE_RING_QUARTER(rx_ring);
> > > u16 leftover, i, tail_bumps;
> > > @@ -527,9 +532,9 @@ bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count)
> > > leftover = count - (tail_bumps * rx_thresh);
> > >
> > > for (i = 0; i < tail_bumps; i++)
> > > - if (!__ice_alloc_rx_bufs_zc(rx_ring, rx_thresh))
> > > + if (!__ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, rx_thresh))
> > > return false;
> > > - return __ice_alloc_rx_bufs_zc(rx_ring, leftover);
> > > + return __ice_alloc_rx_bufs_zc(rx_ring, xsk_pool, leftover);
> > > }
> > >
> > > /**
> > > @@ -650,7 +655,7 @@ static u32 ice_clean_xdp_irq_zc(struct ice_tx_ring *xdp_ring)
> > > if (xdp_ring->next_to_clean >= cnt)
> > > xdp_ring->next_to_clean -= cnt;
> > > if (xsk_frames)
> > > - xsk_tx_completed(xdp_ring->xsk_pool, xsk_frames);
> > > + xsk_tx_completed(READ_ONCE(xdp_ring->xsk_pool), xsk_frames);
> > >
> > > return completed_frames;
> > > }
> > > @@ -702,7 +707,8 @@ static int ice_xmit_xdp_tx_zc(struct xdp_buff *xdp,
> > > dma_addr_t dma;
> > >
> > > dma = xsk_buff_xdp_get_dma(xdp);
> > > - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, size);
> > > + xsk_buff_raw_dma_sync_for_device(READ_ONCE(xdp_ring->xsk_pool),
> > > + dma, size);
> > >
> > > tx_buf->xdp = xdp;
> > > tx_buf->type = ICE_TX_BUF_XSK_TX;
> > > @@ -760,7 +766,8 @@ ice_run_xdp_zc(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> > > err = xdp_do_redirect(rx_ring->netdev, xdp, xdp_prog);
> > > if (!err)
> > > return ICE_XDP_REDIR;
> > > - if (xsk_uses_need_wakeup(rx_ring->xsk_pool) && err == -ENOBUFS)
> > > + if (xsk_uses_need_wakeup(READ_ONCE(rx_ring->xsk_pool)) &&
> > > + err == -ENOBUFS)
> > > result = ICE_XDP_EXIT;
> > > else
> > > result = ICE_XDP_CONSUMED;
> > > @@ -829,8 +836,8 @@ ice_add_xsk_frag(struct ice_rx_ring *rx_ring, struct xdp_buff *first,
> > > */
> > > int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> > > {
> > > + struct xsk_buff_pool *xsk_pool = READ_ONCE(rx_ring->xsk_pool);
> > > unsigned int total_rx_bytes = 0, total_rx_packets = 0;
> > > - struct xsk_buff_pool *xsk_pool = rx_ring->xsk_pool;
> > > u32 ntc = rx_ring->next_to_clean;
> > > u32 ntu = rx_ring->next_to_use;
> > > struct xdp_buff *first = NULL;
> > > @@ -942,7 +949,8 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> > > rx_ring->next_to_clean = ntc;
> > > entries_to_alloc = ICE_RX_DESC_UNUSED(rx_ring);
> > > if (entries_to_alloc > ICE_RING_QUARTER(rx_ring))
> > > - failure |= !ice_alloc_rx_bufs_zc(rx_ring, entries_to_alloc);
> > > + failure |= !ice_alloc_rx_bufs_zc(rx_ring, xsk_pool,
> > > + entries_to_alloc);
> > >
> > > ice_finalize_xdp_rx(xdp_ring, xdp_xmit, 0);
> > > ice_update_rx_ring_stats(rx_ring, total_rx_packets, total_rx_bytes);
> > > @@ -965,17 +973,19 @@ int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget)
> > > /**
> > > * ice_xmit_pkt - produce a single HW Tx descriptor out of AF_XDP descriptor
> > > * @xdp_ring: XDP ring to produce the HW Tx descriptor on
> > > + * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
> > > * @desc: AF_XDP descriptor to pull the DMA address and length from
> > > * @total_bytes: bytes accumulator that will be used for stats update
> > > */
> > > -static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
> > > +static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring,
> > > + struct xsk_buff_pool *xsk_pool, struct xdp_desc *desc,
> > > unsigned int *total_bytes)
> > > {
> > > struct ice_tx_desc *tx_desc;
> > > dma_addr_t dma;
> > >
> > > - dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, desc->addr);
> > > - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, desc->len);
> > > + dma = xsk_buff_raw_get_dma(xsk_pool, desc->addr);
> > > + xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, desc->len);
> > >
> > > tx_desc = ICE_TX_DESC(xdp_ring, xdp_ring->next_to_use++);
> > > tx_desc->buf_addr = cpu_to_le64(dma);
> > > @@ -988,10 +998,13 @@ static void ice_xmit_pkt(struct ice_tx_ring *xdp_ring, struct xdp_desc *desc,
> > > /**
> > > * ice_xmit_pkt_batch - produce a batch of HW Tx descriptors out of AF_XDP descriptors
> > > * @xdp_ring: XDP ring to produce the HW Tx descriptors on
> > > + * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
> > > * @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
> > > * @total_bytes: bytes accumulator that will be used for stats update
> > > */
> > > -static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
> > > +static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring,
> > > + struct xsk_buff_pool *xsk_pool,
> > > + struct xdp_desc *descs,
> > > unsigned int *total_bytes)
> > > {
> > > u16 ntu = xdp_ring->next_to_use;
> > > @@ -1001,8 +1014,8 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
> > > loop_unrolled_for(i = 0; i < PKTS_PER_BATCH; i++) {
> > > dma_addr_t dma;
> > >
> > > - dma = xsk_buff_raw_get_dma(xdp_ring->xsk_pool, descs[i].addr);
> > > - xsk_buff_raw_dma_sync_for_device(xdp_ring->xsk_pool, dma, descs[i].len);
> > > + dma = xsk_buff_raw_get_dma(xsk_pool, descs[i].addr);
> > > + xsk_buff_raw_dma_sync_for_device(xsk_pool, dma, descs[i].len);
> > >
> > > tx_desc = ICE_TX_DESC(xdp_ring, ntu++);
> > > tx_desc->buf_addr = cpu_to_le64(dma);
> > > @@ -1018,21 +1031,24 @@ static void ice_xmit_pkt_batch(struct ice_tx_ring *xdp_ring, struct xdp_desc *de
> > > /**
> > > * ice_fill_tx_hw_ring - produce the number of Tx descriptors onto ring
> > > * @xdp_ring: XDP ring to produce the HW Tx descriptors on
> > > + * @xsk_pool: XSK buffer pool to pick buffers to be consumed by HW
> > > * @descs: AF_XDP descriptors to pull the DMA addresses and lengths from
> > > * @nb_pkts: count of packets to be send
> > > * @total_bytes: bytes accumulator that will be used for stats update
> > > */
> > > -static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *descs,
> > > - u32 nb_pkts, unsigned int *total_bytes)
> > > +static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring,
> > > + struct xsk_buff_pool *xsk_pool,
> > > + struct xdp_desc *descs, u32 nb_pkts,
> > > + unsigned int *total_bytes)
> > > {
> > > u32 batched, leftover, i;
> > >
> > > batched = ALIGN_DOWN(nb_pkts, PKTS_PER_BATCH);
> > > leftover = nb_pkts & (PKTS_PER_BATCH - 1);
> > > for (i = 0; i < batched; i += PKTS_PER_BATCH)
> > > - ice_xmit_pkt_batch(xdp_ring, &descs[i], total_bytes);
> > > + ice_xmit_pkt_batch(xdp_ring, xsk_pool, &descs[i], total_bytes);
> > > for (; i < batched + leftover; i++)
> > > - ice_xmit_pkt(xdp_ring, &descs[i], total_bytes);
> > > + ice_xmit_pkt(xdp_ring, xsk_pool, &descs[i], total_bytes);
> > > }
> > >
> > > /**
> > > @@ -1043,7 +1059,8 @@ static void ice_fill_tx_hw_ring(struct ice_tx_ring *xdp_ring, struct xdp_desc *d
> > > */
> > > bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
> > > {
> > > - struct xdp_desc *descs = xdp_ring->xsk_pool->tx_descs;
> > > + struct xsk_buff_pool *xsk_pool = READ_ONCE(xdp_ring->xsk_pool);
> > > + struct xdp_desc *descs = xsk_pool->tx_descs;
> > > u32 nb_pkts, nb_processed = 0;
> > > unsigned int total_bytes = 0;
> > > int budget;
> > > @@ -1057,25 +1074,26 @@ bool ice_xmit_zc(struct ice_tx_ring *xdp_ring)
> > > budget = ICE_DESC_UNUSED(xdp_ring);
> > > budget = min_t(u16, budget, ICE_RING_QUARTER(xdp_ring));
> > >
> > > - nb_pkts = xsk_tx_peek_release_desc_batch(xdp_ring->xsk_pool, budget);
> > > + nb_pkts = xsk_tx_peek_release_desc_batch(xsk_pool, budget);
> > > if (!nb_pkts)
> > > return true;
> > >
> > > if (xdp_ring->next_to_use + nb_pkts >= xdp_ring->count) {
> > > nb_processed = xdp_ring->count - xdp_ring->next_to_use;
> > > - ice_fill_tx_hw_ring(xdp_ring, descs, nb_processed, &total_bytes);
> > > + ice_fill_tx_hw_ring(xdp_ring, xsk_pool, descs, nb_processed,
> > > + &total_bytes);
> > > xdp_ring->next_to_use = 0;
> > > }
> > >
> > > - ice_fill_tx_hw_ring(xdp_ring, &descs[nb_processed], nb_pkts - nb_processed,
> > > - &total_bytes);
> > > + ice_fill_tx_hw_ring(xdp_ring, xsk_pool, &descs[nb_processed],
> > > + nb_pkts - nb_processed, &total_bytes);
> > >
> > > ice_set_rs_bit(xdp_ring);
> > > ice_xdp_ring_update_tail(xdp_ring);
> > > ice_update_tx_ring_stats(xdp_ring, nb_pkts, total_bytes);
> > >
> > > - if (xsk_uses_need_wakeup(xdp_ring->xsk_pool))
> > > - xsk_set_tx_need_wakeup(xdp_ring->xsk_pool);
> > > + if (xsk_uses_need_wakeup(xsk_pool))
> > > + xsk_set_tx_need_wakeup(xsk_pool);
> > >
> > > return nb_pkts < budget;
> > > }
> > > @@ -1108,7 +1126,7 @@ ice_xsk_wakeup(struct net_device *netdev, u32 queue_id,
> > >
> > > ring = vsi->rx_rings[queue_id]->xdp_ring;
> > >
> > > - if (!ring->xsk_pool)
> > > + if (!READ_ONCE(ring->xsk_pool))
> > > return -EINVAL;
> > >
> > > /* The idea here is that if NAPI is running, mark a miss, so
> > > diff --git a/drivers/net/ethernet/intel/ice/ice_xsk.h b/drivers/net/ethernet/intel/ice/ice_xsk.h
> > > index 6fa181f080ef..4cd2d62a0836 100644
> > > --- a/drivers/net/ethernet/intel/ice/ice_xsk.h
> > > +++ b/drivers/net/ethernet/intel/ice/ice_xsk.h
> > > @@ -22,7 +22,8 @@ int ice_xsk_pool_setup(struct ice_vsi *vsi, struct xsk_buff_pool *pool,
> > > u16 qid);
> > > int ice_clean_rx_irq_zc(struct ice_rx_ring *rx_ring, int budget);
> > > int ice_xsk_wakeup(struct net_device *netdev, u32 queue_id, u32 flags);
> > > -bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring, u16 count);
> > > +bool ice_alloc_rx_bufs_zc(struct ice_rx_ring *rx_ring,
> > > + struct xsk_buff_pool *xsk_pool, u16 count);
> > > bool ice_xsk_any_rx_ring_ena(struct ice_vsi *vsi);
> > > void ice_xsk_clean_rx_ring(struct ice_rx_ring *rx_ring);
> > > void ice_xsk_clean_xdp_ring(struct ice_tx_ring *xdp_ring);
> > > @@ -51,6 +52,7 @@ ice_clean_rx_irq_zc(struct ice_rx_ring __always_unused *rx_ring,
> > >
> > > static inline bool
> > > ice_alloc_rx_bufs_zc(struct ice_rx_ring __always_unused *rx_ring,
> > > + struct xsk_buff_pool __always_unused *xsk_pool,
> > > u16 __always_unused count)
> > > {
> > > return false;
> > > --
> > > 2.34.1
> > >
> > >
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2024-06-04 11:12 UTC | newest]
Thread overview: 23+ messages
2024-05-29 11:23 [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Maciej Fijalkowski
2024-05-29 11:23 ` [PATCH v2 iwl-net 1/8] ice: respect netif readiness in AF_XDP ZC related ndo's Maciej Fijalkowski
2024-06-04 6:47 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 2/8] ice: don't busy wait for Rx queue disable in ice_qp_dis() Maciej Fijalkowski
2024-06-04 6:53 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 3/8] ice: replace synchronize_rcu with synchronize_net Maciej Fijalkowski
2024-05-31 23:49 ` Nelson, Shannon
2024-06-04 10:49 ` Maciej Fijalkowski
2024-06-04 6:49 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 4/8] ice: modify error handling when setting XSK pool in ndo_bpf Maciej Fijalkowski
2024-06-04 6:54 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 5/8] ice: toggle netif_carrier when setting up XSK pool Maciej Fijalkowski
2024-06-04 6:45 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t,r}x_ring::xsk_pool Maciej Fijalkowski
2024-05-31 23:49 ` Nelson, Shannon
2024-06-04 10:52 ` Maciej Fijalkowski
2024-06-04 11:12 ` Maciej Fijalkowski
2024-06-04 6:50 ` [Intel-wired-lan] [PATCH v2 iwl-net 6/8] ice: improve updating ice_{t, r}x_ring::xsk_pool Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 7/8] ice: add missing WRITE_ONCE when clearing ice_rx_ring::xdp_prog Maciej Fijalkowski
2024-06-04 6:52 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-29 11:23 ` [PATCH v2 iwl-net 8/8] ice: xsk: fix txq interrupt mapping Maciej Fijalkowski
2024-06-04 6:46 ` [Intel-wired-lan] " Rout, ChandanX
2024-05-31 23:52 ` [PATCH v2 iwl-net 0/8] ice: fix AF_XDP ZC timeout and concurrency issues Nelson, Shannon