* [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
@ 2026-01-12 13:03 Vivek Behera via Intel-wired-lan
2026-01-12 14:11 ` Paul Menzel
` (4 more replies)
0 siblings, 5 replies; 14+ messages in thread
From: Vivek Behera via Intel-wired-lan @ 2026-01-12 13:03 UTC (permalink / raw)
To: aleksandr.loktionov, jacob.e.keller, anthony.l.nguyen,
przemyslaw.kitszel, maciej.fijalkowski, sriram.yagnaraman, kurt
Cc: intel-wired-lan, vivek.behera
The current implementation in the igb_xsk_wakeup expects the Rx and Tx queues
to share the same irq. This would lead to triggering of incorrect irq
in split irq configuration.
This patch addresses this issue which could impact environments
with 2 active cpu cores
or when the number of queues is reduced to 2 or less
cat /proc/interrupts | grep eno2
167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
0-edge eno2
168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
1-edge eno2-rx-0
169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
2-edge eno2-rx-1
170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
3-edge eno2-tx-0
171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
4-edge eno2-tx-1
Furthermore it uses the flags input argument to trigger either rx, tx or
both rx and tx irqs as specified in the ndo_xsk_wakeup api documentation
Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
---
v1: https://lore.kernel.org/intel-wired-lan/20251212131454.124116-1-vivek.behera@siemens.com/
v2: https://lore.kernel.org/intel-wired-lan/20251215115416.410619-1-vivek.behera@siemens.com/
v3: https://lore.kernel.org/intel-wired-lan/20251220114936.140473-1-vivek.behera@siemens.com/
v4: https://lore.kernel.org/intel-wired-lan/20251222115747.230521-1-vivek.behera@siemens.com/
changelog:
v1
- Inital description of the Bug and fixes made in the patch
v1 -> v2
- Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ configuration
- Review suggestions by Aleksander: Modified sequence to complete all
error checks for rx and tx before updating napi states and triggering irqs
- Corrected trigger of TX and RX interrupts over E1000_ICS (non msix use case)
- Added define for Tx interrupt trigger bit mask for E1000_ICS
v2 -> v3
- Included applicable feedback and suggestions from igc patch
- Fixed logic in updating eics value when both TX and RX need wakeup
v3 -> v4
- Added comments to explain trigerring of both TX and RX with active queue pairs
- Fixed check of xsk pools in if statement
v4 -> v5
- Introduced a simplified logic for sequential check for RX and TX
---
.../net/ethernet/intel/igb/e1000_defines.h | 1 +
drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++----
2 files changed, 61 insertions(+), 15 deletions(-)
diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
index fa028928482f..9357564a2d58 100644
--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
@@ -443,6 +443,7 @@
#define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
#define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min. threshold */
#define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
+#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc written back */
/* Extended Interrupt Cause Set */
/* E1000_EITR_CNT_IGNR is only for 82576 and newer */
diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
index 30ce5fbb5b77..6e51b5b6f131 100644
--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
@@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
struct igb_adapter *adapter = netdev_priv(dev);
struct e1000_hw *hw = &adapter->hw;
struct igb_ring *ring;
+ struct igb_q_vector *q_vector;
+ struct napi_struct *rx_napi;
+ struct napi_struct *tx_napi;
+ bool trigger_irq_tx = false;
+ bool trigger_irq_rx = false;
+ u32 eics_tx = 0;
+ u32 eics_rx = 0;
u32 eics = 0;
if (test_bit(__IGB_DOWN, &adapter->state))
@@ -536,27 +543,65 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
if (!igb_xdp_is_enabled(adapter))
return -EINVAL;
-
- if (qid >= adapter->num_tx_queues)
+ /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
+ if (qid >= adapter->num_rx_queues)
return -EINVAL;
-
- ring = adapter->tx_ring[qid];
-
- if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
- return -ENETDOWN;
-
- if (!READ_ONCE(ring->xsk_pool))
+ /* Check if flags are valid */
+ if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
return -EINVAL;
-
- if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
- /* Cause software interrupt */
+ if (flags & XDP_WAKEUP_RX) {
+ /* IRQ trigger preparation for Rx */
+ ring = adapter->rx_ring[qid];
+ if (!READ_ONCE(ring->xsk_pool))
+ return -ENXIO;
+ q_vector = ring->q_vector;
+ rx_napi = &q_vector->napi;
+ /* Extend the BIT mask for eics */
+ eics_rx = ring->q_vector->eims_value;
+ trigger_irq_rx = true;
+ }
+ if (flags & XDP_WAKEUP_TX) {
+ if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
+ /* In queue-pair mode, rx_ring and tx_ring share the same q_vector,
+ * so a single IRQ trigger will wake both RX and TX processing
+ */
+ } else {
+ /* IRQ trigger preparation for Tx */
+ ring = adapter->tx_ring[qid];
+ if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
+ return -ENETDOWN;
+
+ if (!READ_ONCE(ring->xsk_pool))
+ return -ENXIO;
+ q_vector = ring->q_vector;
+ tx_napi = &q_vector->napi;
+ /* Extend the BIT mask for eics */
+ eics_tx = ring->q_vector->eims_value;
+ trigger_irq_tx = true;
+ }
+ }
+ /* All error checks are finished. Check and update napi states for rx and tx */
+ if (trigger_irq_rx) {
+ if (!napi_if_scheduled_mark_missed(rx_napi))
+ eics |= eics_rx;
+ }
+ if (trigger_irq_tx) {
+ if (!napi_if_scheduled_mark_missed(tx_napi))
+ eics |= eics_tx;
+ }
+ /* Now we trigger the required irqs for Rx and Tx */
+ if ((trigger_irq_rx) || (trigger_irq_tx)) {
if (adapter->flags & IGB_FLAG_HAS_MSIX) {
- eics |= ring->q_vector->eims_value;
wr32(E1000_EICS, eics);
} else {
- wr32(E1000_ICS, E1000_ICS_RXDMT0);
+ if ((trigger_irq_rx) && (trigger_irq_tx))
+ wr32(E1000_ICS,
+ E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
+ else if (trigger_irq_rx)
+ wr32(E1000_ICS, E1000_ICS_RXDMT0);
+ else
+ wr32(E1000_ICS, E1000_ICS_TXDW);
}
}
-
return 0;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-12 13:03 [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup Vivek Behera via Intel-wired-lan
@ 2026-01-12 14:11 ` Paul Menzel
2026-01-13 6:58 ` Behera, VIVEK via Intel-wired-lan
2026-01-12 14:53 ` Kwapulinski, Piotr
` (3 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Paul Menzel @ 2026-01-12 14:11 UTC (permalink / raw)
To: Vivek Behera
Cc: aleksandr.loktionov, jacob.e.keller, anthony.l.nguyen,
przemyslaw.kitszel, maciej.fijalkowski, sriram.yagnaraman, kurt,
intel-wired-lan
Dear Vivek,
Thank you for your patch. Some minor comments below.
Am 12.01.26 um 14:03 schrieb Vivek Behera via Intel-wired-lan:
> The current implementation in the igb_xsk_wakeup expects the Rx and Tx queues
Please re-flow for 75 characters per line.
> to share the same irq. This would lead to triggering of incorrect irq
> in split irq configuration.
> This patch addresses this issue which could impact environments
> with 2 active cpu cores
> or when the number of queues is reduced to 2 or less
Why break the line in the middle of the sentence?
> cat /proc/interrupts | grep eno2
> 167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 0-edge eno2
> 168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 1-edge eno2-rx-0
> 169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 2-edge eno2-rx-1
> 170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 3-edge eno2-tx-0
> 171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 4-edge eno2-tx-1
>
> Furthermore it uses the flags input argument to trigger either rx, tx or
> both rx and tx irqs as specified in the ndo_xsk_wakeup api documentation
Please add a dot/period at the end of sentences.
> Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
> v1: https://lore.kernel.org/intel-wired-lan/20251212131454.124116-1-vivek.behera@siemens.com/
> v2: https://lore.kernel.org/intel-wired-lan/20251215115416.410619-1-vivek.behera@siemens.com/
> v3: https://lore.kernel.org/intel-wired-lan/20251220114936.140473-1-vivek.behera@siemens.com/
> v4: https://lore.kernel.org/intel-wired-lan/20251222115747.230521-1-vivek.behera@siemens.com/
>
> changelog:
> v1
> - Inital description of the Bug and fixes made in the patch
Initial
>
> v1 -> v2
> - Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ configuration
> - Review suggestions by Aleksander: Modified sequence to complete all
> error checks for rx and tx before updating napi states and triggering irqs
> - Corrected trigger of TX and RX interrupts over E1000_ICS (non msix use case)
> - Added define for Tx interrupt trigger bit mask for E1000_ICS
>
> v2 -> v3
> - Included applicable feedback and suggestions from igc patch
> - Fixed logic in updating eics value when both TX and RX need wakeup
>
> v3 -> v4
> - Added comments to explain trigerring of both TX and RX with active queue pairs
> - Fixed check of xsk pools in if statement
>
> v4 -> v5
> - Introduced a simplified logic for sequential check for RX and TX
> ---
> .../net/ethernet/intel/igb/e1000_defines.h | 1 +
> drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++----
> 2 files changed, 61 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
> index fa028928482f..9357564a2d58 100644
> --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> @@ -443,6 +443,7 @@
> #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min. threshold */
> #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
> +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc written back */
>
> /* Extended Interrupt Cause Set */
> /* E1000_EITR_CNT_IGNR is only for 82576 and newer */
> diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..6e51b5b6f131 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> struct igb_adapter *adapter = netdev_priv(dev);
> struct e1000_hw *hw = &adapter->hw;
> struct igb_ring *ring;
> + struct igb_q_vector *q_vector;
> + struct napi_struct *rx_napi;
> + struct napi_struct *tx_napi;
> + bool trigger_irq_tx = false;
> + bool trigger_irq_rx = false;
> + u32 eics_tx = 0;
> + u32 eics_rx = 0;
> u32 eics = 0;
>
> if (test_bit(__IGB_DOWN, &adapter->state))
> @@ -536,27 +543,65 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
>
> if (!igb_xdp_is_enabled(adapter))
> return -EINVAL;
> -
Why remove the blank line.
> - if (qid >= adapter->num_tx_queues)
> + /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
> + if (qid >= adapter->num_rx_queues)
> return -EINVAL;
> -
> - ring = adapter->tx_ring[qid];
> -
> - if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> - return -ENETDOWN;
> -
> - if (!READ_ONCE(ring->xsk_pool))
> + /* Check if flags are valid */
> + if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> return -EINVAL;
The comment seems redundant.
> -
> - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> - /* Cause software interrupt */
> + if (flags & XDP_WAKEUP_RX) {
> + /* IRQ trigger preparation for Rx */
> + ring = adapter->rx_ring[qid];
> + if (!READ_ONCE(ring->xsk_pool))
> + return -ENXIO;
> + q_vector = ring->q_vector;
> + rx_napi = &q_vector->napi;
> + /* Extend the BIT mask for eics */
> + eics_rx = ring->q_vector->eims_value;
> + trigger_irq_rx = true;
> + }
> + if (flags & XDP_WAKEUP_TX) {
> + if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> + /* In queue-pair mode, rx_ring and tx_ring share the same q_vector,
> + * so a single IRQ trigger will wake both RX and TX processing
> + */
> + } else {
> + /* IRQ trigger preparation for Tx */
> + ring = adapter->tx_ring[qid];
> + if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> + return -ENETDOWN;
> +
> + if (!READ_ONCE(ring->xsk_pool))
> + return -ENXIO;
> + q_vector = ring->q_vector;
> + tx_napi = &q_vector->napi;
> + /* Extend the BIT mask for eics */
> + eics_tx = ring->q_vector->eims_value;
> + trigger_irq_tx = true;
> + }
> + }
> + /* All error checks are finished. Check and update napi states for rx and tx */
> + if (trigger_irq_rx) {
> + if (!napi_if_scheduled_mark_missed(rx_napi))
> + eics |= eics_rx;
> + }
> + if (trigger_irq_tx) {
> + if (!napi_if_scheduled_mark_missed(tx_napi))
> + eics |= eics_tx;
> + }
> + /* Now we trigger the required irqs for Rx and Tx */
> + if ((trigger_irq_rx) || (trigger_irq_tx)) {
> if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> - eics |= ring->q_vector->eims_value;
> wr32(E1000_EICS, eics);
> } else {
> - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> + if ((trigger_irq_rx) && (trigger_irq_tx))
> + wr32(E1000_ICS,
> + E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> + else if (trigger_irq_rx)
> + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> + else
> + wr32(E1000_ICS, E1000_ICS_TXDW);
> }
> }
> -
The removal of the blank line is unrelated. It looks like you divert
from the coding style of this file. I’d suggest to avoid that.
> return 0;
> }
Kind regards,
Paul
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-12 13:03 [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup Vivek Behera via Intel-wired-lan
2026-01-12 14:11 ` Paul Menzel
@ 2026-01-12 14:53 ` Kwapulinski, Piotr
2026-01-13 6:55 ` Behera, VIVEK via Intel-wired-lan
2026-01-13 4:04 ` Behera, VIVEK via Intel-wired-lan
` (2 subsequent siblings)
4 siblings, 1 reply; 14+ messages in thread
From: Kwapulinski, Piotr @ 2026-01-12 14:53 UTC (permalink / raw)
To: Behera, Vivek, Loktionov, Aleksandr, Keller, Jacob E,
Nguyen, Anthony L, Kitszel, Przemyslaw, Fijalkowski, Maciej,
sriram.yagnaraman@ericsson.com, kurt@linutronix.de
Cc: intel-wired-lan@lists.osuosl.org
>-----Original Message-----
>From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Vivek Behera via Intel-wired-lan
>Sent: Monday, January 12, 2026 2:04 PM
>To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Keller, Jacob E <jacob.e.keller@intel.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; Fijalkowski, Maciej <maciej.fijalkowski@intel.com>; sriram.yagnaraman@ericsson.com; kurt@linutronix.de
>Cc: intel-wired-lan@lists.osuosl.org; Behera, Vivek <vivek.behera@siemens.com>
>Subject: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
>
>The current implementation in the igb_xsk_wakeup expects the Rx and Tx queues to share the same irq. This would lead to triggering of incorrect irq in split irq configuration.
>This patch addresses this issue which could impact environments with 2 active cpu cores or when the number of queues is reduced to 2 or less
>
>cat /proc/interrupts | grep eno2
> 167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 0-edge eno2
> 168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 1-edge eno2-rx-0
> 169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 2-edge eno2-rx-1
> 170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 3-edge eno2-tx-0
> 171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 4-edge eno2-tx-1
>
>Furthermore it uses the flags input argument to trigger either rx, tx or both rx and tx irqs as specified in the ndo_xsk_wakeup api documentation
>
>Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
>Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
>Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
>---
>v1: https://lore.kernel.org/intel-wired-lan/20251212131454.124116-1-vivek.behera@siemens.com/
>v2: https://lore.kernel.org/intel-wired-lan/20251215115416.410619-1-vivek.behera@siemens.com/
>v3: https://lore.kernel.org/intel-wired-lan/20251220114936.140473-1-vivek.behera@siemens.com/
>v4: https://lore.kernel.org/intel-wired-lan/20251222115747.230521-1-vivek.behera@siemens.com/
>
>changelog:
>v1
>- Inital description of the Bug and fixes made in the patch
>
>v1 -> v2
>- Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ configuration
>- Review suggestions by Aleksander: Modified sequence to complete all
> error checks for rx and tx before updating napi states and triggering irqs
>- Corrected trigger of TX and RX interrupts over E1000_ICS (non msix use case)
>- Added define for Tx interrupt trigger bit mask for E1000_ICS
>
>v2 -> v3
>- Included applicable feedback and suggestions from igc patch
>- Fixed logic in updating eics value when both TX and RX need wakeup
>
>v3 -> v4
>- Added comments to explain trigerring of both TX and RX with active queue pairs
>- Fixed check of xsk pools in if statement
>
>v4 -> v5
>- Introduced a simplified logic for sequential check for RX and TX
>---
> .../net/ethernet/intel/igb/e1000_defines.h | 1 +
> drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++----
> 2 files changed, 61 insertions(+), 15 deletions(-)
>
>diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
>index fa028928482f..9357564a2d58 100644
>--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
>+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
>@@ -443,6 +443,7 @@
> #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min. threshold */
> #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
>+#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc written back */
>
> /* Extended Interrupt Cause Set */
> /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
>index 30ce5fbb5b77..6e51b5b6f131 100644
>--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
>+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
>@@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> struct igb_adapter *adapter = netdev_priv(dev);
> struct e1000_hw *hw = &adapter->hw;
> struct igb_ring *ring;
>+ struct igb_q_vector *q_vector;
>+ struct napi_struct *rx_napi;
>+ struct napi_struct *tx_napi;
Please merge into a single line
>+ bool trigger_irq_tx = false;
>+ bool trigger_irq_rx = false;
Please merge into a single line
>+ u32 eics_tx = 0;
>+ u32 eics_rx = 0;
> u32 eics = 0;
Please merge into a single line
>
> if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
>
> if (!igb_xdp_is_enabled(adapter))
> return -EINVAL;
>-
>- if (qid >= adapter->num_tx_queues)
>+ /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
>+ if (qid >= adapter->num_rx_queues)
> return -EINVAL;
>-
>- ring = adapter->tx_ring[qid];
>-
>- if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
>- return -ENETDOWN;
>-
>- if (!READ_ONCE(ring->xsk_pool))
>+ /* Check if flags are valid */
>+ if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> return -EINVAL;
>-
>- if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
>- /* Cause software interrupt */
>+ if (flags & XDP_WAKEUP_RX) {
>+ /* IRQ trigger preparation for Rx */
>+ ring = adapter->rx_ring[qid];
>+ if (!READ_ONCE(ring->xsk_pool))
>+ return -ENXIO;
>+ q_vector = ring->q_vector;
>+ rx_napi = &q_vector->napi;
>+ /* Extend the BIT mask for eics */
>+ eics_rx = ring->q_vector->eims_value;
>+ trigger_irq_rx = true;
>+ }
>+ if (flags & XDP_WAKEUP_TX) {
>+ if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
>+ /* In queue-pair mode, rx_ring and tx_ring share the same q_vector,
>+ * so a single IRQ trigger will wake both RX and TX processing
>+ */
>+ } else {
>+ /* IRQ trigger preparation for Tx */
>+ ring = adapter->tx_ring[qid];
>+ if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
>+ return -ENETDOWN;
>+
>+ if (!READ_ONCE(ring->xsk_pool))
>+ return -ENXIO;
>+ q_vector = ring->q_vector;
>+ tx_napi = &q_vector->napi;
>+ /* Extend the BIT mask for eics */
>+ eics_tx = ring->q_vector->eims_value;
>+ trigger_irq_tx = true;
>+ }
>+ }
>+ /* All error checks are finished. Check and update napi states for rx and tx */
>+ if (trigger_irq_rx) {
>+ if (!napi_if_scheduled_mark_missed(rx_napi))
>+ eics |= eics_rx;
>+ }
Please refactor as "if (cond1 && cond2)"
>+ if (trigger_irq_tx) {
>+ if (!napi_if_scheduled_mark_missed(tx_napi))
>+ eics |= eics_tx;
>+ }
Please refactor as "if (cond1 && cond2)"
Piotr
[...]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-12 13:03 [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup Vivek Behera via Intel-wired-lan
2026-01-12 14:11 ` Paul Menzel
2026-01-12 14:53 ` Kwapulinski, Piotr
@ 2026-01-13 4:04 ` Behera, VIVEK via Intel-wired-lan
2026-01-13 9:00 ` Loktionov, Aleksandr
2026-01-13 11:41 ` Maciej Fijalkowski
4 siblings, 0 replies; 14+ messages in thread
From: Behera, VIVEK via Intel-wired-lan @ 2026-01-13 4:04 UTC (permalink / raw)
To: aleksandr.loktionov@intel.com, jacob.e.keller@intel.com,
anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com,
maciej.fijalkowski@intel.com, sriram.yagnaraman@ericsson.com,
kurt@linutronix.de
Cc: intel-wired-lan@lists.osuosl.org
> -----Original Message-----
> From: Behera, Vivek (DI FA DSP ICC PRC1) <vivek.behera@siemens.com>
> Sent: Monday, January 12, 2026 2:04 PM
> To: aleksandr.loktionov@intel.com; jacob.e.keller@intel.com;
> anthony.l.nguyen@intel.com; przemyslaw.kitszel@intel.com;
> maciej.fijalkowski@intel.com; sriram.yagnaraman@ericsson.com;
> kurt@linutronix.de
> Cc: intel-wired-lan@lists.osuosl.org; Behera, Vivek (DI FA DSP ICC PRC1)
> <vivek.behera@siemens.com>
> Subject: [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
>
> The current implementation in the igb_xsk_wakeup expects the Rx and Tx queues
> to share the same irq. This would lead to triggering of incorrect irq in split irq
> configuration.
> This patch addresses this issue which could impact environments with 2 active
> cpu cores or when the number of queues is reduced to 2 or less
>
> cat /proc/interrupts | grep eno2
> 167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 0-edge eno2
> 168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 1-edge eno2-rx-0
> 169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 2-edge eno2-rx-1
> 170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 3-edge eno2-tx-0
> 171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 4-edge eno2-tx-1
>
> Furthermore it uses the flags input argument to trigger either rx, tx or both rx and
> tx irqs as specified in the ndo_xsk_wakeup api documentation
>
> Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
> v1:
> https://lore.kernel.o/
> rg%2Fintel-wired-lan%2F20251212131454.124116-1-
> vivek.behera%40siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens
> .com%7C195e9677fcd9464e392c08de51db0d32%7C38ae3bcd95794fd4addab42e1
> 495d55a%7C1%7C0%7C639038198405683495%7CUnknown%7CTWFpbGZsb3d8
> eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoi
> TWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=rSn9N%2B9xaRiZoE5
> mjETCe2iG%2Buu4CfCnjwsJ8lIgLSg%3D&reserved=0
> v2:
> https://lore.kernel.o/
> rg%2Fintel-wired-lan%2F20251215115416.410619-1-
> vivek.behera%40siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens
> .com%7C195e9677fcd9464e392c08de51db0d32%7C38ae3bcd95794fd4addab42e1
> 495d55a%7C1%7C0%7C639038198405727630%7CUnknown%7CTWFpbGZsb3d8
> eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoi
> TWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=uC83YhZDYrzPPMfbbk
> b06AEgsP%2Bqqre4Ht5TIZT46vI%3D&reserved=0
> v3:
> https://lore.kernel.o/
> rg%2Fintel-wired-lan%2F20251220114936.140473-1-
> vivek.behera%40siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens
> .com%7C195e9677fcd9464e392c08de51db0d32%7C38ae3bcd95794fd4addab42e1
> 495d55a%7C1%7C0%7C639038198405760767%7CUnknown%7CTWFpbGZsb3d8
> eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoi
> TWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=ZYaDd3oMihS3FXabs
> NAMHTh0cVHxErFXsZ7V%2BK74TzY%3D&reserved=0
> v4:
> https://lore.kernel.o/
> rg%2Fintel-wired-lan%2F20251222115747.230521-1-
> vivek.behera%40siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens
> .com%7C195e9677fcd9464e392c08de51db0d32%7C38ae3bcd95794fd4addab42e1
> 495d55a%7C1%7C0%7C639038198405783856%7CUnknown%7CTWFpbGZsb3d8
> eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoi
> TWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=HYb8WkqgdiGLkXASa
> pmMQKvnFXAFwYFEErxQHAkSPXI%3D&reserved=0
>
> changelog:
> v1
> - Inital description of the Bug and fixes made in the patch
>
> v1 -> v2
> - Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ configuration
> - Review suggestions by Aleksander: Modified sequence to complete all
> error checks for rx and tx before updating napi states and triggering irqs
> - Corrected trigger of TX and RX interrupts over E1000_ICS (non msix use case)
> - Added define for Tx interrupt trigger bit mask for E1000_ICS
>
> v2 -> v3
> - Included applicable feedback and suggestions from igc patch
> - Fixed logic in updating eics value when both TX and RX need wakeup
>
> v3 -> v4
> - Added comments to explain trigerring of both TX and RX with active queue pairs
> - Fixed check of xsk pools in if statement
>
> v4 -> v5
> - Introduced a simplified logic for sequential check for RX and TX
> ---
> .../net/ethernet/intel/igb/e1000_defines.h | 1 +
> drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++----
> 2 files changed, 61 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> b/drivers/net/ethernet/intel/igb/e1000_defines.h
> index fa028928482f..9357564a2d58 100644
> --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> @@ -443,6 +443,7 @@
> #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min. threshold
> */
> #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
> +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc written
> back */
>
> /* Extended Interrupt Cause Set */
> /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..6e51b5b6f131 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid,
> u32 flags)
> struct igb_adapter *adapter = netdev_priv(dev);
> struct e1000_hw *hw = &adapter->hw;
> struct igb_ring *ring;
> + struct igb_q_vector *q_vector;
> + struct napi_struct *rx_napi;
> + struct napi_struct *tx_napi;
> + bool trigger_irq_tx = false;
> + bool trigger_irq_rx = false;
> + u32 eics_tx = 0;
> + u32 eics_rx = 0;
> u32 eics = 0;
>
> if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65 @@
> int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
>
> if (!igb_xdp_is_enabled(adapter))
> return -EINVAL;
> -
> - if (qid >= adapter->num_tx_queues)
> + /* Check if queue_id is valid. Tx and Rx queue numbers are always same
> */
> + if (qid >= adapter->num_rx_queues)
> return -EINVAL;
> -
> - ring = adapter->tx_ring[qid];
> -
> - if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> - return -ENETDOWN;
> -
> - if (!READ_ONCE(ring->xsk_pool))
> + /* Check if flags are valid */
> + if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> return -EINVAL;
> -
> - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> - /* Cause software interrupt */
> + if (flags & XDP_WAKEUP_RX) {
> + /* IRQ trigger preparation for Rx */
> + ring = adapter->rx_ring[qid];
> + if (!READ_ONCE(ring->xsk_pool))
> + return -ENXIO;
> + q_vector = ring->q_vector;
> + rx_napi = &q_vector->napi;
> + /* Extend the BIT mask for eics */
> + eics_rx = ring->q_vector->eims_value;
> + trigger_irq_rx = true;
> + }
> + if (flags & XDP_WAKEUP_TX) {
> + if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
I just realized this has to be changed to if ((adapter->flags & IGB_FLAG_QUEUE_PAIRS) && (flags & XDP_WAKEUP_RX))
Otherwise it will never execute the part needed for TX. I will fix this in the next version
> + /* In queue-pair mode, rx_ring and tx_ring share the same
> q_vector,
> + * so a single IRQ trigger will wake both RX and TX processing
> + */
> + } else {
> + /* IRQ trigger preparation for Tx */
> + ring = adapter->tx_ring[qid];
> + if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring-
> >flags))
> + return -ENETDOWN;
> +
> + if (!READ_ONCE(ring->xsk_pool))
> + return -ENXIO;
> + q_vector = ring->q_vector;
> + tx_napi = &q_vector->napi;
> + /* Extend the BIT mask for eics */
> + eics_tx = ring->q_vector->eims_value;
> + trigger_irq_tx = true;
> + }
> + }
> + /* All error checks are finished. Check and update napi states for rx and tx
> */
> + if (trigger_irq_rx) {
> + if (!napi_if_scheduled_mark_missed(rx_napi))
> + eics |= eics_rx;
> + }
> + if (trigger_irq_tx) {
> + if (!napi_if_scheduled_mark_missed(tx_napi))
> + eics |= eics_tx;
> + }
> + /* Now we trigger the required irqs for Rx and Tx */
> + if ((trigger_irq_rx) || (trigger_irq_tx)) {
> if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> - eics |= ring->q_vector->eims_value;
> wr32(E1000_EICS, eics);
> } else {
> - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> + if ((trigger_irq_rx) && (trigger_irq_tx))
> + wr32(E1000_ICS,
> + E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> + else if (trigger_irq_rx)
> + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> + else
> + wr32(E1000_ICS, E1000_ICS_TXDW);
> }
> }
> -
> return 0;
> }
> --
> 2.34.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-12 14:53 ` Kwapulinski, Piotr
@ 2026-01-13 6:55 ` Behera, VIVEK via Intel-wired-lan
0 siblings, 0 replies; 14+ messages in thread
From: Behera, VIVEK via Intel-wired-lan @ 2026-01-13 6:55 UTC (permalink / raw)
To: Kwapulinski, Piotr, Loktionov, Aleksandr, Keller, Jacob E,
Nguyen, Anthony L, Kitszel, Przemyslaw, Fijalkowski, Maciej,
sriram.yagnaraman@ericsson.com, kurt@linutronix.de
Cc: intel-wired-lan@lists.osuosl.org
> -----Ursprüngliche Nachricht-----
> Von: Kwapulinski, Piotr <piotr.kwapulinski@intel.com>
> Gesendet: Montag, 12. Januar 2026 15:54
> An: Behera, Vivek (DI FA DSP ICC PRC1) <vivek.behera@siemens.com>;
> Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>;
> Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; sriram.yagnaraman@ericsson.com;
> kurt@linutronix.de
> Cc: intel-wired-lan@lists.osuosl.org
> Betreff: RE: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in
> igb_xsk_wakeup
>
> >-----Original Message-----
> >From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Vivek
> Behera via Intel-wired-lan
> >Sent: Monday, January 12, 2026 2:04 PM
> >To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Keller, Jacob E
> <jacob.e.keller@intel.com>; Nguyen, Anthony L <anthony.l.nguyen@intel.com>;
> Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; sriram.yagnaraman@ericsson.com;
> kurt@linutronix.de
> >Cc: intel-wired-lan@lists.osuosl.org; Behera, Vivek <vivek.behera@siemens.com>
> >Subject: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in
> igb_xsk_wakeup
> >
> >The current implementation in the igb_xsk_wakeup expects the Rx and Tx queues
> to share the same irq. This would lead to triggering of incorrect irq in split irq
> configuration.
> >This patch addresses this issue which could impact environments with 2 active cpu
> cores or when the number of queues is reduced to 2 or less
> >
> >cat /proc/interrupts | grep eno2
> > 167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 0-edge eno2
> > 168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 1-edge eno2-rx-0
> > 169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 2-edge eno2-rx-1
> > 170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 3-edge eno2-tx-0
> > 171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 4-edge eno2-tx-1
> >
> >Furthermore it uses the flags input argument to trigger either rx, tx or both rx and tx
> irqs as specified in the ndo_xsk_wakeup api documentation
> >
> >Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> >Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
> >Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> >---
> >v1:
> https://lore.kernel.org/
> %2Fintel-wired-lan%2F20251212131454.124116-1-
> vivek.behera%40siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.c
> om%7Ccb7787910aa7418c604008de51ea6e02%7C38ae3bcd95794fd4addab42e1495
> d55a%7C1%7C0%7C639038264488251701%7CUnknown%7CTWFpbGZsb3d8eyJFb
> XB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbC
> IsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=MQHlgMmrbsKJOnjBXpN3r0XDa2
> omFxwgw9eJrfypeyU%3D&reserved=0
> >v2:
> https://lore.kernel.org/
> %2Fintel-wired-lan%2F20251215115416.410619-1-
> vivek.behera%40siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.c
> om%7Ccb7787910aa7418c604008de51ea6e02%7C38ae3bcd95794fd4addab42e1495
> d55a%7C1%7C0%7C639038264488322888%7CUnknown%7CTWFpbGZsb3d8eyJFb
> XB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbC
> IsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=6bf2asonyRHwDTgMKdsrtuuprEn
> oDSC8FH9lXaNOmcc%3D&reserved=0
> >v3:
> https://lore.kernel.org/
> %2Fintel-wired-lan%2F20251220114936.140473-1-
> vivek.behera%40siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.c
> om%7Ccb7787910aa7418c604008de51ea6e02%7C38ae3bcd95794fd4addab42e1495
> d55a%7C1%7C0%7C639038264488406758%7CUnknown%7CTWFpbGZsb3d8eyJFb
> XB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbC
> IsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=kWSBkyPnmfekbtpSw2pbD2OMf
> 7auXWIfPWRArZuFK98%3D&reserved=0
> >v4:
> https://lore.kernel.org/
> %2Fintel-wired-lan%2F20251222115747.230521-1-
> vivek.behera%40siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.c
> om%7Ccb7787910aa7418c604008de51ea6e02%7C38ae3bcd95794fd4addab42e1495
> d55a%7C1%7C0%7C639038264488464388%7CUnknown%7CTWFpbGZsb3d8eyJFb
> XB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbC
> IsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=NIanHqowaMuUfVsgYDEDTkY5rt
> RfD8Aj6Tnk1O6aN14%3D&reserved=0
> >
> >changelog:
> >v1
> >- Inital description of the Bug and fixes made in the patch
> >
> >v1 -> v2
> >- Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ configuration
> >- Review suggestions by Aleksander: Modified sequence to complete all
> > error checks for rx and tx before updating napi states and triggering irqs
> >- Corrected trigger of TX and RX interrupts over E1000_ICS (non msix use case)
> >- Added define for Tx interrupt trigger bit mask for E1000_ICS
> >
> >v2 -> v3
> >- Included applicable feedback and suggestions from igc patch
> >- Fixed logic in updating eics value when both TX and RX need wakeup
> >
> >v3 -> v4
> >- Added comments to explain trigerring of both TX and RX with active queue pairs
> >- Fixed check of xsk pools in if statement
> >
> >v4 -> v5
> >- Introduced a simplified logic for sequential check for RX and TX
> >---
> > .../net/ethernet/intel/igb/e1000_defines.h | 1 +
> > drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++----
> > 2 files changed, 61 insertions(+), 15 deletions(-)
> >
> >diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> b/drivers/net/ethernet/intel/igb/e1000_defines.h
> >index fa028928482f..9357564a2d58 100644
> >--- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> >+++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> >@@ -443,6 +443,7 @@
> > #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> > #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min. threshold
> */
> > #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
> >+#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc written
> back */
> >
> > /* Extended Interrupt Cause Set */
> > /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
> >index 30ce5fbb5b77..6e51b5b6f131 100644
> >--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> >+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> >@@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32
> flags)
> > struct igb_adapter *adapter = netdev_priv(dev);
> > struct e1000_hw *hw = &adapter->hw;
> > struct igb_ring *ring;
> >+ struct igb_q_vector *q_vector;
> >+ struct napi_struct *rx_napi;
> >+ struct napi_struct *tx_napi;
> Please merge into a single line
>
Okay. I will include this in the next version
> >+ bool trigger_irq_tx = false;
> >+ bool trigger_irq_rx = false;
> Please merge into a single line
Okay. I will include this in the next version
>
> >+ u32 eics_tx = 0;
> >+ u32 eics_rx = 0;
> > u32 eics = 0;
> Please merge into a single line
>
Okay. I will include this in the next version
> >
> > if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65 @@ int
> igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> >
> > if (!igb_xdp_is_enabled(adapter))
> > return -EINVAL;
> >-
> >- if (qid >= adapter->num_tx_queues)
> >+ /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
> >+ if (qid >= adapter->num_rx_queues)
> > return -EINVAL;
> >-
> >- ring = adapter->tx_ring[qid];
> >-
> >- if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> >- return -ENETDOWN;
> >-
> >- if (!READ_ONCE(ring->xsk_pool))
> >+ /* Check if flags are valid */
> >+ if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> > return -EINVAL;
> >-
> >- if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> >- /* Cause software interrupt */
> >+ if (flags & XDP_WAKEUP_RX) {
> >+ /* IRQ trigger preparation for Rx */
> >+ ring = adapter->rx_ring[qid];
> >+ if (!READ_ONCE(ring->xsk_pool))
> >+ return -ENXIO;
> >+ q_vector = ring->q_vector;
> >+ rx_napi = &q_vector->napi;
> >+ /* Extend the BIT mask for eics */
> >+ eics_rx = ring->q_vector->eims_value;
> >+ trigger_irq_rx = true;
> >+ }
> >+ if (flags & XDP_WAKEUP_TX) {
> >+ if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> >+ /* In queue-pair mode, rx_ring and tx_ring share the same q_vector,
> >+ * so a single IRQ trigger will wake both RX and TX processing
> >+ */
> >+ } else {
> >+ /* IRQ trigger preparation for Tx */
> >+ ring = adapter->tx_ring[qid];
> >+ if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> >+ return -ENETDOWN;
> >+
> >+ if (!READ_ONCE(ring->xsk_pool))
> >+ return -ENXIO;
> >+ q_vector = ring->q_vector;
> >+ tx_napi = &q_vector->napi;
> >+ /* Extend the BIT mask for eics */
> >+ eics_tx = ring->q_vector->eims_value;
> >+ trigger_irq_tx = true;
> >+ }
> >+ }
> >+ /* All error checks are finished. Check and update napi states for rx and tx */
> >+ if (trigger_irq_rx) {
> >+ if (!napi_if_scheduled_mark_missed(rx_napi))
> >+ eics |= eics_rx;
> >+ }
> Please refactor as "if (cond1 && cond2)"
Okay. I will include this in the next version
>
> >+ if (trigger_irq_tx) {
> >+ if (!napi_if_scheduled_mark_missed(tx_napi))
> >+ eics |= eics_tx;
> >+ }
> Please refactor as "if (cond1 && cond2)"
> Piotr
>
> [...]
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-12 14:11 ` Paul Menzel
@ 2026-01-13 6:58 ` Behera, VIVEK via Intel-wired-lan
0 siblings, 0 replies; 14+ messages in thread
From: Behera, VIVEK via Intel-wired-lan @ 2026-01-13 6:58 UTC (permalink / raw)
To: Paul Menzel
Cc: aleksandr.loktionov@intel.com, jacob.e.keller@intel.com,
anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com,
maciej.fijalkowski@intel.com, sriram.yagnaraman@ericsson.com,
kurt@linutronix.de, intel-wired-lan@lists.osuosl.org
Hi Paul,
Thanks for your feedback. I will include your suggestions in the next version of the patch
> -----Ursprüngliche Nachricht-----
> Von: Paul Menzel <pmenzel@molgen.mpg.de>
> Gesendet: Montag, 12. Januar 2026 15:12
> An: Behera, Vivek (DI FA DSP ICC PRC1) <vivek.behera@siemens.com>
> Cc: aleksandr.loktionov@intel.com; jacob.e.keller@intel.com;
> anthony.l.nguyen@intel.com; przemyslaw.kitszel@intel.com;
> maciej.fijalkowski@intel.com; sriram.yagnaraman@ericsson.com;
> kurt@linutronix.de; intel-wired-lan@lists.osuosl.org
> Betreff: Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in
> igb_xsk_wakeup
>
> Dear Vivek,
>
>
> Thank you for your patch. Some minor comments below.
>
> Am 12.01.26 um 14:03 schrieb Vivek Behera via Intel-wired-lan:
> > The current implementation in the igb_xsk_wakeup expects the Rx and Tx
> > queues
>
> Please re-flow for 75 characters per line.
>
> > to share the same irq. This would lead to triggering of incorrect irq
> > in split irq configuration.
> > This patch addresses this issue which could impact environments with 2
> > active cpu cores or when the number of queues is reduced to 2 or less
>
> Why break the line in the middle of the sentence?
>
> > cat /proc/interrupts | grep eno2
> > 167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 0-edge eno2
> > 168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 1-edge eno2-rx-0
> > 169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 2-edge eno2-rx-1
> > 170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 3-edge eno2-tx-0
> > 171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 4-edge eno2-tx-1
> >
> > Furthermore it uses the flags input argument to trigger either rx, tx
> > or both rx and tx irqs as specified in the ndo_xsk_wakeup api
> > documentation
>
> Please add a dot/period at the end of sentences.
>
> > Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> > Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
> > Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> > ---
> > v1:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251212131454.124116-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7C96def2f
> e2d
> >
> 6c49e4144208de51e49371%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%7
> C63
> >
> 9038239346073271%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydW
> UsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7
> C
> >
> %7C%7C&sdata=WjSv74wZk60k3930VLYz2L2C8jAWfWVLWHyqBuoIiCQ%3D&rese
> rved=0
> > v2:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251215115416.410619-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7C96def2f
> e2d
> >
> 6c49e4144208de51e49371%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%7
> C63
> >
> 9038239346119542%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydW
> UsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7
> C
> >
> %7C%7C&sdata=d6BslMcsbOzhMJ7mPhiO2%2B1voZ1pUEDtlt5IEcEDiXQ%3D&re
> served
> > =0
> > v3:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251220114936.140473-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7C96def2f
> e2d
> >
> 6c49e4144208de51e49371%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%7
> C63
> >
> 9038239346160501%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydW
> UsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7
> C
> >
> %7C%7C&sdata=vTKHOwGEYTxFTxKLD1HSmD6r88cQI8SB9KBexi128HA%3D&re
> served=0
> > v4:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251222115747.230521-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7C96def2f
> e2d
> >
> 6c49e4144208de51e49371%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%7
> C63
> >
> 9038239346202115%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydW
> UsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7
> C
> >
> %7C%7C&sdata=FqWB94I9hDxPnQy3OLmkzv7WZrG80fpOyW2saoMDDOM%3D&r
> eserved=0
> >
> > changelog:
> > v1
> > - Inital description of the Bug and fixes made in the patch
>
> Initial
>
> >
> > v1 -> v2
> > - Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ
> > configuration
> > - Review suggestions by Aleksander: Modified sequence to complete all
> > error checks for rx and tx before updating napi states and
> > triggering irqs
> > - Corrected trigger of TX and RX interrupts over E1000_ICS (non msix
> > use case)
> > - Added define for Tx interrupt trigger bit mask for E1000_ICS
> >
> > v2 -> v3
> > - Included applicable feedback and suggestions from igc patch
> > - Fixed logic in updating eics value when both TX and RX need wakeup
> >
> > v3 -> v4
> > - Added comments to explain trigerring of both TX and RX with active
> > queue pairs
> > - Fixed check of xsk pools in if statement
> >
> > v4 -> v5
> > - Introduced a simplified logic for sequential check for RX and TX
> > ---
> > .../net/ethernet/intel/igb/e1000_defines.h | 1 +
> > drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++----
> > 2 files changed, 61 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > index fa028928482f..9357564a2d58 100644
> > --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > @@ -443,6 +443,7 @@
> > #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> > #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min. threshold
> */
> > #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
> > +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc written
> back */
> >
> > /* Extended Interrupt Cause Set */
> > /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> > a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > index 30ce5fbb5b77..6e51b5b6f131 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32
> flags)
> > struct igb_adapter *adapter = netdev_priv(dev);
> > struct e1000_hw *hw = &adapter->hw;
> > struct igb_ring *ring;
> > + struct igb_q_vector *q_vector;
> > + struct napi_struct *rx_napi;
> > + struct napi_struct *tx_napi;
> > + bool trigger_irq_tx = false;
> > + bool trigger_irq_rx = false;
> > + u32 eics_tx = 0;
> > + u32 eics_rx = 0;
> > u32 eics = 0;
> >
> > if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65 @@
> > int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> >
> > if (!igb_xdp_is_enabled(adapter))
> > return -EINVAL;
> > -
>
> Why remove the blank line.
>
> > - if (qid >= adapter->num_tx_queues)
> > + /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
> > + if (qid >= adapter->num_rx_queues)
> > return -EINVAL;
> > -
> > - ring = adapter->tx_ring[qid];
> > -
> > - if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > - return -ENETDOWN;
> > -
> > - if (!READ_ONCE(ring->xsk_pool))
> > + /* Check if flags are valid */
> > + if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> > return -EINVAL;
>
> The comment seems redundant.
>
> > -
> > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > - /* Cause software interrupt */
> > + if (flags & XDP_WAKEUP_RX) {
> > + /* IRQ trigger preparation for Rx */
> > + ring = adapter->rx_ring[qid];
> > + if (!READ_ONCE(ring->xsk_pool))
> > + return -ENXIO;
> > + q_vector = ring->q_vector;
> > + rx_napi = &q_vector->napi;
> > + /* Extend the BIT mask for eics */
> > + eics_rx = ring->q_vector->eims_value;
> > + trigger_irq_rx = true;
> > + }
> > + if (flags & XDP_WAKEUP_TX) {
> > + if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> > + /* In queue-pair mode, rx_ring and tx_ring share the same q_vector,
> > + * so a single IRQ trigger will wake both RX and TX processing
> > + */
> > + } else {
> > + /* IRQ trigger preparation for Tx */
> > + ring = adapter->tx_ring[qid];
> > + if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > + return -ENETDOWN;
> > +
> > + if (!READ_ONCE(ring->xsk_pool))
> > + return -ENXIO;
> > + q_vector = ring->q_vector;
> > + tx_napi = &q_vector->napi;
> > + /* Extend the BIT mask for eics */
> > + eics_tx = ring->q_vector->eims_value;
> > + trigger_irq_tx = true;
> > + }
> > + }
> > + /* All error checks are finished. Check and update napi states for rx and tx */
> > + if (trigger_irq_rx) {
> > + if (!napi_if_scheduled_mark_missed(rx_napi))
> > + eics |= eics_rx;
> > + }
> > + if (trigger_irq_tx) {
> > + if (!napi_if_scheduled_mark_missed(tx_napi))
> > + eics |= eics_tx;
> > + }
> > + /* Now we trigger the required irqs for Rx and Tx */
> > + if ((trigger_irq_rx) || (trigger_irq_tx)) {
> > if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > - eics |= ring->q_vector->eims_value;
> > wr32(E1000_EICS, eics);
> > } else {
> > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > + if ((trigger_irq_rx) && (trigger_irq_tx))
> > + wr32(E1000_ICS,
> > + E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> > + else if (trigger_irq_rx)
> > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > + else
> > + wr32(E1000_ICS, E1000_ICS_TXDW);
> > }
> > }
> > -
>
> The removal of the blank line is unrelated. It looks like you divert from the coding
> style of this file. I'd suggest to avoid that.
>
> > return 0;
> > }
>
>
> Kind regards,
>
> Paul
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-12 13:03 [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup Vivek Behera via Intel-wired-lan
` (2 preceding siblings ...)
2026-01-13 4:04 ` Behera, VIVEK via Intel-wired-lan
@ 2026-01-13 9:00 ` Loktionov, Aleksandr
2026-01-13 10:02 ` Behera, VIVEK via Intel-wired-lan
2026-01-13 11:41 ` Maciej Fijalkowski
4 siblings, 1 reply; 14+ messages in thread
From: Loktionov, Aleksandr @ 2026-01-13 9:00 UTC (permalink / raw)
To: Behera, Vivek, Keller, Jacob E, Nguyen, Anthony L,
Kitszel, Przemyslaw, Fijalkowski, Maciej,
sriram.yagnaraman@ericsson.com, kurt@linutronix.de
Cc: intel-wired-lan@lists.osuosl.org, Behera, Vivek
> -----Original Message-----
> From: Vivek Behera <vivek.behera@siemens.com>
> Sent: Monday, January 12, 2026 2:04 PM
> To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Keller,
> Jacob E <jacob.e.keller@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; sriram.yagnaraman@ericsson.com;
> kurt@linutronix.de
> Cc: intel-wired-lan@lists.osuosl.org; Behera, Vivek
> <vivek.behera@siemens.com>
> Subject: [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in
> igb_xsk_wakeup
>
> The current implementation in the igb_xsk_wakeup expects the Rx and Tx
> queues to share the same irq. This would lead to triggering of
> incorrect irq in split irq configuration.
> This patch addresses this issue which could impact environments with 2
> active cpu cores or when the number of queues is reduced to 2 or less
>
> cat /proc/interrupts | grep eno2
> 167: 0 0 0 0 IR-PCI-MSIX-
> 0000:08:00.0
> 0-edge eno2
> 168: 0 0 0 0 IR-PCI-MSIX-
> 0000:08:00.0
> 1-edge eno2-rx-0
> 169: 0 0 0 0 IR-PCI-MSIX-
> 0000:08:00.0
> 2-edge eno2-rx-1
> 170: 0 0 0 0 IR-PCI-MSIX-
> 0000:08:00.0
> 3-edge eno2-tx-0
> 171: 0 0 0 0 IR-PCI-MSIX-
> 0000:08:00.0
> 4-edge eno2-tx-1
>
> Furthermore it uses the flags input argument to trigger either rx, tx
> or both rx and tx irqs as specified in the ndo_xsk_wakeup api
> documentation
>
> Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
> v1: https://lore.kernel.org/intel-wired-lan/20251212131454.124116-1-
> vivek.behera@siemens.com/
> v2: https://lore.kernel.org/intel-wired-lan/20251215115416.410619-1-
> vivek.behera@siemens.com/
> v3: https://lore.kernel.org/intel-wired-lan/20251220114936.140473-1-
> vivek.behera@siemens.com/
> v4: https://lore.kernel.org/intel-wired-lan/20251222115747.230521-1-
> vivek.behera@siemens.com/
>
> changelog:
> v1
> - Inital description of the Bug and fixes made in the patch
>
> v1 -> v2
> - Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ
> configuration
> - Review suggestions by Aleksander: Modified sequence to complete all
> error checks for rx and tx before updating napi states and
> triggering irqs
> - Corrected trigger of TX and RX interrupts over E1000_ICS (non msix
> use case)
> - Added define for Tx interrupt trigger bit mask for E1000_ICS
>
> v2 -> v3
> - Included applicable feedback and suggestions from igc patch
> - Fixed logic in updating eics value when both TX and RX need wakeup
>
> v3 -> v4
> - Added comments to explain trigerring of both TX and RX with active
> queue pairs
> - Fixed check of xsk pools in if statement
>
> v4 -> v5
> - Introduced a simplified logic for sequential check for RX and TX
> ---
> .../net/ethernet/intel/igb/e1000_defines.h | 1 +
> drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++---
> -
> 2 files changed, 61 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> b/drivers/net/ethernet/intel/igb/e1000_defines.h
> index fa028928482f..9357564a2d58 100644
> --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> @@ -443,6 +443,7 @@
> #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change
> */
> #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min.
> threshold */
> #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset
> Aserted */
> +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc
> written back */
>
> /* Extended Interrupt Cause Set */
> /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> a/drivers/net/ethernet/intel/igb/igb_xsk.c
> b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..6e51b5b6f131 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32
> qid, u32 flags)
> struct igb_adapter *adapter = netdev_priv(dev);
> struct e1000_hw *hw = &adapter->hw;
> struct igb_ring *ring;
> + struct igb_q_vector *q_vector;
> + struct napi_struct *rx_napi;
> + struct napi_struct *tx_napi;
> + bool trigger_irq_tx = false;
> + bool trigger_irq_rx = false;
> + u32 eics_tx = 0;
> + u32 eics_rx = 0;
> u32 eics = 0;
>
> if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65
> @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
>
> if (!igb_xdp_is_enabled(adapter))
> return -EINVAL;
> -
> - if (qid >= adapter->num_tx_queues)
> + /* Check if queue_id is valid. Tx and Rx queue numbers are
> always same */
> + if (qid >= adapter->num_rx_queues)
> return -EINVAL;
But the number may differ in case of reconfiguration.
Why not:
if (qid >= adapter->num_rx_queues || qid >= adapter->num_tx_queues)
return -EINVAL;
Or add assertion that they're equal:
if (WARN_ON(adapter->num_rx_queues != adapter->num_tx_queues))
return -EINVAL;
if (qid >= adapter->num_rx_queues)
return -EINVAL;
> -
> - ring = adapter->tx_ring[qid];
> -
...
> --
> 2.34.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-13 9:00 ` Loktionov, Aleksandr
@ 2026-01-13 10:02 ` Behera, VIVEK via Intel-wired-lan
0 siblings, 0 replies; 14+ messages in thread
From: Behera, VIVEK via Intel-wired-lan @ 2026-01-13 10:02 UTC (permalink / raw)
To: Loktionov, Aleksandr, Keller, Jacob E, Nguyen, Anthony L,
Kitszel, Przemyslaw, Fijalkowski, Maciej,
sriram.yagnaraman@ericsson.com, kurt@linutronix.de
Cc: intel-wired-lan@lists.osuosl.org
> -----Ursprüngliche Nachricht-----
> Von: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>
> Gesendet: Dienstag, 13. Januar 2026 10:00
> An: Behera, Vivek (DI FA DSP ICC PRC1) <vivek.behera@siemens.com>; Keller,
> Jacob E <jacob.e.keller@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Fijalkowski, Maciej
> <maciej.fijalkowski@intel.com>; sriram.yagnaraman@ericsson.com;
> kurt@linutronix.de
> Cc: intel-wired-lan@lists.osuosl.org; Behera, Vivek (DI FA DSP ICC PRC1)
> <vivek.behera@siemens.com>
> Betreff: RE: [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
>
>
>
> > -----Original Message-----
> > From: Vivek Behera <vivek.behera@siemens.com>
> > Sent: Monday, January 12, 2026 2:04 PM
> > To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Keller,
> > Jacob E <jacob.e.keller@intel.com>; Nguyen, Anthony L
> > <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> > <przemyslaw.kitszel@intel.com>; Fijalkowski, Maciej
> > <maciej.fijalkowski@intel.com>; sriram.yagnaraman@ericsson.com;
> > kurt@linutronix.de
> > Cc: intel-wired-lan@lists.osuosl.org; Behera, Vivek
> > <vivek.behera@siemens.com>
> > Subject: [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in
> > igb_xsk_wakeup
> >
> > The current implementation in the igb_xsk_wakeup expects the Rx and Tx
> > queues to share the same irq. This would lead to triggering of
> > incorrect irq in split irq configuration.
> > This patch addresses this issue which could impact environments with 2
> > active cpu cores or when the number of queues is reduced to 2 or less
> >
> > cat /proc/interrupts | grep eno2
> > 167: 0 0 0 0 IR-PCI-MSIX-
> > 0000:08:00.0
> > 0-edge eno2
> > 168: 0 0 0 0 IR-PCI-MSIX-
> > 0000:08:00.0
> > 1-edge eno2-rx-0
> > 169: 0 0 0 0 IR-PCI-MSIX-
> > 0000:08:00.0
> > 2-edge eno2-rx-1
> > 170: 0 0 0 0 IR-PCI-MSIX-
> > 0000:08:00.0
> > 3-edge eno2-tx-0
> > 171: 0 0 0 0 IR-PCI-MSIX-
> > 0000:08:00.0
> > 4-edge eno2-tx-1
> >
> > Furthermore it uses the flags input argument to trigger either rx, tx
> > or both rx and tx irqs as specified in the ndo_xsk_wakeup api
> > documentation
> >
> > Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> > Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
> > Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> > ---
> > v1:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251212131454.124116-1-&data=05%7C02%
> >
> 7Cvivek.behera%40siemens.com%7C89e4331d9dd54d83b19108de5282346c%7C38
> ae
> >
> 3bcd95794fd4addab42e1495d55a%7C1%7C0%7C639038916338104430%7CUnknow
> n%7C
> >
> TWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4
> zMi
> >
> IsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=m3LL7aO3Tvj9
> Kimm
> > uKo9Fgw6dEqAg%2F1pUBRzk3haioo%3D&reserved=0
> > vivek.behera@siemens.com/
> > v2:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251215115416.410619-1-&data=05%7C02%
> >
> 7Cvivek.behera%40siemens.com%7C89e4331d9dd54d83b19108de5282346c%7C38
> ae
> >
> 3bcd95794fd4addab42e1495d55a%7C1%7C0%7C639038916338132851%7CUnknow
> n%7C
> >
> TWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4
> zMi
> >
> IsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=KaVmMrku6oX
> yzlRh
> > Ii5qvgm47j5HWJ4ma601OvHysxM%3D&reserved=0
> > vivek.behera@siemens.com/
> > v3:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251220114936.140473-1-&data=05%7C02%
> >
> 7Cvivek.behera%40siemens.com%7C89e4331d9dd54d83b19108de5282346c%7C38
> ae
> >
> 3bcd95794fd4addab42e1495d55a%7C1%7C0%7C639038916338151419%7CUnknow
> n%7C
> >
> TWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4
> zMi
> >
> IsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=lTjnm324in4Zap
> H4
> > veRCyAqnkYpAgfzi1fcDpAxqmZE%3D&reserved=0
> > vivek.behera@siemens.com/
> > v4:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251222115747.230521-1-&data=05%7C02%
> >
> 7Cvivek.behera%40siemens.com%7C89e4331d9dd54d83b19108de5282346c%7C38
> ae
> >
> 3bcd95794fd4addab42e1495d55a%7C1%7C0%7C639038916338169212%7CUnknow
> n%7C
> >
> TWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4
> zMi
> >
> IsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7C%7C%7C&sdata=YE9U7UzDDE
> %2FyNc
> > tyeMkJgD8Jmdhch4TlpuovCNa4bY0%3D&reserved=0
> > vivek.behera@siemens.com/
> >
> > changelog:
> > v1
> > - Inital description of the Bug and fixes made in the patch
> >
> > v1 -> v2
> > - Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ
> > configuration
> > - Review suggestions by Aleksander: Modified sequence to complete all
> > error checks for rx and tx before updating napi states and
> > triggering irqs
> > - Corrected trigger of TX and RX interrupts over E1000_ICS (non msix
> > use case)
> > - Added define for Tx interrupt trigger bit mask for E1000_ICS
> >
> > v2 -> v3
> > - Included applicable feedback and suggestions from igc patch
> > - Fixed logic in updating eics value when both TX and RX need wakeup
> >
> > v3 -> v4
> > - Added comments to explain trigerring of both TX and RX with active
> > queue pairs
> > - Fixed check of xsk pools in if statement
> >
> > v4 -> v5
> > - Introduced a simplified logic for sequential check for RX and TX
> > ---
> > .../net/ethernet/intel/igb/e1000_defines.h | 1 +
> > drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++---
> > -
> > 2 files changed, 61 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > index fa028928482f..9357564a2d58 100644
> > --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > @@ -443,6 +443,7 @@
> > #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change
> > */
> > #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min.
> > threshold */
> > #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset
> > Aserted */
> > +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc
> > written back */
> >
> > /* Extended Interrupt Cause Set */
> > /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> > a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > index 30ce5fbb5b77..6e51b5b6f131 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32
> > qid, u32 flags)
> > struct igb_adapter *adapter = netdev_priv(dev);
> > struct e1000_hw *hw = &adapter->hw;
> > struct igb_ring *ring;
> > + struct igb_q_vector *q_vector;
> > + struct napi_struct *rx_napi;
> > + struct napi_struct *tx_napi;
> > + bool trigger_irq_tx = false;
> > + bool trigger_irq_rx = false;
> > + u32 eics_tx = 0;
> > + u32 eics_rx = 0;
> > u32 eics = 0;
> >
> > if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65 @@ int
> > igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> >
> > if (!igb_xdp_is_enabled(adapter))
> > return -EINVAL;
> > -
> > - if (qid >= adapter->num_tx_queues)
> > + /* Check if queue_id is valid. Tx and Rx queue numbers are
> > always same */
> > + if (qid >= adapter->num_rx_queues)
> > return -EINVAL;
> But the number may differ in case of reconfiguration.
Sorry but I think we had this discussion some weeks ago. The number of rx and tx queues can never be different for igb.
The driver's ethtool only supports setting combined queues. Here is the snippet from the igb_set_channels from igb_ethtool.c
/* Verify they are not requesting separate vectors */
if (!count || ch->rx_count || ch->tx_count)
return -EINVAL;
Consequently, it is not possible to configure different number of rx and tx queues.
sudo ethtool -L eno2 rx 2 tx 1
netlink error: requested channel count exceeds maximum (offset 36)
netlink error: Invalid argument
only combined option is supported
eddx@mvs:~/linux-6.18$ sudo ethtool -L eno2 combined 4
eddx@mvs:~/linux-6.18$ cat /proc/interrupts | grep eno2
162: 1 0 0 0 IR-PCI-MSIX-0000:08:00.0 0-edge eno2
163: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 1-edge eno2-TxRx-0
164: 2 0 0 0 IR-PCI-MSIX-0000:08:00.0 2-edge eno2-TxRx-1
165: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 3-edge eno2-TxRx-2
166: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 4-edge eno2-TxRx-3
However, the irq's are split over the available irq's depending on the number of combined pairs active
sudo ethtool -L eno2 combined 2
eddx@mvs:~/linux-6.18$ cat /proc/interrupts | grep eno2
162: 1 0 0 0 IR-PCI-MSIX-0000:08:00.0 0-edge eno2
163: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 1-edge eno2-rx-0
164: 2 0 0 0 IR-PCI-MSIX-0000:08:00.0 2-edge eno2-rx-1
165: 3 0 0 0 IR-PCI-MSIX-0000:08:00.0 3-edge eno2-tx-0
166: 5 0 0 0 IR-PCI-MSIX-0000:08:00.0 4-edge eno2-tx-1
eddx@mvs:~/linux-6.18$ sudo ethtool -L eno2 combined 3
eddx@mvs:~/linux-6.18$ cat /proc/interrupts | grep eno2
162: 1 0 0 0 IR-PCI-MSIX-0000:08:00.0 0-edge eno2
163: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 1-edge eno2-TxRx-0
164: 5 0 0 0 IR-PCI-MSIX-0000:08:00.0 2-edge eno2-TxRx-1
165: 2 0 0 0 IR-PCI-MSIX-0000:08:00.0 3-edge eno2-TxRx-2
eddx@mvs:~/linux-6.18$ sudo ethtool -L eno2 combined 1
eddx@mvs:~/linux-6.18$ cat /proc/interrupts | grep eno2
162: 1 0 0 0 IR-PCI-MSIX-0000:08:00.0 0-edge eno2
163: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0 1-edge eno2-rx-0
164: 2 0 0 0 IR-PCI-MSIX-0000:08:00.0 2-edge eno2-tx-0
So, unless I am missing something I think the current logic should be sufficient.
> Why not:
> if (qid >= adapter->num_rx_queues || qid >= adapter->num_tx_queues)
> return -EINVAL;
> Or add assertion that they're equal:
> if (WARN_ON(adapter->num_rx_queues != adapter->num_tx_queues))
> return -EINVAL;
Okay. I can add this.
> if (qid >= adapter->num_rx_queues)
> return -EINVAL;
>
>
> > -
> > - ring = adapter->tx_ring[qid];
> > -
>
> ...
>
> > --
> > 2.34.1
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-12 13:03 [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup Vivek Behera via Intel-wired-lan
` (3 preceding siblings ...)
2026-01-13 9:00 ` Loktionov, Aleksandr
@ 2026-01-13 11:41 ` Maciej Fijalkowski
2026-01-14 8:19 ` Behera, VIVEK via Intel-wired-lan
4 siblings, 1 reply; 14+ messages in thread
From: Maciej Fijalkowski @ 2026-01-13 11:41 UTC (permalink / raw)
To: Vivek Behera
Cc: aleksandr.loktionov, jacob.e.keller, anthony.l.nguyen,
przemyslaw.kitszel, sriram.yagnaraman, kurt, intel-wired-lan,
magnus.karlsson
On Mon, Jan 12, 2026 at 02:03:49PM +0100, Vivek Behera wrote:
> The current implementation in the igb_xsk_wakeup expects the Rx and Tx queues
> to share the same irq. This would lead to triggering of incorrect irq
> in split irq configuration.
> This patch addresses this issue which could impact environments
> with 2 active cpu cores
> or when the number of queues is reduced to 2 or less
>
> cat /proc/interrupts | grep eno2
> 167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 0-edge eno2
> 168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 1-edge eno2-rx-0
> 169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 2-edge eno2-rx-1
> 170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 3-edge eno2-tx-0
> 171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> 4-edge eno2-tx-1
>
> Furthermore it uses the flags input argument to trigger either rx, tx or
> both rx and tx irqs as specified in the ndo_xsk_wakeup api documentation
>
> Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
> v1: https://lore.kernel.org/intel-wired-lan/20251212131454.124116-1-vivek.behera@siemens.com/
> v2: https://lore.kernel.org/intel-wired-lan/20251215115416.410619-1-vivek.behera@siemens.com/
> v3: https://lore.kernel.org/intel-wired-lan/20251220114936.140473-1-vivek.behera@siemens.com/
> v4: https://lore.kernel.org/intel-wired-lan/20251222115747.230521-1-vivek.behera@siemens.com/
>
> changelog:
> v1
> - Inital description of the Bug and fixes made in the patch
>
> v1 -> v2
> - Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ configuration
> - Review suggestions by Aleksander: Modified sequence to complete all
> error checks for rx and tx before updating napi states and triggering irqs
> - Corrected trigger of TX and RX interrupts over E1000_ICS (non msix use case)
> - Added define for Tx interrupt trigger bit mask for E1000_ICS
>
> v2 -> v3
> - Included applicable feedback and suggestions from igc patch
> - Fixed logic in updating eics value when both TX and RX need wakeup
>
> v3 -> v4
> - Added comments to explain trigerring of both TX and RX with active queue pairs
> - Fixed check of xsk pools in if statement
>
> v4 -> v5
> - Introduced a simplified logic for sequential check for RX and TX
> ---
> .../net/ethernet/intel/igb/e1000_defines.h | 1 +
> drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++----
> 2 files changed, 61 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
> index fa028928482f..9357564a2d58 100644
> --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> @@ -443,6 +443,7 @@
> #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min. threshold */
> #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
> +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc written back */
>
> /* Extended Interrupt Cause Set */
> /* E1000_EITR_CNT_IGNR is only for 82576 and newer */
> diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..6e51b5b6f131 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> struct igb_adapter *adapter = netdev_priv(dev);
> struct e1000_hw *hw = &adapter->hw;
> struct igb_ring *ring;
> + struct igb_q_vector *q_vector;
> + struct napi_struct *rx_napi;
> + struct napi_struct *tx_napi;
> + bool trigger_irq_tx = false;
> + bool trigger_irq_rx = false;
> + u32 eics_tx = 0;
> + u32 eics_rx = 0;
> u32 eics = 0;
>
> if (test_bit(__IGB_DOWN, &adapter->state))
> @@ -536,27 +543,65 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
>
> if (!igb_xdp_is_enabled(adapter))
> return -EINVAL;
> -
> - if (qid >= adapter->num_tx_queues)
> + /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
> + if (qid >= adapter->num_rx_queues)
> return -EINVAL;
> -
> - ring = adapter->tx_ring[qid];
> -
> - if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> - return -ENETDOWN;
> -
> - if (!READ_ONCE(ring->xsk_pool))
> + /* Check if flags are valid */
> + if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> return -EINVAL;
> -
> - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> - /* Cause software interrupt */
> + if (flags & XDP_WAKEUP_RX) {
> + /* IRQ trigger preparation for Rx */
> + ring = adapter->rx_ring[qid];
> + if (!READ_ONCE(ring->xsk_pool))
> + return -ENXIO;
> + q_vector = ring->q_vector;
> + rx_napi = &q_vector->napi;
> + /* Extend the BIT mask for eics */
> + eics_rx = ring->q_vector->eims_value;
> + trigger_irq_rx = true;
> + }
> + if (flags & XDP_WAKEUP_TX) {
> + if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> + /* In queue-pair mode, rx_ring and tx_ring share the same q_vector,
> + * so a single IRQ trigger will wake both RX and TX processing
> + */
> + } else {
> + /* IRQ trigger preparation for Tx */
> + ring = adapter->tx_ring[qid];
> + if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> + return -ENETDOWN;
> +
> + if (!READ_ONCE(ring->xsk_pool))
> + return -ENXIO;
> + q_vector = ring->q_vector;
> + tx_napi = &q_vector->napi;
> + /* Extend the BIT mask for eics */
> + eics_tx = ring->q_vector->eims_value;
> + trigger_irq_tx = true;
> + }
> + }
> + /* All error checks are finished. Check and update napi states for rx and tx */
> + if (trigger_irq_rx) {
> + if (!napi_if_scheduled_mark_missed(rx_napi))
> + eics |= eics_rx;
> + }
> + if (trigger_irq_tx) {
> + if (!napi_if_scheduled_mark_missed(tx_napi))
> + eics |= eics_tx;
> + }
> + /* Now we trigger the required irqs for Rx and Tx */
> + if ((trigger_irq_rx) || (trigger_irq_tx)) {
> if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> - eics |= ring->q_vector->eims_value;
> wr32(E1000_EICS, eics);
> } else {
> - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> + if ((trigger_irq_rx) && (trigger_irq_tx))
> + wr32(E1000_ICS,
> + E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> + else if (trigger_irq_rx)
> + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> + else
> + wr32(E1000_ICS, E1000_ICS_TXDW);
My understanding is something below would be sufficient. Bits set on
E1000_ICS are not handled in any way so we don't have to distinguish
between rx/tx, it's just the matter of irq trigger and napi schedule.
-----------------8<-----------------
diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
index 30ce5fbb5b77..0aba7afd6a03 100644
--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
@@ -524,12 +524,26 @@ bool igb_xmit_zc(struct igb_ring *tx_ring, struct xsk_buff_pool *xsk_pool)
return nb_pkts < budget;
}
+static void igb_sw_irq(struct igb_q_vector *q_vector)
+{
+ u32 eics = 0;
+
+ if (!napi_if_scheduled_mark_missed(&q_vector->napi)) {
+ /* Cause software interrupt */
+ if (adapter->flags & IGB_FLAG_HAS_MSIX) {
+ eics |= ring->q_vector->eims_value;
+ wr32(E1000_EICS, eics);
+ } else {
+ wr32(E1000_ICS, E1000_ICS_RXDMT0);
+ }
+ }
+}
+
int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
{
struct igb_adapter *adapter = netdev_priv(dev);
struct e1000_hw *hw = &adapter->hw;
struct igb_ring *ring;
- u32 eics = 0;
if (test_bit(__IGB_DOWN, &adapter->state))
return -ENETDOWN;
@@ -548,14 +562,15 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
if (!READ_ONCE(ring->xsk_pool))
return -EINVAL;
- if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
- /* Cause software interrupt */
- if (adapter->flags & IGB_FLAG_HAS_MSIX) {
- eics |= ring->q_vector->eims_value;
- wr32(E1000_EICS, eics);
- } else {
- wr32(E1000_ICS, E1000_ICS_RXDMT0);
- }
+ if (flags & XDP_WAKEUP_TX)
+ igb_sw_irq(ring->q_vector);
+
+ if (flags & XDP_WAKEUP_RX) {
+ ring = adapter->rx_ring[qid];
+ /* for !IGB_FLAG_QUEUE_PAIRS, this will be NOP as NAPI has
+ * been already marked with NAPIF_STATE_MISSED
+ */
+ igb_sw_irq(ring->q_vector);
}
return 0;
----------------->8-----------------
> }
> }
> -
> return 0;
> }
> --
> 2.34.1
>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-13 11:41 ` Maciej Fijalkowski
@ 2026-01-14 8:19 ` Behera, VIVEK via Intel-wired-lan
2026-01-14 18:20 ` Maciej Fijalkowski
0 siblings, 1 reply; 14+ messages in thread
From: Behera, VIVEK via Intel-wired-lan @ 2026-01-14 8:19 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: aleksandr.loktionov@intel.com, jacob.e.keller@intel.com,
anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com,
sriram.yagnaraman@ericsson.com, kurt@linutronix.de,
intel-wired-lan@lists.osuosl.org, magnus.karlsson@intel.com
> -----Ursprüngliche Nachricht-----
> Von: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Gesendet: Dienstag, 13. Januar 2026 12:41
> An: Behera, Vivek (DI FA DSP ICC PRC1) <vivek.behera@siemens.com>
> Cc: aleksandr.loktionov@intel.com; jacob.e.keller@intel.com;
> anthony.l.nguyen@intel.com; przemyslaw.kitszel@intel.com;
> sriram.yagnaraman@ericsson.com; kurt@linutronix.de; intel-wired-
> lan@lists.osuosl.org; magnus.karlsson@intel.com
> Betreff: Re: [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
>
> On Mon, Jan 12, 2026 at 02:03:49PM +0100, Vivek Behera wrote:
> > The current implementation in the igb_xsk_wakeup expects the Rx and Tx
> > queues to share the same irq. This would lead to triggering of
> > incorrect irq in split irq configuration.
> > This patch addresses this issue which could impact environments with 2
> > active cpu cores or when the number of queues is reduced to 2 or less
> >
> > cat /proc/interrupts | grep eno2
> > 167: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 0-edge eno2
> > 168: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 1-edge eno2-rx-0
> > 169: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 2-edge eno2-rx-1
> > 170: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 3-edge eno2-tx-0
> > 171: 0 0 0 0 IR-PCI-MSIX-0000:08:00.0
> > 4-edge eno2-tx-1
> >
> > Furthermore it uses the flags input argument to trigger either rx, tx
> > or both rx and tx irqs as specified in the ndo_xsk_wakeup api
> > documentation
> >
> > Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> > Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
> > Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> > ---
> > v1:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251212131454.124116-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7Ca31558f
> 1fe
> >
> ea4357387008de5298bc67%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%7
> C63
> >
> 9039013116385150%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydW
> UsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7
> C
> >
> %7C%7C&sdata=6jtDYEHBQwK0mSoY5bRnMu2YXbrZzKqEeTSim8EsumI%3D&re
> served=0
> > v2:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251215115416.410619-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7Ca31558f
> 1fe
> >
> ea4357387008de5298bc67%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%7
> C63
> >
> 9039013116422789%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydW
> UsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7
> C
> >
> %7C%7C&sdata=V7%2BmPEiR7nJ9a9p9Fcl8RqPjij%2BgGop05dkWB7pRMlM%3D
> &reserv
> > ed=0
> > v3:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251220114936.140473-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7Ca31558f
> 1fe
> >
> ea4357387008de5298bc67%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%7
> C63
> >
> 9039013116446169%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydW
> UsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7
> C
> >
> %7C%7C&sdata=c9OVw3ziHwqlpeXKJGsUxVyJCyeO2pwwf98ejBaSg9s%3D&reser
> ved=0
> > v4:
> > https://eur01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore
> > .kernel.org%2Fintel-wired-lan%2F20251222115747.230521-1-vivek.behera%4
> >
> 0siemens.com%2F&data=05%7C02%7Cvivek.behera%40siemens.com%7Ca31558f
> 1fe
> >
> ea4357387008de5298bc67%7C38ae3bcd95794fd4addab42e1495d55a%7C1%7C0%7
> C63
> >
> 9039013116466859%7CUnknown%7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydW
> UsIlYiO
> >
> iIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ%3D%3D%7C0%7
> C
> >
> %7C%7C&sdata=HJsueSZ49aNY%2FSY7iCQwvc7pcLWvB7I%2FXUdx%2F%2Ft70
> gQ%3D&re
> > served=0
> >
> > changelog:
> > v1
> > - Inital description of the Bug and fixes made in the patch
> >
> > v1 -> v2
> > - Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ
> > configuration
> > - Review suggestions by Aleksander: Modified sequence to complete all
> > error checks for rx and tx before updating napi states and
> > triggering irqs
> > - Corrected trigger of TX and RX interrupts over E1000_ICS (non msix
> > use case)
> > - Added define for Tx interrupt trigger bit mask for E1000_ICS
> >
> > v2 -> v3
> > - Included applicable feedback and suggestions from igc patch
> > - Fixed logic in updating eics value when both TX and RX need wakeup
> >
> > v3 -> v4
> > - Added comments to explain trigerring of both TX and RX with active
> > queue pairs
> > - Fixed check of xsk pools in if statement
> >
> > v4 -> v5
> > - Introduced a simplified logic for sequential check for RX and TX
> > ---
> > .../net/ethernet/intel/igb/e1000_defines.h | 1 +
> > drivers/net/ethernet/intel/igb/igb_xsk.c | 75 +++++++++++++++----
> > 2 files changed, 61 insertions(+), 15 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > index fa028928482f..9357564a2d58 100644
> > --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > @@ -443,6 +443,7 @@
> > #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> > #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min. threshold
> */
> > #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
> > +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc written
> back */
> >
> > /* Extended Interrupt Cause Set */
> > /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> > a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > index 30ce5fbb5b77..6e51b5b6f131 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32
> flags)
> > struct igb_adapter *adapter = netdev_priv(dev);
> > struct e1000_hw *hw = &adapter->hw;
> > struct igb_ring *ring;
> > + struct igb_q_vector *q_vector;
> > + struct napi_struct *rx_napi;
> > + struct napi_struct *tx_napi;
> > + bool trigger_irq_tx = false;
> > + bool trigger_irq_rx = false;
> > + u32 eics_tx = 0;
> > + u32 eics_rx = 0;
> > u32 eics = 0;
> >
> > if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65 @@ int
> > igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> >
> > if (!igb_xdp_is_enabled(adapter))
> > return -EINVAL;
> > -
> > - if (qid >= adapter->num_tx_queues)
> > + /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
> > + if (qid >= adapter->num_rx_queues)
> > return -EINVAL;
> > -
> > - ring = adapter->tx_ring[qid];
> > -
> > - if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > - return -ENETDOWN;
> > -
> > - if (!READ_ONCE(ring->xsk_pool))
> > + /* Check if flags are valid */
> > + if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> > return -EINVAL;
> > -
> > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > - /* Cause software interrupt */
> > + if (flags & XDP_WAKEUP_RX) {
> > + /* IRQ trigger preparation for Rx */
> > + ring = adapter->rx_ring[qid];
> > + if (!READ_ONCE(ring->xsk_pool))
> > + return -ENXIO;
> > + q_vector = ring->q_vector;
> > + rx_napi = &q_vector->napi;
> > + /* Extend the BIT mask for eics */
> > + eics_rx = ring->q_vector->eims_value;
> > + trigger_irq_rx = true;
> > + }
> > + if (flags & XDP_WAKEUP_TX) {
> > + if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> > + /* In queue-pair mode, rx_ring and tx_ring share the same q_vector,
> > + * so a single IRQ trigger will wake both RX and TX processing
> > + */
> > + } else {
> > + /* IRQ trigger preparation for Tx */
> > + ring = adapter->tx_ring[qid];
> > + if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > + return -ENETDOWN;
> > +
> > + if (!READ_ONCE(ring->xsk_pool))
> > + return -ENXIO;
> > + q_vector = ring->q_vector;
> > + tx_napi = &q_vector->napi;
> > + /* Extend the BIT mask for eics */
> > + eics_tx = ring->q_vector->eims_value;
> > + trigger_irq_tx = true;
> > + }
> > + }
> > + /* All error checks are finished. Check and update napi states for rx and tx */
> > + if (trigger_irq_rx) {
> > + if (!napi_if_scheduled_mark_missed(rx_napi))
> > + eics |= eics_rx;
> > + }
> > + if (trigger_irq_tx) {
> > + if (!napi_if_scheduled_mark_missed(tx_napi))
> > + eics |= eics_tx;
> > + }
> > + /* Now we trigger the required irqs for Rx and Tx */
> > + if ((trigger_irq_rx) || (trigger_irq_tx)) {
> > if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > - eics |= ring->q_vector->eims_value;
> > wr32(E1000_EICS, eics);
> > } else {
> > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > + if ((trigger_irq_rx) && (trigger_irq_tx))
> > + wr32(E1000_ICS,
> > + E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> > + else if (trigger_irq_rx)
> > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > + else
> > + wr32(E1000_ICS, E1000_ICS_TXDW);
>
> My understanding is something below would be sufficient. Bits set on E1000_ICS are
> not handled in any way so we don't have to distinguish between rx/tx, it's just the
> matter of irq trigger and napi schedule.
>
Hi see my comments below
> -----------------8<-----------------
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c
> b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..0aba7afd6a03 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -524,12 +524,26 @@ bool igb_xmit_zc(struct igb_ring *tx_ring, struct
> xsk_buff_pool *xsk_pool)
> return nb_pkts < budget;
> }
>
> +static void igb_sw_irq(struct igb_q_vector *q_vector) {
> + u32 eics = 0;
> +
> + if (!napi_if_scheduled_mark_missed(&q_vector->napi)) {
> + /* Cause software interrupt */
> + if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> + eics |= ring->q_vector->eims_value;
> + wr32(E1000_EICS, eics);
> + } else {
> + wr32(E1000_ICS, E1000_ICS_RXDMT0);
So here it is sufficient to rely on the E1000_ICS_RXDMT0 bit to trigger the correct irq (Tx and Rx)?
I remember I received a review comment from Intel point to E1000_ICS_TXDW as being the correct bit of triggering TX for non MSIX case.
I can't really evaluate this since I don't have a setup to test this. But okay
> + }
> + }
> +}
> +
> int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags) {
> struct igb_adapter *adapter = netdev_priv(dev);
> struct e1000_hw *hw = &adapter->hw;
> struct igb_ring *ring;
> - u32 eics = 0;
>
> if (test_bit(__IGB_DOWN, &adapter->state))
> return -ENETDOWN;
> @@ -548,14 +562,15 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32
> flags)
> if (!READ_ONCE(ring->xsk_pool))
> return -EINVAL;
>
> - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> - /* Cause software interrupt */
> - if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> - eics |= ring->q_vector->eims_value;
> - wr32(E1000_EICS, eics);
> - } else {
> - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> - }
> + if (flags & XDP_WAKEUP_TX)
> + igb_sw_irq(ring->q_vector);
> +
> + if (flags & XDP_WAKEUP_RX) {
> + ring = adapter->rx_ring[qid];
> + /* for !IGB_FLAG_QUEUE_PAIRS, this will be NOP as NAPI has
> + * been already marked with NAPIF_STATE_MISSED
> + */
I think you meant for the case IGB_FLAG_QUEUE_PAIRS. Since when the queue pairs are not active
the Tx AND Rx queues don't share the same qvector and consequently not the same NAPI
> + igb_sw_irq(ring->q_vector);
Okay so you would be triggering soft irq's in two steps if both TX and RX flags are set.
Honestly, I have tried to avoid doing this in my patch. Which is the reason why I wait to finish all the error checks,
Napi updates before triggering the required irq vectors by writing to eics with a single write.
But okay the other approach also works
> }
>
> return 0;
>
> ----------------->8-----------------
>
> > }
> > }
> > -
> > return 0;
> > }
> > --
> > 2.34.1
> >
I think the strategy of triggering interrupts in one step after performing all the necessary checks is what might make this approach look complex.
IMHO the one step strategy is better and more intuitive.
Unfortunately, there isn't a reference here to go by since none of the xsk_wakeup hooks implemented in the kernel care about the flags
I can submit a v6 of the patch based the one step approach with further simplifications. v6 would also include review suggestions I received for v5.
Like this I can also submit the next version to the igc patch. It follows the same logic as the igb
I have our regression tests with RTC testbench and our Siemens Profinet RT tester running with these patches with I210 and I226
Alternatively, you could submit patches following the igb and igc following the two-step logic.
Regards
Vivek
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-14 8:19 ` Behera, VIVEK via Intel-wired-lan
@ 2026-01-14 18:20 ` Maciej Fijalkowski
2026-01-15 11:05 ` Behera, VIVEK via Intel-wired-lan
0 siblings, 1 reply; 14+ messages in thread
From: Maciej Fijalkowski @ 2026-01-14 18:20 UTC (permalink / raw)
To: Behera, VIVEK
Cc: aleksandr.loktionov@intel.com, jacob.e.keller@intel.com,
anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com,
sriram.yagnaraman@ericsson.com, kurt@linutronix.de,
intel-wired-lan@lists.osuosl.org, magnus.karlsson@intel.com
On Wed, Jan 14, 2026 at 08:19:37AM +0000, Behera, VIVEK wrote:
(...)
> > >
> > > diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > index fa028928482f..9357564a2d58 100644
> > > --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > @@ -443,6 +443,7 @@
> > > #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> > > #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min. threshold
> > */
> > > #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset Aserted */
> > > +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc written
> > back */
> > >
> > > /* Extended Interrupt Cause Set */
> > > /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> > > a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > index 30ce5fbb5b77..6e51b5b6f131 100644
> > > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32
> > flags)
> > > struct igb_adapter *adapter = netdev_priv(dev);
> > > struct e1000_hw *hw = &adapter->hw;
> > > struct igb_ring *ring;
> > > + struct igb_q_vector *q_vector;
> > > + struct napi_struct *rx_napi;
> > > + struct napi_struct *tx_napi;
> > > + bool trigger_irq_tx = false;
> > > + bool trigger_irq_rx = false;
> > > + u32 eics_tx = 0;
> > > + u32 eics_rx = 0;
> > > u32 eics = 0;
> > >
> > > if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65 @@ int
> > > igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> > >
> > > if (!igb_xdp_is_enabled(adapter))
> > > return -EINVAL;
> > > -
> > > - if (qid >= adapter->num_tx_queues)
> > > + /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
> > > + if (qid >= adapter->num_rx_queues)
> > > return -EINVAL;
> > > -
> > > - ring = adapter->tx_ring[qid];
> > > -
> > > - if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > > - return -ENETDOWN;
> > > -
> > > - if (!READ_ONCE(ring->xsk_pool))
> > > + /* Check if flags are valid */
> > > + if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> > > return -EINVAL;
> > > -
> > > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > > - /* Cause software interrupt */
> > > + if (flags & XDP_WAKEUP_RX) {
> > > + /* IRQ trigger preparation for Rx */
> > > + ring = adapter->rx_ring[qid];
> > > + if (!READ_ONCE(ring->xsk_pool))
> > > + return -ENXIO;
> > > + q_vector = ring->q_vector;
> > > + rx_napi = &q_vector->napi;
> > > + /* Extend the BIT mask for eics */
> > > + eics_rx = ring->q_vector->eims_value;
> > > + trigger_irq_rx = true;
> > > + }
> > > + if (flags & XDP_WAKEUP_TX) {
> > > + if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> > > + /* In queue-pair mode, rx_ring and tx_ring share the same q_vector,
> > > + * so a single IRQ trigger will wake both RX and TX processing
> > > + */
> > > + } else {
> > > + /* IRQ trigger preparation for Tx */
> > > + ring = adapter->tx_ring[qid];
> > > + if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > > + return -ENETDOWN;
> > > +
> > > + if (!READ_ONCE(ring->xsk_pool))
> > > + return -ENXIO;
> > > + q_vector = ring->q_vector;
> > > + tx_napi = &q_vector->napi;
> > > + /* Extend the BIT mask for eics */
> > > + eics_tx = ring->q_vector->eims_value;
> > > + trigger_irq_tx = true;
> > > + }
> > > + }
> > > + /* All error checks are finished. Check and update napi states for rx and tx */
> > > + if (trigger_irq_rx) {
> > > + if (!napi_if_scheduled_mark_missed(rx_napi))
> > > + eics |= eics_rx;
> > > + }
> > > + if (trigger_irq_tx) {
> > > + if (!napi_if_scheduled_mark_missed(tx_napi))
> > > + eics |= eics_tx;
> > > + }
> > > + /* Now we trigger the required irqs for Rx and Tx */
> > > + if ((trigger_irq_rx) || (trigger_irq_tx)) {
> > > if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > - eics |= ring->q_vector->eims_value;
> > > wr32(E1000_EICS, eics);
> > > } else {
> > > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > + if ((trigger_irq_rx) && (trigger_irq_tx))
> > > + wr32(E1000_ICS,
> > > + E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> > > + else if (trigger_irq_rx)
> > > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > + else
> > > + wr32(E1000_ICS, E1000_ICS_TXDW);
> >
> > My understanding is something below would be sufficient. Bits set on E1000_ICS are
> > not handled in any way so we don't have to distinguish between rx/tx, it's just the
> > matter of irq trigger and napi schedule.
> >
> Hi see my comments below
> > -----------------8<-----------------
> >
> > diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > index 30ce5fbb5b77..0aba7afd6a03 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > @@ -524,12 +524,26 @@ bool igb_xmit_zc(struct igb_ring *tx_ring, struct
> > xsk_buff_pool *xsk_pool)
> > return nb_pkts < budget;
> > }
> >
> > +static void igb_sw_irq(struct igb_q_vector *q_vector) {
> > + u32 eics = 0;
> > +
> > + if (!napi_if_scheduled_mark_missed(&q_vector->napi)) {
> > + /* Cause software interrupt */
> > + if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > + eics |= ring->q_vector->eims_value;
> > + wr32(E1000_EICS, eics);
> > + } else {
> > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> So here it is sufficient to rely on the E1000_ICS_RXDMT0 bit to trigger the correct irq (Tx and Rx)?
> I remember I received a review comment from Intel point to E1000_ICS_TXDW as being the correct bit of triggering TX for non MSIX case.
> I can't really evaluate this since I don't have a setup to test this. But okay
I don't see in irq handlers that we do any specific handling for txdw vs
rxdmt0. It's rather a matter of getting an irq here.
> > + }
> > + }
> > +}
> > +
> > int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags) {
> > struct igb_adapter *adapter = netdev_priv(dev);
> > struct e1000_hw *hw = &adapter->hw;
> > struct igb_ring *ring;
> > - u32 eics = 0;
> >
> > if (test_bit(__IGB_DOWN, &adapter->state))
> > return -ENETDOWN;
> > @@ -548,14 +562,15 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32
> > flags)
> > if (!READ_ONCE(ring->xsk_pool))
> > return -EINVAL;
> >
> > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > - /* Cause software interrupt */
> > - if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > - eics |= ring->q_vector->eims_value;
> > - wr32(E1000_EICS, eics);
> > - } else {
> > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > - }
> > + if (flags & XDP_WAKEUP_TX)
> > + igb_sw_irq(ring->q_vector);
> > +
> > + if (flags & XDP_WAKEUP_RX) {
> > + ring = adapter->rx_ring[qid];
> > + /* for !IGB_FLAG_QUEUE_PAIRS, this will be NOP as NAPI has
> > + * been already marked with NAPIF_STATE_MISSED
> > + */
> I think you meant for the case IGB_FLAG_QUEUE_PAIRS. Since when the queue pairs are not active
> the Tx AND Rx queues don't share the same qvector and consequently not the same NAPI
yes, correct
> > + igb_sw_irq(ring->q_vector);
> Okay so you would be triggering soft irq's in two steps if both TX and RX flags are set.
> Honestly, I have tried to avoid doing this in my patch. Which is the reason why I wait to finish all the error checks,
> Napi updates before triggering the required irq vectors by writing to eics with a single write.
> But okay the other approach also works
>
> > }
> >
> > return 0;
> >
> > ----------------->8-----------------
> >
> > > }
> > > }
> > > -
> > > return 0;
> > > }
> > > --
> > > 2.34.1
> > >
> I think the strategy of triggering interrupts in one step after performing all the necessary checks is what might make this approach look complex.
> IMHO the one step strategy is better and more intuitive.
> Unfortunately, there isn't a reference here to go by since none of the xsk_wakeup hooks implemented in the kernel care about the flags
> I can submit a v6 of the patch based the one step approach with further simplifications. v6 would also include review suggestions I received for v5.
> Like this I can also submit the next version to the igc patch. It follows the same logic as the igb
> I have our regression tests with RTC testbench and our Siemens Profinet RT tester running with these patches with I210 and I226
>
> Alternatively, you could submit patches following the igb and igc following the two-step logic.
How about we meet the half way and something below? that would include
your request of having a single write to E1000_ICS.
diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
index 30ce5fbb5b77..432b4c7c1850 100644
--- a/drivers/net/ethernet/intel/igb/igb_xsk.c
+++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
@@ -524,6 +524,17 @@ bool igb_xmit_zc(struct igb_ring *tx_ring, struct xsk_buff_pool *xsk_pool)
return nb_pkts < budget;
}
+static u32 igb_sw_irq_prep(struct igb_q_vector *q_vector)
+{
+ u32 eics = 0;
+
+ if (!napi_if_scheduled_mark_missed(&q_vector->napi))
+ eics = adapter->flags & IGB_FLAG_HAS_MSIX ?
+ q_vector->eims_value : 1;
+
+ return eics;
+}
+
int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
{
struct igb_adapter *adapter = netdev_priv(dev);
@@ -548,14 +559,23 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
if (!READ_ONCE(ring->xsk_pool))
return -EINVAL;
- if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
- /* Cause software interrupt */
- if (adapter->flags & IGB_FLAG_HAS_MSIX) {
- eics |= ring->q_vector->eims_value;
+ if (flags & XDP_WAKEUP_TX)
+ eics |= igb_sw_irq_prep(ring->q_vector);
+
+ if (flags & XDP_WAKEUP_RX) {
+ ring = adapter->rx_ring[qid];
+ /* for IGB_FLAG_QUEUE_PAIRS, this will be NOP as NAPI has
+ * been already marked with NAPIF_STATE_MISSED
+ */
+ eics |= igb_sw_irq_prep(ring->q_vector);
+ }
+
+ /* Cause software interrupt */
+ if (eics) {
+ if (adapter->flags & IGB_FLAG_HAS_MSIX)
wr32(E1000_EICS, eics);
- } else {
+ else
wr32(E1000_ICS, E1000_ICS_RXDMT0);
- }
}
return 0;
>
> Regards
>
> Vivek
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-14 18:20 ` Maciej Fijalkowski
@ 2026-01-15 11:05 ` Behera, VIVEK via Intel-wired-lan
2026-01-15 19:53 ` Maciej Fijalkowski
0 siblings, 1 reply; 14+ messages in thread
From: Behera, VIVEK via Intel-wired-lan @ 2026-01-15 11:05 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: aleksandr.loktionov@intel.com, jacob.e.keller@intel.com,
anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com,
sriram.yagnaraman@ericsson.com, kurt@linutronix.de,
intel-wired-lan@lists.osuosl.org, magnus.karlsson@intel.com
Hi Maciej
> -----Original Message-----
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Sent: Wednesday, January 14, 2026 7:21 PM
> To: Behera, Vivek (DI FA DSP ICC PRC1) <vivek.behera@siemens.com>
> Cc: aleksandr.loktionov@intel.com; jacob.e.keller@intel.com;
> anthony.l.nguyen@intel.com; przemyslaw.kitszel@intel.com;
> sriram.yagnaraman@ericsson.com; kurt@linutronix.de; intel-wired-
> lan@lists.osuosl.org; magnus.karlsson@intel.com
> Subject: Re: AW: [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in
> igb_xsk_wakeup
>
> On Wed, Jan 14, 2026 at 08:19:37AM +0000, Behera, VIVEK wrote:
>
> (...)
>
> > > >
> > > > diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > index fa028928482f..9357564a2d58 100644
> > > > --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > @@ -443,6 +443,7 @@
> > > > #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> > > > #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min.
> threshold
> > > */
> > > > #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset
> Aserted */
> > > > +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc
> written
> > > back */
> > > >
> > > > /* Extended Interrupt Cause Set */
> > > > /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> > > > a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > index 30ce5fbb5b77..6e51b5b6f131 100644
> > > > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev,
> > > > u32 qid, u32
> > > flags)
> > > > struct igb_adapter *adapter = netdev_priv(dev);
> > > > struct e1000_hw *hw = &adapter->hw;
> > > > struct igb_ring *ring;
> > > > + struct igb_q_vector *q_vector;
> > > > + struct napi_struct *rx_napi;
> > > > + struct napi_struct *tx_napi;
> > > > + bool trigger_irq_tx = false;
> > > > + bool trigger_irq_rx = false;
> > > > + u32 eics_tx = 0;
> > > > + u32 eics_rx = 0;
> > > > u32 eics = 0;
> > > >
> > > > if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65 @@
> > > > int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> > > >
> > > > if (!igb_xdp_is_enabled(adapter))
> > > > return -EINVAL;
> > > > -
> > > > - if (qid >= adapter->num_tx_queues)
> > > > + /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
> > > > + if (qid >= adapter->num_rx_queues)
> > > > return -EINVAL;
> > > > -
> > > > - ring = adapter->tx_ring[qid];
> > > > -
> > > > - if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > > > - return -ENETDOWN;
> > > > -
> > > > - if (!READ_ONCE(ring->xsk_pool))
> > > > + /* Check if flags are valid */
> > > > + if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> > > > return -EINVAL;
> > > > -
> > > > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > > > - /* Cause software interrupt */
> > > > + if (flags & XDP_WAKEUP_RX) {
> > > > + /* IRQ trigger preparation for Rx */
> > > > + ring = adapter->rx_ring[qid];
> > > > + if (!READ_ONCE(ring->xsk_pool))
> > > > + return -ENXIO;
> > > > + q_vector = ring->q_vector;
> > > > + rx_napi = &q_vector->napi;
> > > > + /* Extend the BIT mask for eics */
> > > > + eics_rx = ring->q_vector->eims_value;
> > > > + trigger_irq_rx = true;
> > > > + }
> > > > + if (flags & XDP_WAKEUP_TX) {
> > > > + if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> > > > + /* In queue-pair mode, rx_ring and tx_ring share the same
> q_vector,
> > > > + * so a single IRQ trigger will wake both RX and TX processing
> > > > + */
> > > > + } else {
> > > > + /* IRQ trigger preparation for Tx */
> > > > + ring = adapter->tx_ring[qid];
> > > > + if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring-
> >flags))
> > > > + return -ENETDOWN;
> > > > +
> > > > + if (!READ_ONCE(ring->xsk_pool))
> > > > + return -ENXIO;
> > > > + q_vector = ring->q_vector;
> > > > + tx_napi = &q_vector->napi;
> > > > + /* Extend the BIT mask for eics */
> > > > + eics_tx = ring->q_vector->eims_value;
> > > > + trigger_irq_tx = true;
> > > > + }
> > > > + }
> > > > + /* All error checks are finished. Check and update napi states for rx and tx
> */
> > > > + if (trigger_irq_rx) {
> > > > + if (!napi_if_scheduled_mark_missed(rx_napi))
> > > > + eics |= eics_rx;
> > > > + }
> > > > + if (trigger_irq_tx) {
> > > > + if (!napi_if_scheduled_mark_missed(tx_napi))
> > > > + eics |= eics_tx;
> > > > + }
> > > > + /* Now we trigger the required irqs for Rx and Tx */
> > > > + if ((trigger_irq_rx) || (trigger_irq_tx)) {
> > > > if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > > - eics |= ring->q_vector->eims_value;
> > > > wr32(E1000_EICS, eics);
> > > > } else {
> > > > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > > + if ((trigger_irq_rx) && (trigger_irq_tx))
> > > > + wr32(E1000_ICS,
> > > > + E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> > > > + else if (trigger_irq_rx)
> > > > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > > + else
> > > > + wr32(E1000_ICS, E1000_ICS_TXDW);
> > >
> > > My understanding is something below would be sufficient. Bits set on
> > > E1000_ICS are not handled in any way so we don't have to distinguish
> > > between rx/tx, it's just the matter of irq trigger and napi schedule.
> > >
> > Hi see my comments below
> > > -----------------8<-----------------
> > >
> > > diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > index 30ce5fbb5b77..0aba7afd6a03 100644
> > > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > @@ -524,12 +524,26 @@ bool igb_xmit_zc(struct igb_ring *tx_ring,
> > > struct xsk_buff_pool *xsk_pool)
> > > return nb_pkts < budget;
> > > }
> > >
> > > +static void igb_sw_irq(struct igb_q_vector *q_vector) {
> > > + u32 eics = 0;
> > > +
> > > + if (!napi_if_scheduled_mark_missed(&q_vector->napi)) {
> > > + /* Cause software interrupt */
> > > + if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > + eics |= ring->q_vector->eims_value;
> > > + wr32(E1000_EICS, eics);
> > > + } else {
> > > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > So here it is sufficient to rely on the E1000_ICS_RXDMT0 bit to trigger the
> correct irq (Tx and Rx)?
> > I remember I received a review comment from Intel point to E1000_ICS_TXDW
> as being the correct bit of triggering TX for non MSIX case.
> > I can't really evaluate this since I don't have a setup to test this.
> > But okay
>
> I don't see in irq handlers that we do any specific handling for txdw vs rxdmt0. It's
> rather a matter of getting an irq here.
Yes amongst the interrupt cause checks implemented in the msi interrupt handler
there is no interest in either E1000_ICR_RXDMT0 E1000_ICR_TXDW as events.
All that matters in this in this case is a softirq trigerring napi_schedule . So we can stick
to E1000_ICS_RXDMT0 irrespective of the flag in the xsk wakeup function. Although
I must mention that all other usages of E1000_ICS_RXDMT0 in the kernel are in combination
with rx_ring so from the code perspective this usage looks strange to someone without deeper knowledge of the system.
>
> > > + }
> > > + }
> > > +}
> > > +
> > > int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags) {
> > > struct igb_adapter *adapter = netdev_priv(dev);
> > > struct e1000_hw *hw = &adapter->hw;
> > > struct igb_ring *ring;
> > > - u32 eics = 0;
> > >
> > > if (test_bit(__IGB_DOWN, &adapter->state))
> > > return -ENETDOWN;
> > > @@ -548,14 +562,15 @@ int igb_xsk_wakeup(struct net_device *dev, u32
> > > qid, u32
> > > flags)
> > > if (!READ_ONCE(ring->xsk_pool))
> > > return -EINVAL;
> > >
> > > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > > - /* Cause software interrupt */
> > > - if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > - eics |= ring->q_vector->eims_value;
> > > - wr32(E1000_EICS, eics);
> > > - } else {
> > > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > - }
> > > + if (flags & XDP_WAKEUP_TX)
> > > + igb_sw_irq(ring->q_vector);
> > > +
> > > + if (flags & XDP_WAKEUP_RX) {
> > > + ring = adapter->rx_ring[qid];
> > > + /* for !IGB_FLAG_QUEUE_PAIRS, this will be NOP as NAPI has
> > > + * been already marked with NAPIF_STATE_MISSED
> > > + */
> > I think you meant for the case IGB_FLAG_QUEUE_PAIRS. Since when the
> > queue pairs are not active the Tx AND Rx queues don't share the same
> > qvector and consequently not the same NAPI
>
> yes, correct
>
> > > + igb_sw_irq(ring->q_vector);
> > Okay so you would be triggering soft irq's in two steps if both TX and RX flags
> are set.
> > Honestly, I have tried to avoid doing this in my patch. Which is the
> > reason why I wait to finish all the error checks, Napi updates before triggering the
> required irq vectors by writing to eics with a single write.
> > But okay the other approach also works
>
>
>
> >
> > > }
> > >
> > > return 0;
> > >
> > > ----------------->8-----------------
> > >
> > > > }
> > > > }
> > > > -
> > > > return 0;
> > > > }
> > > > --
> > > > 2.34.1
> > > >
> > I think the strategy of triggering interrupts in one step after performing all the
> necessary checks is what might make this approach look complex.
> > IMHO the one step strategy is better and more intuitive.
> > Unfortunately, there isn't a reference here to go by since none of the
> xsk_wakeup hooks implemented in the kernel care about the flags
> > I can submit a v6 of the patch based the one step approach with further
> simplifications. v6 would also include review suggestions I received for v5.
> > Like this I can also submit the next version to the igc patch. It follows the same
> logic as the igb
> > I have our regression tests with RTC testbench and our Siemens Profinet RT
> tester running with these patches with I210 and I226
> >
> > Alternatively, you could submit patches following the igb and igc following the
> two-step logic.
>
> How about we meet the half way and something below? that would include
> your request of having a single write to E1000_ICS.
>
Yes it sounds reasonable. However I would like to make some observations
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c
> b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..432b4c7c1850 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -524,6 +524,17 @@ bool igb_xmit_zc(struct igb_ring *tx_ring, struct
> xsk_buff_pool *xsk_pool)
> return nb_pkts < budget;
> }
>
> +static u32 igb_sw_irq_prep(struct igb_q_vector *q_vector)
> +{
> + u32 eics = 0;
> +
> + if (!napi_if_scheduled_mark_missed(&q_vector->napi))
> + eics = adapter->flags & IGB_FLAG_HAS_MSIX ?
> + q_vector->eims_value : 1;
> +
> + return eics;
> +}
> +
> int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> {
> struct igb_adapter *adapter = netdev_priv(dev);
> @@ -548,14 +559,23 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32
> flags)
> if (!READ_ONCE(ring->xsk_pool))
> return -EINVAL;
If we want to implement proper usage of flags then I would move everything
related to a ring's internal checks inside the if case of the corresponding if case.
Use the correct adapter ring (rx or tx) to perform the checks.
Even though the functionality wise this code is correct why just randomly pick the tx ring
right at beginning of the function and do two checks with it? And same question would
pop up to the reviewers mind for the igc driver where rx_ring is used. For me correct
usage is more important than saving some lines of code in the patch
>
> - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> - /* Cause software interrupt */
> - if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> - eics |= ring->q_vector->eims_value;
> + if (flags & XDP_WAKEUP_TX)
> + eics |= igb_sw_irq_prep(ring->q_vector);
> +
> + if (flags & XDP_WAKEUP_RX) {
> + ring = adapter->rx_ring[qid];
> + /* for IGB_FLAG_QUEUE_PAIRS, this will be NOP as NAPI has
> + * been already marked with NAPIF_STATE_MISSED
> + */
> + eics |= igb_sw_irq_prep(ring->q_vector);
> + }
> +
> + /* Cause software interrupt */
> + if (eics) {
> + if (adapter->flags & IGB_FLAG_HAS_MSIX)
> wr32(E1000_EICS, eics);
> - } else {
> + else
> wr32(E1000_ICS, E1000_ICS_RXDMT0);
> - }
> }
>
> return 0;
>
> >
> > Regards
> >
> > Vivek
I would like to know your view about the next steps. Would you like me to include the changes we discussed in next version of the patch I submitted?
Or would you like to submit the patch you have prepared?
Regards
Vivek
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-15 11:05 ` Behera, VIVEK via Intel-wired-lan
@ 2026-01-15 19:53 ` Maciej Fijalkowski
2026-01-16 11:44 ` Behera, VIVEK via Intel-wired-lan
0 siblings, 1 reply; 14+ messages in thread
From: Maciej Fijalkowski @ 2026-01-15 19:53 UTC (permalink / raw)
To: Behera, VIVEK
Cc: aleksandr.loktionov@intel.com, jacob.e.keller@intel.com,
anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com,
sriram.yagnaraman@ericsson.com, kurt@linutronix.de,
intel-wired-lan@lists.osuosl.org, magnus.karlsson@intel.com
On Thu, Jan 15, 2026 at 11:05:37AM +0000, Behera, VIVEK wrote:
> Hi Maciej
>
> > -----Original Message-----
> > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > Sent: Wednesday, January 14, 2026 7:21 PM
> > To: Behera, Vivek (DI FA DSP ICC PRC1) <vivek.behera@siemens.com>
> > Cc: aleksandr.loktionov@intel.com; jacob.e.keller@intel.com;
> > anthony.l.nguyen@intel.com; przemyslaw.kitszel@intel.com;
> > sriram.yagnaraman@ericsson.com; kurt@linutronix.de; intel-wired-
> > lan@lists.osuosl.org; magnus.karlsson@intel.com
> > Subject: Re: AW: [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in
> > igb_xsk_wakeup
> >
> > On Wed, Jan 14, 2026 at 08:19:37AM +0000, Behera, VIVEK wrote:
> >
> > (...)
> >
> > > > >
> > > > > diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > > b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > > index fa028928482f..9357564a2d58 100644
> > > > > --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > > +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > > @@ -443,6 +443,7 @@
> > > > > #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change */
> > > > > #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min.
> > threshold
> > > > */
> > > > > #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset
> > Aserted */
> > > > > +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc
> > written
> > > > back */
> > > > >
> > > > > /* Extended Interrupt Cause Set */
> > > > > /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff --git
> > > > > a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > index 30ce5fbb5b77..6e51b5b6f131 100644
> > > > > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device *dev,
> > > > > u32 qid, u32
> > > > flags)
> > > > > struct igb_adapter *adapter = netdev_priv(dev);
> > > > > struct e1000_hw *hw = &adapter->hw;
> > > > > struct igb_ring *ring;
> > > > > + struct igb_q_vector *q_vector;
> > > > > + struct napi_struct *rx_napi;
> > > > > + struct napi_struct *tx_napi;
> > > > > + bool trigger_irq_tx = false;
> > > > > + bool trigger_irq_rx = false;
> > > > > + u32 eics_tx = 0;
> > > > > + u32 eics_rx = 0;
> > > > > u32 eics = 0;
> > > > >
> > > > > if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27 +543,65 @@
> > > > > int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> > > > >
> > > > > if (!igb_xdp_is_enabled(adapter))
> > > > > return -EINVAL;
> > > > > -
> > > > > - if (qid >= adapter->num_tx_queues)
> > > > > + /* Check if queue_id is valid. Tx and Rx queue numbers are always same */
> > > > > + if (qid >= adapter->num_rx_queues)
> > > > > return -EINVAL;
> > > > > -
> > > > > - ring = adapter->tx_ring[qid];
> > > > > -
> > > > > - if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > > > > - return -ENETDOWN;
> > > > > -
> > > > > - if (!READ_ONCE(ring->xsk_pool))
> > > > > + /* Check if flags are valid */
> > > > > + if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> > > > > return -EINVAL;
> > > > > -
> > > > > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > > > > - /* Cause software interrupt */
> > > > > + if (flags & XDP_WAKEUP_RX) {
> > > > > + /* IRQ trigger preparation for Rx */
> > > > > + ring = adapter->rx_ring[qid];
> > > > > + if (!READ_ONCE(ring->xsk_pool))
> > > > > + return -ENXIO;
> > > > > + q_vector = ring->q_vector;
> > > > > + rx_napi = &q_vector->napi;
> > > > > + /* Extend the BIT mask for eics */
> > > > > + eics_rx = ring->q_vector->eims_value;
> > > > > + trigger_irq_rx = true;
> > > > > + }
> > > > > + if (flags & XDP_WAKEUP_TX) {
> > > > > + if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> > > > > + /* In queue-pair mode, rx_ring and tx_ring share the same
> > q_vector,
> > > > > + * so a single IRQ trigger will wake both RX and TX processing
> > > > > + */
> > > > > + } else {
> > > > > + /* IRQ trigger preparation for Tx */
> > > > > + ring = adapter->tx_ring[qid];
> > > > > + if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring-
> > >flags))
> > > > > + return -ENETDOWN;
> > > > > +
> > > > > + if (!READ_ONCE(ring->xsk_pool))
> > > > > + return -ENXIO;
> > > > > + q_vector = ring->q_vector;
> > > > > + tx_napi = &q_vector->napi;
> > > > > + /* Extend the BIT mask for eics */
> > > > > + eics_tx = ring->q_vector->eims_value;
> > > > > + trigger_irq_tx = true;
> > > > > + }
> > > > > + }
> > > > > + /* All error checks are finished. Check and update napi states for rx and tx
> > */
> > > > > + if (trigger_irq_rx) {
> > > > > + if (!napi_if_scheduled_mark_missed(rx_napi))
> > > > > + eics |= eics_rx;
> > > > > + }
> > > > > + if (trigger_irq_tx) {
> > > > > + if (!napi_if_scheduled_mark_missed(tx_napi))
> > > > > + eics |= eics_tx;
> > > > > + }
> > > > > + /* Now we trigger the required irqs for Rx and Tx */
> > > > > + if ((trigger_irq_rx) || (trigger_irq_tx)) {
> > > > > if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > > > - eics |= ring->q_vector->eims_value;
> > > > > wr32(E1000_EICS, eics);
> > > > > } else {
> > > > > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > > > + if ((trigger_irq_rx) && (trigger_irq_tx))
> > > > > + wr32(E1000_ICS,
> > > > > + E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> > > > > + else if (trigger_irq_rx)
> > > > > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > > > + else
> > > > > + wr32(E1000_ICS, E1000_ICS_TXDW);
> > > >
> > > > My understanding is something below would be sufficient. Bits set on
> > > > E1000_ICS are not handled in any way so we don't have to distinguish
> > > > between rx/tx, it's just the matter of irq trigger and napi schedule.
> > > >
> > > Hi see my comments below
> > > > -----------------8<-----------------
> > > >
> > > > diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > index 30ce5fbb5b77..0aba7afd6a03 100644
> > > > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > @@ -524,12 +524,26 @@ bool igb_xmit_zc(struct igb_ring *tx_ring,
> > > > struct xsk_buff_pool *xsk_pool)
> > > > return nb_pkts < budget;
> > > > }
> > > >
> > > > +static void igb_sw_irq(struct igb_q_vector *q_vector) {
> > > > + u32 eics = 0;
> > > > +
> > > > + if (!napi_if_scheduled_mark_missed(&q_vector->napi)) {
> > > > + /* Cause software interrupt */
> > > > + if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > > + eics |= ring->q_vector->eims_value;
> > > > + wr32(E1000_EICS, eics);
> > > > + } else {
> > > > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > So here it is sufficient to rely on the E1000_ICS_RXDMT0 bit to trigger the
> > correct irq (Tx and Rx)?
> > > I remember I received a review comment from Intel point to E1000_ICS_TXDW
> > as being the correct bit of triggering TX for non MSIX case.
> > > I can't really evaluate this since I don't have a setup to test this.
> > > But okay
> >
> > I don't see in irq handlers that we do any specific handling for txdw vs rxdmt0. It's
> > rather a matter of getting an irq here.
> Yes amongst the interrupt cause checks implemented in the msi interrupt handler
> there is no interest in either E1000_ICR_RXDMT0 E1000_ICR_TXDW as events.
> All that matters in this in this case is a softirq trigerring napi_schedule . So we can stick
> to E1000_ICS_RXDMT0 irrespective of the flag in the xsk wakeup function. Although
> I must mention that all other usages of E1000_ICS_RXDMT0 in the kernel are in combination
> with rx_ring so from the code perspective this usage looks strange to someone without deeper knowledge of the system.
> >
> > > > + }
> > > > + }
> > > > +}
> > > > +
> > > > int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags) {
> > > > struct igb_adapter *adapter = netdev_priv(dev);
> > > > struct e1000_hw *hw = &adapter->hw;
> > > > struct igb_ring *ring;
> > > > - u32 eics = 0;
> > > >
> > > > if (test_bit(__IGB_DOWN, &adapter->state))
> > > > return -ENETDOWN;
> > > > @@ -548,14 +562,15 @@ int igb_xsk_wakeup(struct net_device *dev, u32
> > > > qid, u32
> > > > flags)
> > > > if (!READ_ONCE(ring->xsk_pool))
> > > > return -EINVAL;
> > > >
> > > > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > > > - /* Cause software interrupt */
> > > > - if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > > - eics |= ring->q_vector->eims_value;
> > > > - wr32(E1000_EICS, eics);
> > > > - } else {
> > > > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > > - }
> > > > + if (flags & XDP_WAKEUP_TX)
> > > > + igb_sw_irq(ring->q_vector);
> > > > +
> > > > + if (flags & XDP_WAKEUP_RX) {
> > > > + ring = adapter->rx_ring[qid];
> > > > + /* for !IGB_FLAG_QUEUE_PAIRS, this will be NOP as NAPI has
> > > > + * been already marked with NAPIF_STATE_MISSED
> > > > + */
> > > I think you meant for the case IGB_FLAG_QUEUE_PAIRS. Since when the
> > > queue pairs are not active the Tx AND Rx queues don't share the same
> > > qvector and consequently not the same NAPI
> >
> > yes, correct
> >
> > > > + igb_sw_irq(ring->q_vector);
> > > Okay so you would be triggering soft irq's in two steps if both TX and RX flags
> > are set.
> > > Honestly, I have tried to avoid doing this in my patch. Which is the
> > > reason why I wait to finish all the error checks, Napi updates before triggering the
> > required irq vectors by writing to eics with a single write.
> > > But okay the other approach also works
> >
> >
> >
> > >
> > > > }
> > > >
> > > > return 0;
> > > >
> > > > ----------------->8-----------------
> > > >
> > > > > }
> > > > > }
> > > > > -
> > > > > return 0;
> > > > > }
> > > > > --
> > > > > 2.34.1
> > > > >
> > > I think the strategy of triggering interrupts in one step after performing all the
> > necessary checks is what might make this approach look complex.
> > > IMHO the one step strategy is better and more intuitive.
> > > Unfortunately, there isn't a reference here to go by since none of the
> > xsk_wakeup hooks implemented in the kernel care about the flags
> > > I can submit a v6 of the patch based the one step approach with further
> > simplifications. v6 would also include review suggestions I received for v5.
> > > Like this I can also submit the next version to the igc patch. It follows the same
> > logic as the igb
> > > I have our regression tests with RTC testbench and our Siemens Profinet RT
> > tester running with these patches with I210 and I226
> > >
> > > Alternatively, you could submit patches following the igb and igc following the
> > two-step logic.
> >
> > How about we meet the half way and something below? that would include
> > your request of having a single write to E1000_ICS.
> >
> Yes it sounds reasonable. However I would like to make some observations
> >
> > diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > index 30ce5fbb5b77..432b4c7c1850 100644
> > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > @@ -524,6 +524,17 @@ bool igb_xmit_zc(struct igb_ring *tx_ring, struct
> > xsk_buff_pool *xsk_pool)
> > return nb_pkts < budget;
> > }
> >
> > +static u32 igb_sw_irq_prep(struct igb_q_vector *q_vector)
> > +{
> > + u32 eics = 0;
> > +
> > + if (!napi_if_scheduled_mark_missed(&q_vector->napi))
> > + eics = adapter->flags & IGB_FLAG_HAS_MSIX ?
> > + q_vector->eims_value : 1;
> > +
> > + return eics;
> > +}
> > +
> > int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
> > {
> > struct igb_adapter *adapter = netdev_priv(dev);
> > @@ -548,14 +559,23 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32
> > flags)
> > if (!READ_ONCE(ring->xsk_pool))
> > return -EINVAL;
> If we want to implement proper usage of flags then I would move everything
> related to a ring's internal checks inside the if case of the corresponding if case.
> Use the correct adapter ring (rx or tx) to perform the checks.
> Even though the functionality wise this code is correct why just randomly pick the tx ring
> right at beginning of the function and do two checks with it? And same question would
> pop up to the reviewers mind for the igc driver where rx_ring is used. For me correct
> usage is more important than saving some lines of code in the patch
And for me code that will be easy to maintain is important.
We could move IGB_RING_FLAG_TX_DISABLED to XDP_WAKEUP_TX branch but rest
is generic. AF_XDP works on pairs of queues so pool is loaded on both tx
and rx pointed by given queue id. IOW xsk_pool will be present on both rx
and tx rings, queue id validation can stay as is as well as igb works on
'combined' channels.
FWIW I had some PoC in the past where I implemented in AF_XDP core support
for async sockets - where pool was loaded *only* on rx or tx side. Not
sure if TSN workloads would benefit from that?
> >
> > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > - /* Cause software interrupt */
> > - if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > - eics |= ring->q_vector->eims_value;
> > + if (flags & XDP_WAKEUP_TX)
> > + eics |= igb_sw_irq_prep(ring->q_vector);
> > +
> > + if (flags & XDP_WAKEUP_RX) {
> > + ring = adapter->rx_ring[qid];
> > + /* for IGB_FLAG_QUEUE_PAIRS, this will be NOP as NAPI has
> > + * been already marked with NAPIF_STATE_MISSED
> > + */
> > + eics |= igb_sw_irq_prep(ring->q_vector);
> > + }
> > +
> > + /* Cause software interrupt */
> > + if (eics) {
> > + if (adapter->flags & IGB_FLAG_HAS_MSIX)
> > wr32(E1000_EICS, eics);
> > - } else {
> > + else
> > wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > - }
> > }
> >
> > return 0;
> >
> > >
> > > Regards
> > >
> > > Vivek
>
> I would like to know your view about the next steps. Would you like me
> to include the changes we discussed in next version of the patch I
> submitted?
> Or would you like to submit the patch you have prepared?
Hmm...up to you. You can take my suggestions and add some tag, such as
Suggested-by.
Thanks,
Maciej
>
> Regards
>
> Vivek
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
2026-01-15 19:53 ` Maciej Fijalkowski
@ 2026-01-16 11:44 ` Behera, VIVEK via Intel-wired-lan
0 siblings, 0 replies; 14+ messages in thread
From: Behera, VIVEK via Intel-wired-lan @ 2026-01-16 11:44 UTC (permalink / raw)
To: Maciej Fijalkowski
Cc: aleksandr.loktionov@intel.com, jacob.e.keller@intel.com,
anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com,
sriram.yagnaraman@ericsson.com, kurt@linutronix.de,
intel-wired-lan@lists.osuosl.org, magnus.karlsson@intel.com
> -----Original Message-----
> From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> Sent: Thursday, January 15, 2026 8:53 PM
> To: Behera, Vivek (DI FA DSP ICC PRC1) <vivek.behera@siemens.com>
> Cc: aleksandr.loktionov@intel.com; jacob.e.keller@intel.com;
> anthony.l.nguyen@intel.com; przemyslaw.kitszel@intel.com;
> sriram.yagnaraman@ericsson.com; kurt@linutronix.de; intel-wired-
> lan@lists.osuosl.org; magnus.karlsson@intel.com
> Subject: Re: AW: [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in
> igb_xsk_wakeup
>
> On Thu, Jan 15, 2026 at 11:05:37AM +0000, Behera, VIVEK wrote:
> > Hi Maciej
> >
> > > -----Original Message-----
> > > From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
> > > Sent: Wednesday, January 14, 2026 7:21 PM
> > > To: Behera, Vivek (DI FA DSP ICC PRC1) <vivek.behera@siemens.com>
> > > Cc: aleksandr.loktionov@intel.com; jacob.e.keller@intel.com;
> > > anthony.l.nguyen@intel.com; przemyslaw.kitszel@intel.com;
> > > sriram.yagnaraman@ericsson.com; kurt@linutronix.de; intel-wired-
> > > lan@lists.osuosl.org; magnus.karlsson@intel.com
> > > Subject: Re: AW: [PATCH iwl-net v5] igb: Fix trigger of incorrect
> > > irq in igb_xsk_wakeup
> > >
> > > On Wed, Jan 14, 2026 at 08:19:37AM +0000, Behera, VIVEK wrote:
> > >
> > > (...)
> > >
> > > > > >
> > > > > > diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > > > b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > > > index fa028928482f..9357564a2d58 100644
> > > > > > --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > > > +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> > > > > > @@ -443,6 +443,7 @@
> > > > > > #define E1000_ICS_LSC E1000_ICR_LSC /* Link Status Change
> */
> > > > > > #define E1000_ICS_RXDMT0 E1000_ICR_RXDMT0 /* rx desc min.
> > > threshold
> > > > > */
> > > > > > #define E1000_ICS_DRSTA E1000_ICR_DRSTA /* Device Reset
> > > Aserted */
> > > > > > +#define E1000_ICS_TXDW E1000_ICR_TXDW /* Transmit desc
> > > written
> > > > > back */
> > > > > >
> > > > > > /* Extended Interrupt Cause Set */
> > > > > > /* E1000_EITR_CNT_IGNR is only for 82576 and newer */ diff
> > > > > > --git a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > > index 30ce5fbb5b77..6e51b5b6f131 100644
> > > > > > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > > @@ -529,6 +529,13 @@ int igb_xsk_wakeup(struct net_device
> > > > > > *dev,
> > > > > > u32 qid, u32
> > > > > flags)
> > > > > > struct igb_adapter *adapter = netdev_priv(dev);
> > > > > > struct e1000_hw *hw = &adapter->hw;
> > > > > > struct igb_ring *ring;
> > > > > > + struct igb_q_vector *q_vector;
> > > > > > + struct napi_struct *rx_napi;
> > > > > > + struct napi_struct *tx_napi;
> > > > > > + bool trigger_irq_tx = false;
> > > > > > + bool trigger_irq_rx = false;
> > > > > > + u32 eics_tx = 0;
> > > > > > + u32 eics_rx = 0;
> > > > > > u32 eics = 0;
> > > > > >
> > > > > > if (test_bit(__IGB_DOWN, &adapter->state)) @@ -536,27
> > > > > > +543,65 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid,
> > > > > > u32 flags)
> > > > > >
> > > > > > if (!igb_xdp_is_enabled(adapter))
> > > > > > return -EINVAL;
> > > > > > -
> > > > > > - if (qid >= adapter->num_tx_queues)
> > > > > > + /* Check if queue_id is valid. Tx and Rx queue numbers are always
> same */
> > > > > > + if (qid >= adapter->num_rx_queues)
> > > > > > return -EINVAL;
> > > > > > -
> > > > > > - ring = adapter->tx_ring[qid];
> > > > > > -
> > > > > > - if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> > > > > > - return -ENETDOWN;
> > > > > > -
> > > > > > - if (!READ_ONCE(ring->xsk_pool))
> > > > > > + /* Check if flags are valid */
> > > > > > + if (!(flags & XDP_WAKEUP_RX) && !(flags & XDP_WAKEUP_TX))
> > > > > > return -EINVAL;
> > > > > > -
> > > > > > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > > > > > - /* Cause software interrupt */
> > > > > > + if (flags & XDP_WAKEUP_RX) {
> > > > > > + /* IRQ trigger preparation for Rx */
> > > > > > + ring = adapter->rx_ring[qid];
> > > > > > + if (!READ_ONCE(ring->xsk_pool))
> > > > > > + return -ENXIO;
> > > > > > + q_vector = ring->q_vector;
> > > > > > + rx_napi = &q_vector->napi;
> > > > > > + /* Extend the BIT mask for eics */
> > > > > > + eics_rx = ring->q_vector->eims_value;
> > > > > > + trigger_irq_rx = true;
> > > > > > + }
> > > > > > + if (flags & XDP_WAKEUP_TX) {
> > > > > > + if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> > > > > > + /* In queue-pair mode, rx_ring and tx_ring share the same
> > > q_vector,
> > > > > > + * so a single IRQ trigger will wake both RX and TX
> processing
> > > > > > + */
> > > > > > + } else {
> > > > > > + /* IRQ trigger preparation for Tx */
> > > > > > + ring = adapter->tx_ring[qid];
> > > > > > + if (test_bit(IGB_RING_FLAG_TX_DISABLED,
> &ring-
> > > >flags))
> > > > > > + return -ENETDOWN;
> > > > > > +
> > > > > > + if (!READ_ONCE(ring->xsk_pool))
> > > > > > + return -ENXIO;
> > > > > > + q_vector = ring->q_vector;
> > > > > > + tx_napi = &q_vector->napi;
> > > > > > + /* Extend the BIT mask for eics */
> > > > > > + eics_tx = ring->q_vector->eims_value;
> > > > > > + trigger_irq_tx = true;
> > > > > > + }
> > > > > > + }
> > > > > > + /* All error checks are finished. Check and update napi
> > > > > > +states for rx and tx
> > > */
> > > > > > + if (trigger_irq_rx) {
> > > > > > + if (!napi_if_scheduled_mark_missed(rx_napi))
> > > > > > + eics |= eics_rx;
> > > > > > + }
> > > > > > + if (trigger_irq_tx) {
> > > > > > + if (!napi_if_scheduled_mark_missed(tx_napi))
> > > > > > + eics |= eics_tx;
> > > > > > + }
> > > > > > + /* Now we trigger the required irqs for Rx and Tx */
> > > > > > + if ((trigger_irq_rx) || (trigger_irq_tx)) {
> > > > > > if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > > > > - eics |= ring->q_vector->eims_value;
> > > > > > wr32(E1000_EICS, eics);
> > > > > > } else {
> > > > > > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > > > > + if ((trigger_irq_rx) && (trigger_irq_tx))
> > > > > > + wr32(E1000_ICS,
> > > > > > + E1000_ICS_RXDMT0 |
> E1000_ICS_TXDW);
> > > > > > + else if (trigger_irq_rx)
> > > > > > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > > > > + else
> > > > > > + wr32(E1000_ICS, E1000_ICS_TXDW);
> > > > >
> > > > > My understanding is something below would be sufficient. Bits
> > > > > set on E1000_ICS are not handled in any way so we don't have to
> > > > > distinguish between rx/tx, it's just the matter of irq trigger and napi
> schedule.
> > > > >
> > > > Hi see my comments below
> > > > > -----------------8<-----------------
> > > > >
> > > > > diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > index 30ce5fbb5b77..0aba7afd6a03 100644
> > > > > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > > > @@ -524,12 +524,26 @@ bool igb_xmit_zc(struct igb_ring *tx_ring,
> > > > > struct xsk_buff_pool *xsk_pool)
> > > > > return nb_pkts < budget;
> > > > > }
> > > > >
> > > > > +static void igb_sw_irq(struct igb_q_vector *q_vector) {
> > > > > + u32 eics = 0;
> > > > > +
> > > > > + if (!napi_if_scheduled_mark_missed(&q_vector->napi)) {
> > > > > + /* Cause software interrupt */
> > > > > + if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > > > + eics |= ring->q_vector->eims_value;
> > > > > + wr32(E1000_EICS, eics);
> > > > > + } else {
> > > > > + wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > > So here it is sufficient to rely on the E1000_ICS_RXDMT0 bit to
> > > > trigger the
> > > correct irq (Tx and Rx)?
> > > > I remember I received a review comment from Intel point to
> > > > E1000_ICS_TXDW
> > > as being the correct bit of triggering TX for non MSIX case.
> > > > I can't really evaluate this since I don't have a setup to test this.
> > > > But okay
> > >
> > > I don't see in irq handlers that we do any specific handling for
> > > txdw vs rxdmt0. It's rather a matter of getting an irq here.
> > Yes amongst the interrupt cause checks implemented in the msi
> > interrupt handler there is no interest in either E1000_ICR_RXDMT0
> E1000_ICR_TXDW as events.
> > All that matters in this in this case is a softirq trigerring
> > napi_schedule . So we can stick to E1000_ICS_RXDMT0 irrespective of
> > the flag in the xsk wakeup function. Although I must mention that all
> > other usages of E1000_ICS_RXDMT0 in the kernel are in combination with
> rx_ring so from the code perspective this usage looks strange to someone without
> deeper knowledge of the system.
> > >
> > > > > + }
> > > > > + }
> > > > > +}
> > > > > +
> > > > > int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags) {
> > > > > struct igb_adapter *adapter = netdev_priv(dev);
> > > > > struct e1000_hw *hw = &adapter->hw;
> > > > > struct igb_ring *ring;
> > > > > - u32 eics = 0;
> > > > >
> > > > > if (test_bit(__IGB_DOWN, &adapter->state))
> > > > > return -ENETDOWN;
> > > > > @@ -548,14 +562,15 @@ int igb_xsk_wakeup(struct net_device *dev,
> > > > > u32 qid, u32
> > > > > flags)
> > > > > if (!READ_ONCE(ring->xsk_pool))
> > > > > return -EINVAL;
> > > > >
> > > > > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > > > > - /* Cause software interrupt */
> > > > > - if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > > > - eics |= ring->q_vector->eims_value;
> > > > > - wr32(E1000_EICS, eics);
> > > > > - } else {
> > > > > - wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > > > - }
> > > > > + if (flags & XDP_WAKEUP_TX)
> > > > > + igb_sw_irq(ring->q_vector);
> > > > > +
> > > > > + if (flags & XDP_WAKEUP_RX) {
> > > > > + ring = adapter->rx_ring[qid];
> > > > > + /* for !IGB_FLAG_QUEUE_PAIRS, this will be NOP as
> NAPI has
> > > > > + * been already marked with NAPIF_STATE_MISSED
> > > > > + */
> > > > I think you meant for the case IGB_FLAG_QUEUE_PAIRS. Since when
> > > > the queue pairs are not active the Tx AND Rx queues don't share
> > > > the same qvector and consequently not the same NAPI
> > >
> > > yes, correct
> > >
> > > > > + igb_sw_irq(ring->q_vector);
> > > > Okay so you would be triggering soft irq's in two steps if both TX
> > > > and RX flags
> > > are set.
> > > > Honestly, I have tried to avoid doing this in my patch. Which is
> > > > the reason why I wait to finish all the error checks, Napi updates
> > > > before triggering the
> > > required irq vectors by writing to eics with a single write.
> > > > But okay the other approach also works
> > >
> > >
> > >
> > > >
> > > > > }
> > > > >
> > > > > return 0;
> > > > >
> > > > > ----------------->8-----------------
> > > > >
> > > > > > }
> > > > > > }
> > > > > > -
> > > > > > return 0;
> > > > > > }
> > > > > > --
> > > > > > 2.34.1
> > > > > >
> > > > I think the strategy of triggering interrupts in one step after
> > > > performing all the
> > > necessary checks is what might make this approach look complex.
> > > > IMHO the one step strategy is better and more intuitive.
> > > > Unfortunately, there isn't a reference here to go by since none of
> > > > the
> > > xsk_wakeup hooks implemented in the kernel care about the flags
> > > > I can submit a v6 of the patch based the one step approach with
> > > > further
> > > simplifications. v6 would also include review suggestions I received for v5.
> > > > Like this I can also submit the next version to the igc patch. It
> > > > follows the same
> > > logic as the igb
> > > > I have our regression tests with RTC testbench and our Siemens
> > > > Profinet RT
> > > tester running with these patches with I210 and I226
> > > >
> > > > Alternatively, you could submit patches following the igb and igc
> > > > following the
> > > two-step logic.
> > >
> > > How about we meet the half way and something below? that would
> > > include your request of having a single write to E1000_ICS.
> > >
> > Yes it sounds reasonable. However I would like to make some
> > observations
> > >
> > > diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > index 30ce5fbb5b77..432b4c7c1850 100644
> > > --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> > > @@ -524,6 +524,17 @@ bool igb_xmit_zc(struct igb_ring *tx_ring,
> > > struct xsk_buff_pool *xsk_pool)
> > > return nb_pkts < budget;
> > > }
> > >
> > > +static u32 igb_sw_irq_prep(struct igb_q_vector *q_vector) {
> > > + u32 eics = 0;
> > > +
> > > + if (!napi_if_scheduled_mark_missed(&q_vector->napi))
> > > + eics = adapter->flags & IGB_FLAG_HAS_MSIX ?
> > > + q_vector->eims_value : 1;
> > > +
> > > + return eics;
> > > +}
> > > +
> > > int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags) {
> > > struct igb_adapter *adapter = netdev_priv(dev); @@ -548,14 +559,23
> > > @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32
> > > flags)
> > > if (!READ_ONCE(ring->xsk_pool))
> > > return -EINVAL;
> > If we want to implement proper usage of flags then I would move
> > everything related to a ring's internal checks inside the if case of the
> corresponding if case.
> > Use the correct adapter ring (rx or tx) to perform the checks.
> > Even though the functionality wise this code is correct why just
> > randomly pick the tx ring right at beginning of the function and do
> > two checks with it? And same question would pop up to the reviewers
> > mind for the igc driver where rx_ring is used. For me correct usage is
> > more important than saving some lines of code in the patch
>
> And for me code that will be easy to maintain is important.
>
> We could move IGB_RING_FLAG_TX_DISABLED to XDP_WAKEUP_TX branch
> but rest is generic. AF_XDP works on pairs of queues so pool is loaded on both
> tx and rx pointed by given queue id. IOW xsk_pool will be present on both rx and
> tx rings, queue id validation can stay as is as well as igb works on 'combined'
> channels.
Yes. There was never any confusion here that the xsk_buff_pool created by the xdp/xsk subsystem is for both tx and rx
and is coupled to the netdev+queue_id combination.
Also to your point regarding igb working only with combined queues is something
I have pointed out several times during the review of my patches and to make it more clear
I had in fact added a comment to my patch.
Despite the fact that usage of rx_ring in one function ( igb_xsk_pool_disable) and tx_ring here to get the xsk_pool is irritating
I can agree can keep the existing xsk pool's validity check.
>
> FWIW I had some PoC in the past where I implemented in AF_XDP core support
> for async sockets - where pool was loaded *only* on rx or tx side. Not sure if TSN
> workloads would benefit from that?
I remember this or a similar topic we in siemens discussed with Intel a few years ago which
Included new hooks in a netdev's xdp driver and extensions to libxdp .
I don't recall any discussion regarding plans for upstreaming this
In some TSN use-cases especially with extremely low jitter tolerances
a async mode with tx only traffic is indeed essential.
>
> > >
> > > - if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
> > > - /* Cause software interrupt */
> > > - if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> > > - eics |= ring->q_vector->eims_value;
> > > + if (flags & XDP_WAKEUP_TX)
> > > + eics |= igb_sw_irq_prep(ring->q_vector);
> > > +
> > > + if (flags & XDP_WAKEUP_RX) {
> > > + ring = adapter->rx_ring[qid];
> > > + /* for IGB_FLAG_QUEUE_PAIRS, this will be NOP as NAPI has
> > > + * been already marked with NAPIF_STATE_MISSED
> > > + */
> > > + eics |= igb_sw_irq_prep(ring->q_vector);
> > > + }
> > > +
> > > + /* Cause software interrupt */
> > > + if (eics) {
> > > + if (adapter->flags & IGB_FLAG_HAS_MSIX)
> > > wr32(E1000_EICS, eics);
> > > - } else {
> > > + else
> > > wr32(E1000_ICS, E1000_ICS_RXDMT0);
> > > - }
> > > }
> > >
> > > return 0;
> > >
> > > >
> > > > Regards
> > > >
> > > > Vivek
> >
> > I would like to know your view about the next steps. Would you like me
> > to include the changes we discussed in next version of the patch I
> > submitted?
> > Or would you like to submit the patch you have prepared?
>
> Hmm...up to you. You can take my suggestions and add some tag, such as
> Suggested-by.
Okay. I will prepare and send out the next version of the patch
Thanks for your suggestions
Vivek
>
> Thanks,
> Maciej
>
> >
> > Regards
> >
> > Vivek
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2026-01-16 11:44 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-12 13:03 [Intel-wired-lan] [PATCH iwl-net v5] igb: Fix trigger of incorrect irq in igb_xsk_wakeup Vivek Behera via Intel-wired-lan
2026-01-12 14:11 ` Paul Menzel
2026-01-13 6:58 ` Behera, VIVEK via Intel-wired-lan
2026-01-12 14:53 ` Kwapulinski, Piotr
2026-01-13 6:55 ` Behera, VIVEK via Intel-wired-lan
2026-01-13 4:04 ` Behera, VIVEK via Intel-wired-lan
2026-01-13 9:00 ` Loktionov, Aleksandr
2026-01-13 10:02 ` Behera, VIVEK via Intel-wired-lan
2026-01-13 11:41 ` Maciej Fijalkowski
2026-01-14 8:19 ` Behera, VIVEK via Intel-wired-lan
2026-01-14 18:20 ` Maciej Fijalkowski
2026-01-15 11:05 ` Behera, VIVEK via Intel-wired-lan
2026-01-15 19:53 ` Maciej Fijalkowski
2026-01-16 11:44 ` Behera, VIVEK via Intel-wired-lan
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.