Intel-Wired-Lan Archive on lore.kernel.org
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
To: Vivek Behera <vivek.behera@siemens.com>
Cc: <aleksandr.loktionov@intel.com>, <jacob.e.keller@intel.com>,
	<anthony.l.nguyen@intel.com>, <przemyslaw.kitszel@intel.com>,
	<sriram.yagnaraman@ericsson.com>, <kurt@linutronix.de>,
	<intel-wired-lan@lists.osuosl.org>
Subject: Re: [Intel-wired-lan] [PATCH iwl-net v4] igb: Fix trigger of incorrect irq in igb_xsk_wakeup
Date: Fri, 9 Jan 2026 01:14:56 +0100
Message-ID: <aWBIgOaRG50IuJsU@boxer>
In-Reply-To: <20251222115747.230521-1-vivek.behera@siemens.com>

On Mon, Dec 22, 2025 at 12:57:47PM +0100, Vivek Behera wrote:
> igb_xsk_wakeup() currently assumes that the Rx and Tx queues share the
> same irq, so in a split irq configuration it triggers the wrong irq.
> This patch fixes that. The issue can affect environments with 2 active
> cpu cores, or setups where the number of queues is reduced to 2 or fewer.
> 
> cat /proc/interrupts | grep eno2
>  167:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0   0-edge      eno2
>  168:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0   1-edge      eno2-rx-0
>  169:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0   2-edge      eno2-rx-1
>  170:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0   3-edge      eno2-tx-0
>  171:          0          0          0          0 IR-PCI-MSIX-0000:08:00.0   4-edge      eno2-tx-1
> 
> Furthermore it uses the flags input argument to trigger either rx, tx or
> both rx and tx irqs as specified in the ndo_xsk_wakeup api documentation
> 
> Fixes: 80f6ccf9f116 ("igb: Introduce XSK data structures and helpers")
> Signed-off-by: Vivek Behera <vivek.behera@siemens.com>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> ---
> v1: https://lore.kernel.org/intel-wired-lan/20251212131454.124116-1-vivek.behera@siemens.com/
> v2: https://lore.kernel.org/intel-wired-lan/20251215115416.410619-1-vivek.behera@siemens.com/
> v3: https://lore.kernel.org/intel-wired-lan/20251220114936.140473-1-vivek.behera@siemens.com/
> 
> changelog:
> v1
> - Initial description of the bug and the fixes made in the patch
> 
> v1 -> v2
> - Handling of RX and TX Wakeup in igc_xsk_wakeup for a split IRQ configuration
> - Review suggestions by Aleksandr: Modified sequence to complete all
>   error checks for rx and tx before updating napi states and triggering irqs
> - Corrected trigger of TX and RX interrupts over E1000_ICS (non msix use case)
> - Added define for Tx interrupt trigger bit mask for E1000_ICS
> 
> v2 -> v3
> - Included applicable feedback and suggestions from igc patch
> - Fixed logic in updating eics value when both TX and RX need wakeup
> 
> v3 -> v4
> - Added comments to explain triggering of both TX and RX with active queue pairs
> - Fixed check of xsk pools in if statement
> ---
>  .../net/ethernet/intel/igb/e1000_defines.h    |  1 +
>  drivers/net/ethernet/intel/igb/igb_xsk.c      | 90 +++++++++++++++++--
>  2 files changed, 83 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/igb/e1000_defines.h b/drivers/net/ethernet/intel/igb/e1000_defines.h
> index fa028928482f..9357564a2d58 100644
> --- a/drivers/net/ethernet/intel/igb/e1000_defines.h
> +++ b/drivers/net/ethernet/intel/igb/e1000_defines.h
> @@ -443,6 +443,7 @@
>  #define E1000_ICS_LSC       E1000_ICR_LSC       /* Link Status Change */
>  #define E1000_ICS_RXDMT0    E1000_ICR_RXDMT0    /* rx desc min. threshold */
>  #define E1000_ICS_DRSTA     E1000_ICR_DRSTA     /* Device Reset Aserted */
> +#define E1000_ICS_TXDW      E1000_ICR_TXDW	/* Transmit desc written back */
>  
>  /* Extended Interrupt Cause Set */
>  /* E1000_EITR_CNT_IGNR is only for 82576 and newer */
> diff --git a/drivers/net/ethernet/intel/igb/igb_xsk.c b/drivers/net/ethernet/intel/igb/igb_xsk.c
> index 30ce5fbb5b77..1d21674c0f33 100644
> --- a/drivers/net/ethernet/intel/igb/igb_xsk.c
> +++ b/drivers/net/ethernet/intel/igb/igb_xsk.c
> @@ -529,6 +529,7 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
>  	struct igb_adapter *adapter = netdev_priv(dev);
>  	struct e1000_hw *hw = &adapter->hw;
>  	struct igb_ring *ring;
> +	struct igb_q_vector *q_vector;
>  	u32 eics = 0;
>  
>  	if (test_bit(__IGB_DOWN, &adapter->state))
> @@ -536,14 +537,82 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
>  
>  	if (!igb_xdp_is_enabled(adapter))
>  		return -EINVAL;
> -
> -	if (qid >= adapter->num_tx_queues)
> +	/* Check if queue_id is valid. Tx and Rx queue numbers are always same */
> +	if (qid >= adapter->num_rx_queues)
>  		return -EINVAL;
>  
> -	ring = adapter->tx_ring[qid];
> -
> -	if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> -		return -ENETDOWN;
> +	if ((flags & XDP_WAKEUP_RX) && (flags & XDP_WAKEUP_TX)) {
> +		/* If both TX and RX need to be woken up check if queue pairs are active */
> +		if (adapter->flags & IGB_FLAG_QUEUE_PAIRS) {
> +			/* In queue-pair mode, rx_ring and tx_ring share the same q_vector,
> +			 * so a single IRQ trigger will wake both RX and TX processing
> +			 */
> +			ring = adapter->rx_ring[qid];
> +		} else {
> +			/* Two irqs for Rx AND Tx need to be triggered */
> +			struct napi_struct *rx_napi;
> +			struct napi_struct *tx_napi;
> +			bool trigger_irq_tx = false;
> +			bool trigger_irq_rx = false;
> +			u32 eics_tx = 0;
> +			u32 eics_rx = 0;
> +			/* IRQ trigger preparation for Rx */
> +			ring = adapter->rx_ring[qid];
> +			if (!READ_ONCE(ring->xsk_pool))
> +				return -ENXIO;
> +			q_vector = ring->q_vector;
> +			rx_napi  = &q_vector->napi;
> +			/* Extend the BIT mask for eics */
> +			eics_rx = ring->q_vector->eims_value;
> +
> +			/* IRQ trigger preparation for Tx */
> +			ring = adapter->tx_ring[qid];
> +			if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> +				return -ENETDOWN;
> +
> +			if (!READ_ONCE(ring->xsk_pool))
> +				return -ENXIO;
> +			q_vector = ring->q_vector;
> +			tx_napi  = &q_vector->napi;
> +			/* Extend the BIT mask for eics */
> +			eics_tx = ring->q_vector->eims_value;
> +
> +			/* Check and update napi states for rx and tx */
> +			if (!napi_if_scheduled_mark_missed(rx_napi)) {
> +				trigger_irq_rx = true;
> +				eics |= eics_rx;
> +			}
> +			if (!napi_if_scheduled_mark_missed(tx_napi)) {
> +				trigger_irq_tx = true;
> +				eics |= eics_tx;
> +			}
> +			/* Now we trigger the required irqs for Rx and Tx */
> +			if ((trigger_irq_rx) || (trigger_irq_tx)) {
> +				if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> +					wr32(E1000_EICS, eics);
> +				} else {
> +					if ((trigger_irq_rx) && (trigger_irq_tx))
> +						wr32(E1000_ICS, E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> +					else if (trigger_irq_rx)
> +						wr32(E1000_ICS, E1000_ICS_RXDMT0);
> +					else
> +						wr32(E1000_ICS, E1000_ICS_TXDW);
> +				}
> +			}
> +			return 0;
> +		}
> +	} else if (flags & XDP_WAKEUP_TX) {
> +		/* Get the ring params from Tx */
> +		ring = adapter->tx_ring[qid];
> +		if (test_bit(IGB_RING_FLAG_TX_DISABLED, &ring->flags))
> +			return -ENETDOWN;
> +	} else if (flags & XDP_WAKEUP_RX) {
> +		/* Get the ring params from Rx */
> +		ring = adapter->rx_ring[qid];
> +	} else {
> +		/* Invalid Flags */
> +		return -EINVAL;
> +	}

This is too complicated IMHO. Wouldn't something like this work:
- if wakeup rx, pick napi from rx ring's q_vector
  * apply the napi_if_scheduled_mark_missed() logic that exists in
    igc_xsk_wakeup()
- repeat for tx

With IGB_FLAG_QUEUE_PAIRS set, the rx and tx rings share one q_vector, so
the branch guarded by the second napi_if_scheduled_mark_missed() call would
not be executed, as we had previously marked the missed bit in napi state.
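
The flow suggested above can be modelled roughly as below. This is only a
userspace sketch, not driver code: the model_* types and functions and the
WAKEUP_RX/WAKEUP_TX macros are hypothetical stand-ins for the kernel's
napi_struct, igb_q_vector, and XDP_WAKEUP_RX/XDP_WAKEUP_TX. It assumes the
MSI-X case only and leaves out the qid, xsk_pool, and TX_DISABLED checks as
well as the non-MSI-X E1000_ICS path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define WAKEUP_RX (1u << 0) /* hypothetical stand-in for XDP_WAKEUP_RX */
#define WAKEUP_TX (1u << 1) /* hypothetical stand-in for XDP_WAKEUP_TX */

/* Simplified model of the NAPI state bits that matter here. */
struct model_napi {
	bool scheduled; /* a poll is already scheduled or running */
	bool missed;    /* the poll must run again before completing */
};

/* Simplified model of a q_vector: one NAPI context plus its EICS bit. */
struct model_q_vector {
	struct model_napi napi;
	uint32_t eims_value;
};

/*
 * Simplified model of napi_if_scheduled_mark_missed(): if NAPI is
 * scheduled, mark it missed and return true, otherwise return false.
 * The disabled-state handling of the real helper is omitted.
 */
static bool model_mark_missed(struct model_napi *napi)
{
	if (!napi->scheduled)
		return false;
	napi->missed = true;
	return true;
}

/*
 * Build the mask that would be written to E1000_EICS. Each direction is
 * handled on its own; a vector whose NAPI is already scheduled is only
 * marked missed and contributes no bit to the mask.
 */
static uint32_t model_wakeup_mask(struct model_q_vector *rx,
				  struct model_q_vector *tx,
				  uint32_t flags)
{
	uint32_t eics = 0;

	assert(rx && tx);

	if ((flags & WAKEUP_RX) && !model_mark_missed(&rx->napi))
		eics |= rx->eims_value;
	if ((flags & WAKEUP_TX) && !model_mark_missed(&tx->napi))
		eics |= tx->eims_value;

	return eics;
}
```

In this model, passing the same vector for rx and tx (the queue-pair case)
with both flags set ORs the same eims_value twice, so the mask still holds a
single bit. With split vectors it holds one bit per direction that actually
needs an interrupt.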

>  
>  	if (!READ_ONCE(ring->xsk_pool))
>  		return -EINVAL;
> @@ -551,10 +620,15 @@ int igb_xsk_wakeup(struct net_device *dev, u32 qid, u32 flags)
>  	if (!napi_if_scheduled_mark_missed(&ring->q_vector->napi)) {
>  		/* Cause software interrupt */
>  		if (adapter->flags & IGB_FLAG_HAS_MSIX) {
> -			eics |= ring->q_vector->eims_value;
> +			eics = ring->q_vector->eims_value;
>  			wr32(E1000_EICS, eics);
>  		} else {
> -			wr32(E1000_ICS, E1000_ICS_RXDMT0);
> +			if ((flags & XDP_WAKEUP_RX) && (flags & XDP_WAKEUP_TX))
> +				wr32(E1000_ICS, E1000_ICS_RXDMT0 | E1000_ICS_TXDW);
> +			else if (flags & XDP_WAKEUP_RX)
> +				wr32(E1000_ICS, E1000_ICS_RXDMT0);
> +			else
> +				wr32(E1000_ICS, E1000_ICS_TXDW);
>  		}
>  	}
>  
> -- 
> 2.34.1
> 

