public inbox for netdev@vger.kernel.org
From: Joe Damato <jdamato@fastly.com>
To: Ahmed Zaki <ahmed.zaki@intel.com>
Cc: netdev@vger.kernel.org, intel-wired-lan@lists.osuosl.org,
	andrew+netdev@lunn.ch, edumazet@google.com, kuba@kernel.org,
	horms@kernel.org, pabeni@redhat.com, davem@davemloft.net,
	michael.chan@broadcom.com, tariqt@nvidia.com,
	anthony.l.nguyen@intel.com, przemyslaw.kitszel@intel.com,
	shayd@nvidia.com, akpm@linux-foundation.org, shayagr@amazon.com,
	kalesh-anakkur.purayil@broadcom.com
Subject: Re: [PATCH net-next v6 2/5] net: napi: add CPU affinity to napi_config
Date: Thu, 23 Jan 2025 12:18:24 -0800
Message-ID: <Z5KkEF-2NiX4SuB_@LQ3V64L9R2>
In-Reply-To: <20250118003335.155379-3-ahmed.zaki@intel.com>

On Fri, Jan 17, 2025 at 05:33:32PM -0700, Ahmed Zaki wrote:
> A common task for most drivers is to remember the user-set CPU affinity
> to its IRQs. On each netdev reset, the driver should re-assign the
> user's settings to the IRQs.
> 
> Add CPU affinity mask to napi_config. To delegate the CPU affinity
> management to the core, drivers must:
>  1 - set the new netdev flag "irq_affinity_auto":
>                                        netif_enable_irq_affinity(netdev)
>  2 - create the napi with persistent config:
>                                        netif_napi_add_config()
>  3 - bind an IRQ to the napi instance: netif_napi_set_irq()
> 
> the core will then make sure to use re-assign affinity to the napi's
> IRQ.
> 
> The default IRQ mask is set to one cpu starting from the closest NUMA.

It might be helpful to add the above to
Documentation/networking/napi.rst?
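
For the docs, the three steps could be illustrated with something like
the rough userspace model below. This is only a sketch: every mock_*
name and type is made up for illustration and is not a kernel API, and
the logic is a simplified stand-in for what the quoted commit message
describes, not the actual core code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: simplified stand-ins, not the kernel structs. */
#define MOCK_NR_CFG 4

struct mock_napi_config {
	unsigned long affinity_mask;	/* stand-in for cpumask_t */
};

struct mock_netdev {
	bool irq_affinity_auto;
	struct mock_napi_config cfg[MOCK_NR_CFG];
};

struct mock_napi {
	struct mock_netdev *dev;
	struct mock_napi_config *config;
	int irq;
	bool notifier_set;
};

/* Step 1: models netif_enable_irq_affinity() from the patch. */
static void mock_enable_irq_affinity(struct mock_netdev *dev)
{
	dev->irq_affinity_auto = true;
}

/* Step 2: models netif_napi_add_config(), attaching persistent config. */
static void mock_napi_add_config(struct mock_netdev *dev,
				 struct mock_napi *napi, int index)
{
	assert(index >= 0 && index < MOCK_NR_CFG);
	napi->dev = dev;
	napi->config = &dev->cfg[index];
	napi->irq = -1;
	napi->notifier_set = false;
}

/* Step 3: models netif_napi_set_irq(). The notifier is only set up when
 * the flag from step 1 and the config from step 2 are both present.
 */
static void mock_napi_set_irq(struct mock_napi *napi, int irq)
{
	if (irq > 0 && napi->config && napi->dev->irq_affinity_auto)
		napi->notifier_set = true;
	napi->irq = irq;
}

/* Hypothetical driver open path running the three steps in order. */
static bool mock_driver_open(struct mock_netdev *dev,
			     struct mock_napi *napi, int irq)
{
	mock_enable_irq_affinity(dev);
	mock_napi_add_config(dev, napi, 0);
	mock_napi_set_irq(napi, irq);
	return napi->notifier_set;
}
```

The point of the model is the ordering: if step 1 or step 2 is skipped,
step 3 binds the IRQ but no affinity notifier is set up.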

> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
> ---
>  include/linux/netdevice.h | 14 ++++++++++-
>  net/core/dev.c            | 51 +++++++++++++++++++++++++++++----------
>  2 files changed, 51 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 98259f19c627..d576e5c91c43 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -351,6 +351,7 @@ struct napi_config {
>  	u64 gro_flush_timeout;
>  	u64 irq_suspend_timeout;
>  	u32 defer_hard_irqs;
> +	cpumask_t affinity_mask;
>  	unsigned int napi_id;
>  };
>  
> @@ -393,8 +394,8 @@ struct napi_struct {
>  	struct list_head	dev_list;
>  	struct hlist_node	napi_hash_node;
>  	int			irq;
> -#ifdef CONFIG_RFS_ACCEL
>  	struct irq_affinity_notify notify;
> +#ifdef CONFIG_RFS_ACCEL
>  	int			napi_rmap_idx;
>  #endif
>  	int			index;
> @@ -1991,6 +1992,11 @@ enum netdev_reg_state {
>   *
>   *	@threaded:	napi threaded mode is enabled
>   *
> + *	@irq_affinity_auto: driver wants the core to manage the IRQ affinity.
> + *			    Set by netif_enable_irq_affinity(), then driver must
> + *			    create persistent napi by netif_napi_add_config()
> + *			    and finally bind napi to IRQ (netif_napi_set_irq).
> + *
>   *	@rx_cpu_rmap_auto: driver wants the core to manage the ARFS rmap.
>   *	                   Set by calling netif_enable_cpu_rmap().
>   *
> @@ -2401,6 +2407,7 @@ struct net_device {
>  	struct lock_class_key	*qdisc_tx_busylock;
>  	bool			proto_down;
>  	bool			threaded;
> +	bool			irq_affinity_auto;
>  	bool			rx_cpu_rmap_auto;
>  
>  	/* priv_flags_slow, ungrouped to save space */
> @@ -2653,6 +2660,11 @@ static inline void netdev_set_ml_priv(struct net_device *dev,
>  	dev->ml_priv_type = type;
>  }
>  
> +static inline void netif_enable_irq_affinity(struct net_device *dev)
> +{
> +	dev->irq_affinity_auto = true;
> +}

I'll have to look at the patches which use the above function, but
the first thing that came to mind when I saw this was: does the above
need a WRITE_ONCE()?

The reads below seem to be protected by a lock; I haven't yet looked
at the other patches, so maybe the write is also protected by
netdev->lock?
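
To make the question concrete: if the write turns out not to be covered
by netdev->lock, the annotated pair would look roughly like the
userspace sketch below. The WRITE_ONCE/READ_ONCE macros here are
simplified approximations of the kernel ones (assuming GCC or Clang for
__typeof__), and struct mock_netdev and the mock_* helpers are
hypothetical stand-ins, not code from this series.

```c
#include <stdbool.h>

/* Simplified userspace approximations of the kernel macros. */
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-in for the one field of struct net_device we need. */
struct mock_netdev {
	bool irq_affinity_auto;
};

/* Writer side: what the helper would look like with the annotation. */
static void mock_enable_irq_affinity(struct mock_netdev *dev)
{
	WRITE_ONCE(dev->irq_affinity_auto, true);
}

/* Reader side: a lockless read paired with the annotated write. */
static bool mock_irq_affinity_enabled(struct mock_netdev *dev)
{
	return READ_ONCE(dev->irq_affinity_auto);
}
```

If both the write and all reads happen under the same lock, none of this
is needed and the plain assignment is fine.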

>  /*
>   * Net namespace inlines
>   */
> diff --git a/net/core/dev.c b/net/core/dev.c
> index dbb63005bc2b..bc82c7f621b3 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c

[...]

>  
> +#ifdef CONFIG_RFS_ACCEL
>  static void netif_napi_affinity_release(struct kref *ref)
>  {
>  	struct napi_struct *napi =
> @@ -6901,7 +6908,7 @@ static int napi_irq_cpu_rmap_add(struct napi_struct *napi, int irq)
>  	if (!rmap)
>  		return -EINVAL;
>  
> -	napi->notify.notify = netif_irq_cpu_rmap_notify;
> +	napi->notify.notify = netif_napi_irq_notify;

Same question as on the previous patch: does it make sense to only set
the notify and release callbacks once all other operations have
succeeded?

>  	napi->notify.release = netif_napi_affinity_release;
>  	cpu_rmap_get(rmap);
>  	rc = cpu_rmap_add(rmap, napi);

[...]

> @@ -6976,23 +6987,28 @@ void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
>  {
>  	int rc;
>  
> -	if (!napi->dev->rx_cpu_rmap_auto)
> -		goto out;
> -
> -	/* Remove existing rmap entries */
> -	if (napi->irq != irq && napi->irq > 0)
> +	/* Remove existing resources */
> +	if ((napi->dev->rx_cpu_rmap_auto || napi->dev->irq_affinity_auto) &&
> +	    napi->irq > 0 && napi->irq != irq)
>  		irq_set_affinity_notifier(napi->irq, NULL);
>  
> -	if (irq > 0) {
> +	if (irq > 0 && napi->dev->rx_cpu_rmap_auto) {
>  		rc = napi_irq_cpu_rmap_add(napi, irq);
>  		if (rc) {
>  			netdev_warn(napi->dev, "Unable to update ARFS map (%d)\n",
>  				    rc);
>  			netif_disable_cpu_rmap(napi->dev);
>  		}
> +	} else if (irq > 0 && napi->config && napi->dev->irq_affinity_auto) {
> +		napi->notify.notify = netif_napi_irq_notify;
> +		napi->notify.release = netif_napi_affinity_release;
> +
> +		rc = irq_set_affinity_notifier(irq, &napi->notify);
> +		if (rc)
> +			netdev_warn(napi->dev, "Unable to set IRQ notifier (%d)\n",
> +				    rc);

I see now that my comments on the previous patch are stale after
this patch is applied. I wonder if the "irq > 0" part can be pulled
out to simplify the branches here?
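
Roughly what I have in mind, as an untested userspace sketch rather than
a proposed diff: the mock_* types and the enum are invented so the
branch structure can be compiled on its own, and the removal of the old
notifier is left out. Only the shape of the irq > 0 check is the point.

```c
#include <stdbool.h>

/* Invented for this sketch: which setup path the function would take. */
enum mock_action { MOCK_NONE, MOCK_RMAP_ADD, MOCK_SET_NOTIFIER };

/* Hypothetical stand-ins for the fields the branches look at. */
struct mock_dev {
	bool rx_cpu_rmap_auto;
	bool irq_affinity_auto;
};

struct mock_napi {
	struct mock_dev *dev;
	void *config;
	int irq;
};

static enum mock_action mock_set_irq(struct mock_napi *napi, int irq)
{
	enum mock_action action = MOCK_NONE;

	/* Check irq once up front instead of in every branch. */
	if (irq <= 0)
		goto out;

	if (napi->dev->rx_cpu_rmap_auto)
		action = MOCK_RMAP_ADD;		/* napi_irq_cpu_rmap_add() path */
	else if (napi->config && napi->dev->irq_affinity_auto)
		action = MOCK_SET_NOTIFIER;	/* irq_set_affinity_notifier() path */

out:
	napi->irq = irq;
	return action;
}
```

The branch conditions are otherwise the same as in the quoted hunk, so
behavior should not change.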

Thread overview: 17+ messages
2025-01-18  0:33 [PATCH net-next v6 0/5] net: napi: add CPU affinity to napi->config Ahmed Zaki
2025-01-18  0:33 ` [PATCH net-next v6 1/5] net: move ARFS rmap management to core Ahmed Zaki
2025-01-21  0:59   ` Jakub Kicinski
2025-01-21 14:52     ` Ahmed Zaki
2025-01-23 19:28   ` Joe Damato
2025-01-23 20:13     ` Ahmed Zaki
2025-01-23 20:20       ` Joe Damato
2025-01-18  0:33 ` [PATCH net-next v6 2/5] net: napi: add CPU affinity to napi_config Ahmed Zaki
2025-01-21  1:03   ` Jakub Kicinski
2025-01-23 20:18   ` Joe Damato [this message]
2025-02-03 21:32     ` Ahmed Zaki
2025-01-18  0:33 ` [PATCH net-next v6 3/5] bnxt: use napi's irq affinity Ahmed Zaki
2025-01-18  0:33 ` [PATCH net-next v6 4/5] ice: " Ahmed Zaki
2025-01-18  0:33 ` [PATCH net-next v6 5/5] idpf: " Ahmed Zaki
2025-01-21  1:03 ` [PATCH net-next v6 0/5] net: napi: add CPU affinity to napi->config Jakub Kicinski
2025-01-21 14:54   ` Ahmed Zaki
2025-01-23 20:27 ` Joe Damato
