From: "Arinzon, David" <darinzon@amazon.com>
To: Ahmed Zaki <ahmed.zaki@intel.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"jdamato@fastly.com" <jdamato@fastly.com>
Cc: "intel-wired-lan@lists.osuosl.org"
<intel-wired-lan@lists.osuosl.org>,
"andrew+netdev@lunn.ch" <andrew+netdev@lunn.ch>,
"edumazet@google.com" <edumazet@google.com>,
"kuba@kernel.org" <kuba@kernel.org>,
"horms@kernel.org" <horms@kernel.org>,
"pabeni@redhat.com" <pabeni@redhat.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"michael.chan@broadcom.com" <michael.chan@broadcom.com>,
"tariqt@nvidia.com" <tariqt@nvidia.com>,
"anthony.l.nguyen@intel.com" <anthony.l.nguyen@intel.com>,
"przemyslaw.kitszel@intel.com" <przemyslaw.kitszel@intel.com>,
"jdamato@fastly.com" <jdamato@fastly.com>,
"shayd@nvidia.com" <shayd@nvidia.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Allen, Neil" <shayagr@amazon.com>
Subject: Re: [Intel-wired-lan] [PATCH net-next v9 2/6] net: ena: use napi's aRFS rmap notifers
Date: Wed, 5 Mar 2025 06:33:38 +0000 [thread overview]
Message-ID: <abc6a4b765f84eb09efd7b10a62c4391@amazon.com> (raw)
In-Reply-To: <54f50b81-7361-4140-8b88-acd765fd8f49@intel.com>
> [RE-SEND] I just realized I sent this only to iwl, sorry for spamming.
>
>
> On 2025-03-03 10:11 a.m., Arinzon, David wrote:
> >> Use the core's rmap notifiers and delete our own.
> >>
> >> Acked-by: David Arinzon <darinzon@amazon.com>
> >> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
> >> ---
> >> drivers/net/ethernet/amazon/ena/ena_netdev.c | 43 +-------------------
> >> 1 file changed, 1 insertion(+), 42 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> >> b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> >> index c1295dfad0d0..6aab85a7c60a 100644
> >> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> >> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> >> @@ -5,9 +5,6 @@
> >>
> >> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >>
> >> -#ifdef CONFIG_RFS_ACCEL
> >> -#include <linux/cpu_rmap.h>
> >> -#endif /* CONFIG_RFS_ACCEL */
> >> #include <linux/ethtool.h>
> >> #include <linux/kernel.h>
> >> #include <linux/module.h>
> >> @@ -162,30 +159,6 @@ int ena_xmit_common(struct ena_adapter
> *adapter,
> >> return 0;
> >> }
> >>
> >> -static int ena_init_rx_cpu_rmap(struct ena_adapter *adapter) -{
> >> -#ifdef CONFIG_RFS_ACCEL
> >> - u32 i;
> >> - int rc;
> >> -
> >> - adapter->netdev->rx_cpu_rmap = alloc_irq_cpu_rmap(adapter-
> >>> num_io_queues);
> >> - if (!adapter->netdev->rx_cpu_rmap)
> >> - return -ENOMEM;
> >> - for (i = 0; i < adapter->num_io_queues; i++) {
> >> - int irq_idx = ENA_IO_IRQ_IDX(i);
> >> -
> >> - rc = irq_cpu_rmap_add(adapter->netdev->rx_cpu_rmap,
> >> - pci_irq_vector(adapter->pdev, irq_idx));
> >> - if (rc) {
> >> - free_irq_cpu_rmap(adapter->netdev->rx_cpu_rmap);
> >> - adapter->netdev->rx_cpu_rmap = NULL;
> >> - return rc;
> >> - }
> >> - }
> >> -#endif /* CONFIG_RFS_ACCEL */
> >> - return 0;
> >> -}
> >> -
> >> static void ena_init_io_rings_common(struct ena_adapter *adapter,
> >> struct ena_ring *ring, u16 qid)
> >> { @@ -1596,7 +1569,7 @@ static int ena_enable_msix(struct ena_adapter
> *adapter)
> >> adapter->num_io_queues = irq_cnt - ENA_ADMIN_MSIX_VEC;
> >> }
> >>
> >> - if (ena_init_rx_cpu_rmap(adapter))
> >> + if (netif_enable_cpu_rmap(adapter->netdev,
> >> + adapter->num_io_queues))
> >> netif_warn(adapter, probe, adapter->netdev,
> >> "Failed to map IRQs to CPUs\n");
> >>
> >> @@ -1742,13 +1715,6 @@ static void ena_free_io_irq(struct ena_adapter
> >> *adapter)
> >> struct ena_irq *irq;
> >> int i;
> >>
> >> -#ifdef CONFIG_RFS_ACCEL
> >> - if (adapter->msix_vecs >= 1) {
> >> - free_irq_cpu_rmap(adapter->netdev->rx_cpu_rmap);
> >> - adapter->netdev->rx_cpu_rmap = NULL;
> >> - }
> >> -#endif /* CONFIG_RFS_ACCEL */
> >> -
> >> for (i = ENA_IO_IRQ_FIRST_IDX; i <
> >> ENA_MAX_MSIX_VEC(io_queue_count); i++) {
> >> irq = &adapter->irq_tbl[i];
> >> irq_set_affinity_hint(irq->vector, NULL); @@
> >> -4131,13 +4097,6 @@ static void __ena_shutoff(struct pci_dev *pdev,
> bool shutdown)
> >> ena_dev = adapter->ena_dev;
> >> netdev = adapter->netdev;
> >>
> >> -#ifdef CONFIG_RFS_ACCEL
> >> - if ((adapter->msix_vecs >= 1) && (netdev->rx_cpu_rmap)) {
> >> - free_irq_cpu_rmap(netdev->rx_cpu_rmap);
> >> - netdev->rx_cpu_rmap = NULL;
> >> - }
> >> -
> >> -#endif /* CONFIG_RFS_ACCEL */
> >> /* Make sure timer and reset routine won't be called after
> >> * freeing device resources.
> >> */
> >> --
> >> 2.43.0
> >
> > Hi Ahmed,
> >
> > After the merging of this patch, I see the below stack trace when the IRQs
> are freed.
> > It can be reproduced by unloading and loading the driver using
> > `modprobe -r ena; modprobe ena` (happens during unload)
> >
> > Based on the patchset and the changes to other drivers, I think
> > there's a missing call to the function that releases the affinity
> > notifier (The warn is in
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.gi
> > t/tree/kernel/irq/manage.c#n2031)
> >
> > I saw in the intel code in the patchset that ` netif_napi_set_irq(<napi>, -1);`
> is added?
> >
> > After adding the code snippet I don't see this anymore, but I want to
> understand whether it's the right call by design.
>
> Yes, in ena_down() the IRQs are freed before napis are deleted (where IRQ
> notifiers are released). The code below is fine (and is better IMO) but you
> can also delete napis then free IRQs.
>
>
Thanks for the clarification. Some book-keeping, as this change fixes the issue.
The need to use `netif_napi_set_irq` was introduced in https://lore.kernel.org/netdev/20241002001331.65444-2-jdamato@fastly.com/,
But, technically, there was not need to use the call with the -1 until the introduction of this patch.
Is my understanding correct?
If it's correct, then the fix is for this patch.
(Also adding Joe who authored the mentioned patch)
> >
> > @@ -1716,8 +1716,11 @@ static void ena_free_io_irq(struct ena_adapter
> *adapter)
> > int i;
> >
> > for (i = ENA_IO_IRQ_FIRST_IDX; i <
> > ENA_MAX_MSIX_VEC(io_queue_count); i++) {
> > + struct ena_napi *ena_napi;
> > +
> > irq = &adapter->irq_tbl[i];
> > irq_set_affinity_hint(irq->vector, NULL);
> > + ena_napi = (struct ena_napi *)irq->data;
> > + netif_napi_set_irq(&ena_napi->napi, -1);
> > free_irq(irq->vector, irq->data);
> > }
> > }
> >
> > [ 484.544586] ? __warn+0x84/0x130
> > [ 484.544843] ? free_irq+0x5c/0x70
> > [ 484.545105] ? report_bug+0x18a/0x1a0 [ 484.545390] ?
> > handle_bug+0x53/0x90 [ 484.545664] ? exc_invalid_op+0x14/0x70 [
> > 484.545959] ? asm_exc_invalid_op+0x16/0x20 [ 484.546279] ?
> > free_irq+0x5c/0x70 [ 484.546545] ? free_irq+0x10/0x70 [ 484.546807]
> > ena_free_io_irq+0x5f/0x70 [ena] [ 484.547138] ena_down+0x250/0x3e0
> > [ena] [ 484.547435] ena_destroy_device+0x118/0x150 [ena] [
> > 484.547796] __ena_shutoff+0x5a/0xe0 [ena] [ 484.548110]
> > pci_device_remove+0x3b/0xb0 [ 484.548412]
> > device_release_driver_internal+0x193/0x200
> > [ 484.548804] driver_detach+0x44/0x90 [ 484.549084]
> > bus_remove_driver+0x69/0xf0 [ 484.549386]
> > pci_unregister_driver+0x2a/0xb0 [ 484.549717] ena_cleanup+0xc/0x130
> > [ena] [ 484.550021] __do_sys_delete_module.constprop.0+0x176/0x310
> > [ 484.550438] ? syscall_trace_enter+0xfb/0x1c0 [ 484.550782]
> > do_syscall_64+0x5b/0x170 [ 484.551067]
> > entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > Thanks,
> > David
WARNING: multiple messages have this Message-ID (diff)
From: "Arinzon, David" <darinzon@amazon.com>
To: Ahmed Zaki <ahmed.zaki@intel.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"jdamato@fastly.com" <jdamato@fastly.com>
Cc: "intel-wired-lan@lists.osuosl.org"
<intel-wired-lan@lists.osuosl.org>,
"andrew+netdev@lunn.ch" <andrew+netdev@lunn.ch>,
"edumazet@google.com" <edumazet@google.com>,
"kuba@kernel.org" <kuba@kernel.org>,
"horms@kernel.org" <horms@kernel.org>,
"pabeni@redhat.com" <pabeni@redhat.com>,
"davem@davemloft.net" <davem@davemloft.net>,
"michael.chan@broadcom.com" <michael.chan@broadcom.com>,
"tariqt@nvidia.com" <tariqt@nvidia.com>,
"anthony.l.nguyen@intel.com" <anthony.l.nguyen@intel.com>,
"przemyslaw.kitszel@intel.com" <przemyslaw.kitszel@intel.com>,
"jdamato@fastly.com" <jdamato@fastly.com>,
"shayd@nvidia.com" <shayd@nvidia.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"Allen, Neil" <shayagr@amazon.com>
Subject: RE: [Intel-wired-lan] [PATCH net-next v9 2/6] net: ena: use napi's aRFS rmap notifers
Date: Wed, 5 Mar 2025 06:33:38 +0000 [thread overview]
Message-ID: <abc6a4b765f84eb09efd7b10a62c4391@amazon.com> (raw)
In-Reply-To: <54f50b81-7361-4140-8b88-acd765fd8f49@intel.com>
> [RE-SEND] I just realized I sent this only to iwl, sorry for spamming.
>
>
> On 2025-03-03 10:11 a.m., Arinzon, David wrote:
> >> Use the core's rmap notifiers and delete our own.
> >>
> >> Acked-by: David Arinzon <darinzon@amazon.com>
> >> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
> >> ---
> >> drivers/net/ethernet/amazon/ena/ena_netdev.c | 43 +-------------------
> >> 1 file changed, 1 insertion(+), 42 deletions(-)
> >>
> >> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> >> b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> >> index c1295dfad0d0..6aab85a7c60a 100644
> >> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> >> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> >> @@ -5,9 +5,6 @@
> >>
> >> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >>
> >> -#ifdef CONFIG_RFS_ACCEL
> >> -#include <linux/cpu_rmap.h>
> >> -#endif /* CONFIG_RFS_ACCEL */
> >> #include <linux/ethtool.h>
> >> #include <linux/kernel.h>
> >> #include <linux/module.h>
> >> @@ -162,30 +159,6 @@ int ena_xmit_common(struct ena_adapter
> *adapter,
> >> return 0;
> >> }
> >>
> >> -static int ena_init_rx_cpu_rmap(struct ena_adapter *adapter) -{
> >> -#ifdef CONFIG_RFS_ACCEL
> >> - u32 i;
> >> - int rc;
> >> -
> >> - adapter->netdev->rx_cpu_rmap = alloc_irq_cpu_rmap(adapter-
> >>> num_io_queues);
> >> - if (!adapter->netdev->rx_cpu_rmap)
> >> - return -ENOMEM;
> >> - for (i = 0; i < adapter->num_io_queues; i++) {
> >> - int irq_idx = ENA_IO_IRQ_IDX(i);
> >> -
> >> - rc = irq_cpu_rmap_add(adapter->netdev->rx_cpu_rmap,
> >> - pci_irq_vector(adapter->pdev, irq_idx));
> >> - if (rc) {
> >> - free_irq_cpu_rmap(adapter->netdev->rx_cpu_rmap);
> >> - adapter->netdev->rx_cpu_rmap = NULL;
> >> - return rc;
> >> - }
> >> - }
> >> -#endif /* CONFIG_RFS_ACCEL */
> >> - return 0;
> >> -}
> >> -
> >> static void ena_init_io_rings_common(struct ena_adapter *adapter,
> >> struct ena_ring *ring, u16 qid)
> >> { @@ -1596,7 +1569,7 @@ static int ena_enable_msix(struct ena_adapter
> *adapter)
> >> adapter->num_io_queues = irq_cnt - ENA_ADMIN_MSIX_VEC;
> >> }
> >>
> >> - if (ena_init_rx_cpu_rmap(adapter))
> >> + if (netif_enable_cpu_rmap(adapter->netdev,
> >> + adapter->num_io_queues))
> >> netif_warn(adapter, probe, adapter->netdev,
> >> "Failed to map IRQs to CPUs\n");
> >>
> >> @@ -1742,13 +1715,6 @@ static void ena_free_io_irq(struct ena_adapter
> >> *adapter)
> >> struct ena_irq *irq;
> >> int i;
> >>
> >> -#ifdef CONFIG_RFS_ACCEL
> >> - if (adapter->msix_vecs >= 1) {
> >> - free_irq_cpu_rmap(adapter->netdev->rx_cpu_rmap);
> >> - adapter->netdev->rx_cpu_rmap = NULL;
> >> - }
> >> -#endif /* CONFIG_RFS_ACCEL */
> >> -
> >> for (i = ENA_IO_IRQ_FIRST_IDX; i <
> >> ENA_MAX_MSIX_VEC(io_queue_count); i++) {
> >> irq = &adapter->irq_tbl[i];
> >> irq_set_affinity_hint(irq->vector, NULL); @@
> >> -4131,13 +4097,6 @@ static void __ena_shutoff(struct pci_dev *pdev,
> bool shutdown)
> >> ena_dev = adapter->ena_dev;
> >> netdev = adapter->netdev;
> >>
> >> -#ifdef CONFIG_RFS_ACCEL
> >> - if ((adapter->msix_vecs >= 1) && (netdev->rx_cpu_rmap)) {
> >> - free_irq_cpu_rmap(netdev->rx_cpu_rmap);
> >> - netdev->rx_cpu_rmap = NULL;
> >> - }
> >> -
> >> -#endif /* CONFIG_RFS_ACCEL */
> >> /* Make sure timer and reset routine won't be called after
> >> * freeing device resources.
> >> */
> >> --
> >> 2.43.0
> >
> > Hi Ahmed,
> >
> > After the merging of this patch, I see the below stack trace when the IRQs
> are freed.
> > It can be reproduced by unloading and loading the driver using
> > `modprobe -r ena; modprobe ena` (happens during unload)
> >
> > Based on the patchset and the changes to other drivers, I think
> > there's a missing call to the function that releases the affinity
> > notifier (The warn is in
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.gi
> > t/tree/kernel/irq/manage.c#n2031)
> >
> > I saw in the intel code in the patchset that ` netif_napi_set_irq(<napi>, -1);`
> is added?
> >
> > After adding the code snippet I don't see this anymore, but I want to
> understand whether it's the right call by design.
>
> Yes, in ena_down() the IRQs are freed before napis are deleted (where IRQ
> notifiers are released). The code below is fine (and is better IMO) but you
> can also delete napis then free IRQs.
>
>
Thanks for the clarification. Some book-keeping, as this change fixes the issue.
The need to use `netif_napi_set_irq` was introduced in https://lore.kernel.org/netdev/20241002001331.65444-2-jdamato@fastly.com/,
But, technically, there was not need to use the call with the -1 until the introduction of this patch.
Is my understanding correct?
If it's correct, then the fix is for this patch.
(Also adding Joe who authored the mentioned patch)
> >
> > @@ -1716,8 +1716,11 @@ static void ena_free_io_irq(struct ena_adapter
> *adapter)
> > int i;
> >
> > for (i = ENA_IO_IRQ_FIRST_IDX; i <
> > ENA_MAX_MSIX_VEC(io_queue_count); i++) {
> > + struct ena_napi *ena_napi;
> > +
> > irq = &adapter->irq_tbl[i];
> > irq_set_affinity_hint(irq->vector, NULL);
> > + ena_napi = (struct ena_napi *)irq->data;
> > + netif_napi_set_irq(&ena_napi->napi, -1);
> > free_irq(irq->vector, irq->data);
> > }
> > }
> >
> > [ 484.544586] ? __warn+0x84/0x130
> > [ 484.544843] ? free_irq+0x5c/0x70
> > [ 484.545105] ? report_bug+0x18a/0x1a0 [ 484.545390] ?
> > handle_bug+0x53/0x90 [ 484.545664] ? exc_invalid_op+0x14/0x70 [
> > 484.545959] ? asm_exc_invalid_op+0x16/0x20 [ 484.546279] ?
> > free_irq+0x5c/0x70 [ 484.546545] ? free_irq+0x10/0x70 [ 484.546807]
> > ena_free_io_irq+0x5f/0x70 [ena] [ 484.547138] ena_down+0x250/0x3e0
> > [ena] [ 484.547435] ena_destroy_device+0x118/0x150 [ena] [
> > 484.547796] __ena_shutoff+0x5a/0xe0 [ena] [ 484.548110]
> > pci_device_remove+0x3b/0xb0 [ 484.548412]
> > device_release_driver_internal+0x193/0x200
> > [ 484.548804] driver_detach+0x44/0x90 [ 484.549084]
> > bus_remove_driver+0x69/0xf0 [ 484.549386]
> > pci_unregister_driver+0x2a/0xb0 [ 484.549717] ena_cleanup+0xc/0x130
> > [ena] [ 484.550021] __do_sys_delete_module.constprop.0+0x176/0x310
> > [ 484.550438] ? syscall_trace_enter+0xfb/0x1c0 [ 484.550782]
> > do_syscall_64+0x5b/0x170 [ 484.551067]
> > entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > Thanks,
> > David
next prev parent reply other threads:[~2025-03-05 6:33 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-24 23:22 [Intel-wired-lan] [PATCH net-next v9 0/6] net: napi: add CPU affinity to napi->config Ahmed Zaki
2025-02-24 23:22 ` Ahmed Zaki
2025-02-24 23:22 ` [Intel-wired-lan] [PATCH net-next v9 1/6] net: move aRFS rmap management and CPU affinity to core Ahmed Zaki
2025-02-24 23:22 ` Ahmed Zaki
2025-02-24 23:22 ` [Intel-wired-lan] [PATCH net-next v9 2/6] net: ena: use napi's aRFS rmap notifers Ahmed Zaki
2025-02-24 23:22 ` Ahmed Zaki
2025-03-03 17:11 ` [Intel-wired-lan] " Arinzon, David
2025-03-03 17:11 ` Arinzon, David
2025-03-03 18:03 ` [Intel-wired-lan] " Ahmed Zaki
2025-03-04 22:37 ` Ahmed Zaki
2025-03-05 6:33 ` Arinzon, David [this message]
2025-03-05 6:33 ` Arinzon, David
2025-03-05 14:00 ` Ahmed Zaki
2025-02-24 23:22 ` [Intel-wired-lan] [PATCH net-next v9 3/6] ice: clear NAPI's IRQ numbers in ice_vsi_clear_napi_queues() Ahmed Zaki
2025-02-24 23:22 ` Ahmed Zaki
2025-02-24 23:22 ` [Intel-wired-lan] [PATCH net-next v9 4/6] ice: use napi's irq affinity and rmap IRQ notifiers Ahmed Zaki
2025-02-24 23:22 ` Ahmed Zaki
2025-02-24 23:22 ` [Intel-wired-lan] [PATCH net-next v9 5/6] idpf: use napi's irq affinity Ahmed Zaki
2025-02-24 23:22 ` Ahmed Zaki
2025-02-24 23:22 ` [Intel-wired-lan] [PATCH net-next v9 6/6] selftests: drv-net: add tests for napi IRQ affinity notifiers Ahmed Zaki
2025-02-24 23:22 ` Ahmed Zaki
2025-02-27 4:10 ` [Intel-wired-lan] [PATCH net-next v9 0/6] net: napi: add CPU affinity to napi->config patchwork-bot+netdevbpf
2025-02-27 4:10 ` patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=abc6a4b765f84eb09efd7b10a62c4391@amazon.com \
--to=darinzon@amazon.com \
--cc=ahmed.zaki@intel.com \
--cc=akpm@linux-foundation.org \
--cc=andrew+netdev@lunn.ch \
--cc=anthony.l.nguyen@intel.com \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=intel-wired-lan@lists.osuosl.org \
--cc=jdamato@fastly.com \
--cc=kuba@kernel.org \
--cc=michael.chan@broadcom.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=przemyslaw.kitszel@intel.com \
--cc=shayagr@amazon.com \
--cc=shayd@nvidia.com \
--cc=tariqt@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.