From: Joe Damato <jdamato@fastly.com>
To: "Lifshits, Vitaly" <vitaly.lifshits@intel.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"Keller, Jacob E" <jacob.e.keller@intel.com>,
"kurt@linutronix.de" <kurt@linutronix.de>,
"Gomes, Vinicius" <vinicius.gomes@intel.com>,
"Nguyen, Anthony L" <anthony.l.nguyen@intel.com>,
"Kitszel, Przemyslaw" <przemyslaw.kitszel@intel.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Alexei Starovoitov <ast@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Jesper Dangaard Brouer <hawk@kernel.org>,
John Fastabend <john.fastabend@gmail.com>,
"moderated list:INTEL ETHERNET DRIVERS"
<intel-wired-lan@lists.osuosl.org>,
open list <linux-kernel@vger.kernel.org>,
"open list:XDP (eXpress Data Path)" <bpf@vger.kernel.org>,
stanislaw.gruszka@linux.intel.com
Subject: Re: [Intel-wired-lan] [iwl-next v4 2/2] igc: Link queues to NAPI instances
Date: Mon, 28 Oct 2024 08:50:38 -0700
Message-ID: <Zx-yzhq4unv0gsVX@LQ3V64L9R2>
In-Reply-To: <d7799132-7e4a-0ac2-cbda-c919ce434fe2@intel.com>
On Sun, Oct 27, 2024 at 11:49:33AM +0200, Lifshits, Vitaly wrote:
>
> On 10/23/2024 12:52 AM, Joe Damato wrote:
> > Link queues to NAPI instances via netdev-genl API so that users can
> > query this information with netlink. Handle a few cases in the driver:
> > 1. Link/unlink the NAPIs when XDP is enabled/disabled
> > 2. Handle IGC_FLAG_QUEUE_PAIRS enabled and disabled
> >
> > Example output when IGC_FLAG_QUEUE_PAIRS is enabled:
> >
> > $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
> > --dump queue-get --json='{"ifindex": 2}'
> >
> > [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
> > {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
> > {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'rx'},
> > {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'rx'},
> > {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
> > {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'tx'},
> > {'id': 2, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
> > {'id': 3, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
> >
> > Since IGC_FLAG_QUEUE_PAIRS is enabled, you'll note that the same NAPI ID
> > is present for both the RX and TX queues at the same index, for example
> > index 0:
> >
> > {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
> > {'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'tx'},
> >
> > To test IGC_FLAG_QUEUE_PAIRS disabled, a test system was booted using
> > the grub command line option "maxcpus=2" to force
> > igc_set_interrupt_capability to disable IGC_FLAG_QUEUE_PAIRS.
> >
> > Example output when IGC_FLAG_QUEUE_PAIRS is disabled:
> >
> > $ lscpu | grep "On-line CPU"
> > On-line CPU(s) list: 0,2
> >
> > $ ethtool -l enp86s0 | tail -5
> > Current hardware settings:
> > RX: n/a
> > TX: n/a
> > Other: 1
> > Combined: 2
> >
> > $ cat /proc/interrupts | grep enp
> > 144: [...] enp86s0
> > 145: [...] enp86s0-rx-0
> > 146: [...] enp86s0-rx-1
> > 147: [...] enp86s0-tx-0
> > 148: [...] enp86s0-tx-1
> >
> > 1 "other" IRQ, and 2 IRQs each for RX and TX, so we expect netlink to
> > report 4 NAPI instances with unique NAPI IDs:
> >
> > $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
> > --dump napi-get --json='{"ifindex": 2}'
> > [{'id': 8196, 'ifindex': 2, 'irq': 148},
> > {'id': 8195, 'ifindex': 2, 'irq': 147},
> > {'id': 8194, 'ifindex': 2, 'irq': 146},
> > {'id': 8193, 'ifindex': 2, 'irq': 145}]
> >
> > Now we examine which queues these NAPIs are associated with, expecting
> > that since IGC_FLAG_QUEUE_PAIRS is disabled each RX and TX queue will
> > have its own NAPI instance:
> >
> > $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
> > --dump queue-get --json='{"ifindex": 2}'
> > [{'id': 0, 'ifindex': 2, 'napi-id': 8193, 'type': 'rx'},
> > {'id': 1, 'ifindex': 2, 'napi-id': 8194, 'type': 'rx'},
> > {'id': 0, 'ifindex': 2, 'napi-id': 8195, 'type': 'tx'},
> > {'id': 1, 'ifindex': 2, 'napi-id': 8196, 'type': 'tx'}]
> >
> > Signed-off-by: Joe Damato <jdamato@fastly.com>
> > Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
> > ---
> > v4:
> > - Add rtnl_lock/rtnl_unlock in two paths: igc_resume and
> > igc_io_error_detected. The code added to the latter is inspired by
> > a similar implementation in ixgbe's ixgbe_io_error_detected.
> >
> > v3:
> > - Replace igc_unset_queue_napi with igc_set_queue_napi(adapter, i,
> > NULL), as suggested by Vinicius Costa Gomes
> > - Simplify implementation of igc_set_queue_napi as suggested by Kurt
> > Kanzenbach, with a tweak to use ring->queue_index
> >
> > v2:
> > - Update commit message to include tests for IGC_FLAG_QUEUE_PAIRS
> > disabled
> > - Refactored code to move napi queue mapping and unmapping to helper
> > functions igc_set_queue_napi and igc_unset_queue_napi
> > - Adjust the code to handle IGC_FLAG_QUEUE_PAIRS disabled
> > - Call helpers to map/unmap queues to NAPIs in igc_up, __igc_open,
> > igc_xdp_enable_pool, and igc_xdp_disable_pool
> >
> > drivers/net/ethernet/intel/igc/igc.h | 2 ++
> > drivers/net/ethernet/intel/igc/igc_main.c | 41 ++++++++++++++++++++---
> > drivers/net/ethernet/intel/igc/igc_xdp.c | 2 ++
> > 3 files changed, 40 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h
> > index eac0f966e0e4..b8111ad9a9a8 100644
> > --- a/drivers/net/ethernet/intel/igc/igc.h
> > +++ b/drivers/net/ethernet/intel/igc/igc.h
> > @@ -337,6 +337,8 @@ struct igc_adapter {
> > struct igc_led_classdev *leds;
> > };
> > +void igc_set_queue_napi(struct igc_adapter *adapter, int q_idx,
> > + struct napi_struct *napi);
> > void igc_up(struct igc_adapter *adapter);
> > void igc_down(struct igc_adapter *adapter);
> > int igc_open(struct net_device *netdev);
> > diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
> > index 7964bbedb16c..04aa216ef612 100644
> > --- a/drivers/net/ethernet/intel/igc/igc_main.c
> > +++ b/drivers/net/ethernet/intel/igc/igc_main.c
> > @@ -4948,6 +4948,22 @@ static int igc_sw_init(struct igc_adapter *adapter)
> > return 0;
> > }
> > +void igc_set_queue_napi(struct igc_adapter *adapter, int vector,
> > + struct napi_struct *napi)
> > +{
> > + struct igc_q_vector *q_vector = adapter->q_vector[vector];
> > +
> > + if (q_vector->rx.ring)
> > + netif_queue_set_napi(adapter->netdev,
> > + q_vector->rx.ring->queue_index,
> > + NETDEV_QUEUE_TYPE_RX, napi);
> > +
> > + if (q_vector->tx.ring)
> > + netif_queue_set_napi(adapter->netdev,
> > + q_vector->tx.ring->queue_index,
> > + NETDEV_QUEUE_TYPE_TX, napi);
> > +}
> > +
> > /**
> > * igc_up - Open the interface and prepare it to handle traffic
> > * @adapter: board private structure
> > @@ -4955,6 +4971,7 @@ static int igc_sw_init(struct igc_adapter *adapter)
> > void igc_up(struct igc_adapter *adapter)
> > {
> > struct igc_hw *hw = &adapter->hw;
> > + struct napi_struct *napi;
> > int i = 0;
> > /* hardware has been reset, we need to reload some things */
> > @@ -4962,8 +4979,11 @@ void igc_up(struct igc_adapter *adapter)
> > clear_bit(__IGC_DOWN, &adapter->state);
> > - for (i = 0; i < adapter->num_q_vectors; i++)
> > - napi_enable(&adapter->q_vector[i]->napi);
> > + for (i = 0; i < adapter->num_q_vectors; i++) {
> > + napi = &adapter->q_vector[i]->napi;
> > + napi_enable(napi);
> > + igc_set_queue_napi(adapter, i, napi);
> > + }
> > if (adapter->msix_entries)
> > igc_configure_msix(adapter);
> > @@ -5192,6 +5212,7 @@ void igc_down(struct igc_adapter *adapter)
> > for (i = 0; i < adapter->num_q_vectors; i++) {
> > if (adapter->q_vector[i]) {
> > napi_synchronize(&adapter->q_vector[i]->napi);
> > + igc_set_queue_napi(adapter, i, NULL);
> > napi_disable(&adapter->q_vector[i]->napi);
> > }
> > }
> > @@ -6021,6 +6042,7 @@ static int __igc_open(struct net_device *netdev, bool resuming)
> > struct igc_adapter *adapter = netdev_priv(netdev);
> > struct pci_dev *pdev = adapter->pdev;
> > struct igc_hw *hw = &adapter->hw;
> > + struct napi_struct *napi;
> > int err = 0;
> > int i = 0;
> > @@ -6056,8 +6078,11 @@ static int __igc_open(struct net_device *netdev, bool resuming)
> > clear_bit(__IGC_DOWN, &adapter->state);
> > - for (i = 0; i < adapter->num_q_vectors; i++)
> > - napi_enable(&adapter->q_vector[i]->napi);
> > + for (i = 0; i < adapter->num_q_vectors; i++) {
> > + napi = &adapter->q_vector[i]->napi;
> > + napi_enable(napi);
> > + igc_set_queue_napi(adapter, i, napi);
> > + }
> > /* Clear any pending interrupts. */
> > rd32(IGC_ICR);
> > @@ -7385,7 +7410,9 @@ static int igc_resume(struct device *dev)
> > wr32(IGC_WUS, ~0);
> > if (netif_running(netdev)) {
> > + rtnl_lock();
>
> This change will bring back the deadlock issue that was fixed in commit
> 6f31d6b ("igc: Refactor runtime power management flow").
OK, thanks for letting me know.
I think I now have a better understanding of the issue. It seems that:
- igc_resume can be called with RTNL held via ethtool (which I
didn't know), and it calls __igc_open
- __igc_open re-enables NAPIs and re-links queues to NAPI IDs (which
requires RTNL)
So the rtnl_lock() I added to igc_resume is both unnecessary and a
deadlock risk.
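To make that concrete, here is the call chain I believe recreates the
deadlock, based on your description and the commit you referenced (a
rough sketch from my reading, not an exhaustive list of callers):

    ethtool request               /* dev_ethtool() already holds RTNL */
      -> PM resume of the device
        -> igc_resume()
          -> rtnl_lock()          /* added by this patch: deadlock */
          -> __igc_open()         /* needs RTNL to link queues to NAPIs */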
I suppose I don't know all of the paths where the pm functions can
be called -- are there others where RTNL is _not_ already held?
I looked at e1000e and it seems that driver does not re-enable NAPIs
in its resume path and thus does not suffer from the same issue as
igc.
So my questions are:
1. Are there other contexts where igc_resume is called in which
RTNL is not held?
2. If the answer is that RTNL is always held when igc_resume is
called, then I can send a v5 that removes the
rtnl_lock/rtnl_unlock. What do you think?
[...]
>
> Hi Joe,
>
>
> The current version will cause a regression (a possible deadlock), since
> the rtnl_lock added in igc_resume reintroduces an issue that was fixed
> previously.
>
> You can refer to the following link:
>
> https://github.com/torvalds/linux/commit/6f31d6b643a32cc126cf86093fca1ea575948bf0#diff-d5b32b873e9902b496280a5f42c246043c8f0691d8b3a6bbd56df99ce8ceb394L7190
Thanks for the link.
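If RTNL does turn out to be held in every igc_resume path, the v5
change would be small; roughly the following (a hypothetical sketch,
contingent on the answer to question 1 above):

    if (netif_running(netdev)) {
            /* Rely on the caller holding RTNL instead of taking it
             * here; assert so the locking assumption stays visible.
             */
            ASSERT_RTNL();
            err = __igc_open(netdev, true);
    }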