From: Jakub Kicinski <kuba@kernel.org>
To: Joe Damato <jdamato@fastly.com>
Cc: netdev@vger.kernel.org, edumazet@google.com,
amritha.nambiar@intel.com, sridhar.samudrala@intel.com,
sdf@fomichev.me, bjorn@rivosinc.com, hch@infradead.org,
willy@infradead.org, willemdebruijn.kernel@gmail.com,
skhawaja@google.com, Martin Karsten <mkarsten@uwaterloo.ca>,
Donald Hunter <donald.hunter@gmail.com>,
"David S. Miller" <davem@davemloft.net>,
Paolo Abeni <pabeni@redhat.com>,
Jesper Dangaard Brouer <hawk@kernel.org>,
Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
Daniel Jurgens <danielj@nvidia.com>,
open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net-next 5/5] netdev-genl: Support setting per-NAPI config values
Date: Mon, 2 Sep 2024 17:49:44 -0700
Message-ID: <20240902174944.293dfe4b@kernel.org>
In-Reply-To: <ZtNSkWa1G40jRX5N@LQ3V64L9R2>
On Sat, 31 Aug 2024 18:27:45 +0100 Joe Damato wrote:
> > How do you feel about making this configuration opt-in / require driver
> > changes? What I'm thinking is that having the new "netif_napi_add()"
> > variant (or perhaps extending netif_napi_set_irq()) to take an extra
> > "index" parameter would make the whole thing much simpler.
>
> I think if we are going to go this way, then opt-in is probably the
> way to go. This series would include the necessary changes for mlx5,
> in that case (because that's what I have access to) so that the new
> variant has a user?
SG! FWIW for bnxt the "id" is struct bnxt_napi::index (I hadn't looked
at bnxt before writing the suggestion :))
> > Index would basically be an integer 0..n, where n is the number of
> > IRQs configured for the driver. The index of a NAPI instance would
> > likely match the queue ID of the queue the NAPI serves.
> >
> > We can then allocate an array of "napi_configs" in net_device -
> > like we allocate queues, the array size would be max(num_rx_queue,
> > num_tx_queues). We just need to store a couple of ints so it will
> > be tiny compared to queue structs, anyway.
>
> I assume napi_storage exists for both combined RX/TX NAPIs (i.e.
> drivers that multiplex RX/TX on a single NAPI like mlx5) as well
> as drivers which create NAPIs that are RX or TX-only, right?
Hm.
> If so, it seems like we'd either need to:
> - Do something more complicated when computing how much NAPI
> storage to make, or
> - Provide a different path for drivers which don't multiplex and
> create some number of (for example) TX-only NAPIs ?
>
> I guess I'm just imagining a weird case where a driver has 8 RX
> queues but 64 TX queues. max of that is 64, so we'd be missing 8
> napi_storage ?
>
> Sorry, I'm probably just missing something about the implementation
> details you summarized above.
I wouldn't worry about it. We can add a variant of alloc_netdev_mqs()
later which takes the NAPI count explicitly. For now we can simply
assume max(rx, tx) is good enough, and maybe add a WARN_ON_ONCE() to
the set function to catch drivers which need something more complicated.
Modern NICs have far more queues than IRQs (~NAPIs).
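
A minimal userspace sketch of the sizing rule above, not kernel code. All names here (struct fake_netdev, fake_netdev_alloc_configs(), fake_napi_config_get(), the napi_config fields) are hypothetical stand-ins, and the fprintf() guarded by a flag only imitates what WARN_ON_ONCE() would do in the kernel:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-index config: "a couple of ints" per NAPI index. */
struct napi_config {
	unsigned int defer_hard_irqs;
	unsigned long gro_flush_timeout;
};

/* Hypothetical stand-in for the part of net_device that owns the array. */
struct fake_netdev {
	unsigned int num_rx_queues;
	unsigned int num_tx_queues;
	unsigned int num_napi_configs;
	struct napi_config *napi_configs;
};

/* The array is sized like the queue arrays: max(rx, tx). */
static unsigned int napi_config_count(unsigned int rxqs, unsigned int txqs)
{
	return rxqs > txqs ? rxqs : txqs;
}

/* Allocate zeroed config slots; returns 0 on success, -1 on failure. */
static int fake_netdev_alloc_configs(struct fake_netdev *dev,
				     unsigned int rxqs, unsigned int txqs)
{
	assert(dev != NULL);

	dev->num_rx_queues = rxqs;
	dev->num_tx_queues = txqs;
	dev->num_napi_configs = napi_config_count(rxqs, txqs);
	dev->napi_configs = calloc(dev->num_napi_configs,
				   sizeof(*dev->napi_configs));
	return dev->napi_configs ? 0 : -1;
}

/* Userspace imitation of WARN_ON_ONCE(): complain the first time only. */
static bool warned_once;

/* Look up the slot for an index; NULL if the driver needs more slots. */
static struct napi_config *fake_napi_config_get(struct fake_netdev *dev,
						unsigned int index)
{
	if (index >= dev->num_napi_configs) {
		if (!warned_once) {
			warned_once = true;
			fprintf(stderr,
				"NAPI index %u out of range (%u configs)\n",
				index, dev->num_napi_configs);
		}
		return NULL;
	}
	return &dev->napi_configs[index];
}
```

With the 8 RX / 64 TX example from the quoted question, this allocates 64 slots. A driver that created 72 separate RX-only and TX-only NAPIs would trip the warning for indexes 64..71, which is the case an explicit NAPI count would cover later.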
> > The NAPI_SET netlink op can then work based on NAPI index rather
> > than the ephemeral NAPI ID. It can apply the config to all live
> > NAPI instances with that index (of which there really should only
> > be one, unless driver is mid-reconfiguration somehow but even that
> > won't cause issues, we can give multiple instances the same settings)
> > and also store the user config in the array in net_device.
>
> I understand what you are proposing. I suppose napi-get could be
> extended to include the NAPI index, too?
Yup!
> Then users could map queues to NAPI indexes to queues (via NAPI ID)?
Yes.
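
For illustration, a small userspace model of the index-based set discussed above, together with the "new instance inherits the stored config" behavior from the proposal. This is a sketch under assumptions, not the kernel implementation: the fixed-size arrays stand in for the per-device config array and the list of live NAPI instances, and every name (fake_napi, fake_napi_set(), fake_napi_add(), fake_napi_del()) is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CONFIGS 4	/* stand-in for max(rx, tx) */
#define MAX_LIVE    8	/* stand-in for the device's list of live NAPIs */

struct napi_config {
	unsigned int defer_hard_irqs;
	unsigned long gro_flush_timeout;
};

struct fake_napi {
	int in_use;
	unsigned int napi_id;	/* ephemeral: changes on every re-creation */
	unsigned int index;	/* stable: 0..n, chosen by the driver */
	struct napi_config cfg;	/* settings currently in effect */
};

struct fake_netdev {
	struct napi_config configs[MAX_CONFIGS];	/* persistent user config */
	struct fake_napi live[MAX_LIVE];
	unsigned int next_napi_id;
};

/*
 * Set by index: remember the config in the device, then apply it to every
 * live instance with that index. Returns the number of live instances
 * updated, or -1 if the index is out of range.
 */
static int fake_napi_set(struct fake_netdev *dev, unsigned int index,
			 const struct napi_config *cfg)
{
	int updated = 0;

	if (index >= MAX_CONFIGS)
		return -1;

	dev->configs[index] = *cfg;
	for (size_t i = 0; i < MAX_LIVE; i++) {
		struct fake_napi *n = &dev->live[i];

		if (n->in_use && n->index == index) {
			n->cfg = *cfg;
			updated++;
		}
	}
	return updated;
}

/* A new instance gets a fresh ID but inherits the config stored for its index. */
static struct fake_napi *fake_napi_add(struct fake_netdev *dev,
				       unsigned int index)
{
	if (index >= MAX_CONFIGS)
		return NULL;

	for (size_t i = 0; i < MAX_LIVE; i++) {
		struct fake_napi *n = &dev->live[i];

		if (!n->in_use) {
			n->in_use = 1;
			n->napi_id = ++dev->next_napi_id;
			n->index = index;
			n->cfg = dev->configs[index];
			return n;
		}
	}
	return NULL;
}

/* Tear down an instance; the stored config in the device is untouched. */
static void fake_napi_del(struct fake_napi *n)
{
	n->in_use = 0;
}
```

In this model, deleting and re-adding the instance for index 1 (as a queue count change might) yields a different napi_id but the same settings, which is the property the ephemeral NAPI ID alone cannot provide.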
> > When a new NAPI instance is associated with a NAPI index it should get
> > all the config associated with that index applied.
> >
> > Thoughts? Does that make sense, and if so do you think it's an
> > over-complication?
>
> It feels a bit tricky, to me, as it seems there are some edge cases
> to be careful with (queue count change). I could probably give the
> implementation a try and see where I end up.
>
> Having these settings per-NAPI would be really useful and being able
> to support IRQ suspension would be useful, too.
>
> I think being thoughtful about how we get there is important; I'm a
> little wary of getting side tracked, but I trust your judgement and
> if you think this is worth exploring I'll think on it some more.
I understand, we can abandon it if the implementation drags out due to
various nit picks and back-and-forths. But I don't expect much of that
🤞️