netdev.vger.kernel.org archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: Joe Damato <jdamato@fastly.com>
Cc: netdev@vger.kernel.org, edumazet@google.com,
	amritha.nambiar@intel.com, sridhar.samudrala@intel.com,
	sdf@fomichev.me, bjorn@rivosinc.com, hch@infradead.org,
	willy@infradead.org, willemdebruijn.kernel@gmail.com,
	skhawaja@google.com, Martin Karsten <mkarsten@uwaterloo.ca>,
	Donald Hunter <donald.hunter@gmail.com>,
	"David S. Miller" <davem@davemloft.net>,
	Paolo Abeni <pabeni@redhat.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Xuan Zhuo <xuanzhuo@linux.alibaba.com>,
	Daniel Jurgens <danielj@nvidia.com>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH net-next 5/5] netdev-genl: Support setting per-NAPI config values
Date: Mon, 2 Sep 2024 17:49:44 -0700
Message-ID: <20240902174944.293dfe4b@kernel.org>
In-Reply-To: <ZtNSkWa1G40jRX5N@LQ3V64L9R2>

On Sat, 31 Aug 2024 18:27:45 +0100 Joe Damato wrote:
> > How do you feel about making this configuration opt-in / require driver
> > changes? What I'm thinking is that having the new "netif_napi_add()"
> > variant (or perhaps extending netif_napi_set_irq()) to take an extra
> > "index" parameter would make the whole thing much simpler.  
> 
> I think if we are going to go this way, then opt-in is probably the
> way to go. This series would include the necessary changes for mlx5,
> in that case (because that's what I have access to) so that the new
> variant has a user?

SG! FWIW for bnxt the "id" is struct bnxt_napi::index (I hadn't looked
at bnxt before writing the suggestion :))

> > Index would basically be an integer 0..n, where n is the number of
> > IRQs configured for the driver. The index of a NAPI instance would
> > likely match the queue ID of the queue the NAPI serves.
> > 
> > We can then allocate an array of "napi_configs" in net_device -
> > like we allocate queues, the array size would be max(num_rx_queues,
> > num_tx_queues). We just need to store a couple of ints so it will
> > be tiny compared to queue structs, anyway.  
> 
> I assume napi_storage exists for both combined RX/TX NAPIs (i.e.
> drivers that multiplex RX/TX on a single NAPI like mlx5) as well
> as drivers which create NAPIs that are RX or TX-only, right?

Hm.

> If so, it seems like we'd either need to:
>   - Do something more complicated when computing how much NAPI
>     storage to make, or
>   - Provide a different path for drivers which don't multiplex and
>     create some number of (for example) TX-only NAPIs ?
> 
> I guess I'm just imagining a weird case where a driver has 8 RX
> queues but 64 TX queues. max of that is 64, so we'd be missing 8
> napi_storage ?
> 
> Sorry, I'm probably just missing something about the implementation
> details you summarized above.

I wouldn't worry about it. We can add a variant of alloc_netdev_mqs()
later which takes the NAPI count explicitly. For now we can simply
assume max(rx, tx) is good enough, and maybe add a WARN_ON_ONCE() to
the set function to catch drivers which need something more complicated.

Modern NICs have far more queues than IRQs (~NAPIs).
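
To make the sizing concrete, here is a rough userspace C model of it,
not kernel code. The names model_netdev, model_alloc_configs() and
model_set_config(), and the two fields in struct napi_config, are made
up for illustration; the fprintf() merely stands in for WARN_ON_ONCE().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical per-index settings; field names are illustrative only. */
struct napi_config {
	unsigned int defer_hard_irqs;
	unsigned long gro_flush_timeout;
};

/* Userspace stand-in for the relevant bits of struct net_device. */
struct model_netdev {
	unsigned int num_rx_queues;
	unsigned int num_tx_queues;
	unsigned int num_napi_configs;
	struct napi_config *napi_configs;
};

/* Set once the out-of-range path has fired (models WARN_ON_ONCE). */
static bool warned_once;

/* Size the config array as max(rx, tx). Returns 0 on success, -1 on error. */
static int model_alloc_configs(struct model_netdev *dev,
			       unsigned int rxqs, unsigned int txqs)
{
	unsigned int n = rxqs > txqs ? rxqs : txqs;

	assert(dev);
	dev->num_napi_configs = 0;
	dev->napi_configs = NULL;
	if (!rxqs || !txqs)
		return -1;

	dev->napi_configs = calloc(n, sizeof(*dev->napi_configs));
	if (!dev->napi_configs)
		return -1;

	dev->num_rx_queues = rxqs;
	dev->num_tx_queues = txqs;
	dev->num_napi_configs = n;
	return 0;
}

/* Store config for a NAPI index; complain once if the array is too small. */
static int model_set_config(struct model_netdev *dev, unsigned int index,
			    const struct napi_config *cfg)
{
	assert(dev && cfg);
	if (index >= dev->num_napi_configs) {
		if (!warned_once) {
			warned_once = true;
			fprintf(stderr, "NAPI index %u out of range (%u configs)\n",
				index, dev->num_napi_configs);
		}
		return -1;
	}
	dev->napi_configs[index] = *cfg;
	return 0;
}
```

In this model, a driver with 8 RX and 64 TX queues and separate RX-only
and TX-only NAPIs gets 64 slots, so indexes 64..71 would hit the warning
path, which is the case an explicit NAPI count would cover.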

> > The NAPI_SET netlink op can then work based on NAPI index rather
> > than the ephemeral NAPI ID. It can apply the config to all live
> > NAPI instances with that index (of which there really should only
> > be one, unless the driver is mid-reconfiguration somehow, but even
> > that won't cause issues: we can give multiple instances the same
> > settings) and also store the user config in the array in net_device.
> 
> I understand what you are proposing. I suppose napi-get could be
> extended to include the NAPI index, too?

Yup!

> Then users could map queues to NAPI indexes (via NAPI ID)?

Yes.
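
Roughly what I have in mind for the set-by-index and restore-on-add
behavior, again as a userspace C model rather than kernel code. All
names here (model_napi, model_napi_add(), model_napi_del(),
model_napi_set(), the array sizes, the config fields) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_NAPIS   8	/* arbitrary size for the model */
#define MODEL_NUM_CONFIGS 4	/* arbitrary size for the model */

/* Hypothetical per-index settings; field names are illustrative only. */
struct napi_config {
	unsigned int defer_hard_irqs;
	unsigned long gro_flush_timeout;
};

struct model_napi {
	int live;
	unsigned int napi_id;	/* ephemeral: changes on every re-add */
	unsigned int index;	/* stable: chosen by the driver */
	struct napi_config cfg;	/* settings currently in effect */
};

struct model_netdev {
	struct napi_config configs[MODEL_NUM_CONFIGS];	/* persists across re-adds */
	struct model_napi napis[MODEL_MAX_NAPIS];
	unsigned int next_napi_id;
};

/* Create a NAPI for @index; it inherits whatever is stored for that index. */
static struct model_napi *model_napi_add(struct model_netdev *dev,
					 unsigned int index)
{
	size_t i;

	if (index >= MODEL_NUM_CONFIGS)
		return NULL;

	for (i = 0; i < MODEL_MAX_NAPIS; i++) {
		struct model_napi *n = &dev->napis[i];

		if (n->live)
			continue;
		n->live = 1;
		n->napi_id = dev->next_napi_id++;
		n->index = index;
		n->cfg = dev->configs[index];
		return n;
	}
	return NULL;
}

static void model_napi_del(struct model_napi *n)
{
	n->live = 0;
}

/*
 * Store the config for @index and push it to every live NAPI with that
 * index. Returns the number of live instances updated, or -1 if @index
 * is out of range.
 */
static int model_napi_set(struct model_netdev *dev, unsigned int index,
			  const struct napi_config *cfg)
{
	int updated = 0;
	size_t i;

	if (index >= MODEL_NUM_CONFIGS)
		return -1;

	dev->configs[index] = *cfg;
	for (i = 0; i < MODEL_MAX_NAPIS; i++) {
		if (dev->napis[i].live && dev->napis[i].index == index) {
			dev->napis[i].cfg = *cfg;
			updated++;
		}
	}
	return updated;
}
```

In the model, deleting and re-adding a NAPI with the same index yields
a new napi_id but the same settings, because the settings live in the
per-device array keyed by index rather than in the NAPI instance.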

> > When a new NAPI instance is associated with a NAPI index it should get
> > all the config associated with that index applied.
> > 
> > Thoughts? Does that make sense, and if so do you think it's an
> > over-complication?  
> 
> It feels a bit tricky, to me, as it seems there are some edge cases
> to be careful with (queue count change). I could probably give the
> implementation a try and see where I end up.
> 
> Having these settings per-NAPI would be really useful and being able
> to support IRQ suspension would be useful, too.
> 
> I think being thoughtful about how we get there is important; I'm a
> little wary of getting side tracked, but I trust your judgement and
> if you think this is worth exploring I'll think on it some more.

I understand. We can abandon it if the implementation drags out due to
various nitpicks and back-and-forths. But I don't expect much of that
🤞️
