public inbox for netdev@vger.kernel.org
From: Jiri Pirko <jiri@resnulli.us>
To: Jakub Kicinski <kuba@kernel.org>
Cc: William Tu <witu@nvidia.com>,
	Jacob Keller <jacob.e.keller@intel.com>,
	bodong@nvidia.com, jiri@nvidia.com, netdev@vger.kernel.org,
	saeedm@nvidia.com,
	"aleksander.lobakin@intel.com" <aleksander.lobakin@intel.com>
Subject: Re: [RFC PATCH v3 net-next] Documentation: devlink: Add devlink-sd
Date: Thu, 15 Feb 2024 14:19:40 +0100
Message-ID: <Zc4Pa4QWGQegN4mI@nanopsycho>
In-Reply-To: <20240208172633.010b1c3f@kernel.org>

Fri, Feb 09, 2024 at 02:26:33AM CET, kuba@kernel.org wrote:
>On Fri, 2 Feb 2024 08:46:56 +0100 Jiri Pirko wrote:
>> Fri, Feb 02, 2024 at 05:00:41AM CET, kuba@kernel.org wrote:
>> >On Thu, 1 Feb 2024 11:13:57 +0100 Jiri Pirko wrote:  
>> >> Wait a sec.  
>> >
>> >No, you wait a sec ;) Why do you think this belongs to devlink?
>> >Two months ago you were complaining bitterly when people were
>> >considering using devlink rate to control per-queue shapers.
>> >And now it's fine to add queues as a concept to devlink?  
>> 
>> Do you have a better suggestion how to model common pool object for
>> multiple netdevices? This is the reason why devlink was introduced to
>> provide a platform for common/shared things for a device that contains
>> multiple netdevs/ports/whatever. But I may be missing something here,
>> for sure.
>
>devlink just seems like the lowest common denominator, but the moment
>we start talking about multi-PF devices it also gets wobbly :(

You mean you see it as realistic to have a multi-PF device that allows
sharing the pools between the PFs? If, in theory, that exists, could this
perhaps just be treated as a limitation?


>I think it's better to focus on the object, without scoping it to some
>ancestor which may not be sufficient tomorrow (meaning its own family
>or a new object in netdev like page pool).

Ok.


>
>> >> With this API, user can configure sharing of the descriptors.
>> >> So there would be a pool (or multiple pools) of descriptors and the
>> >> descriptors could be used by many queues/representors.
>> >> 
>> >> So in the example above, for 1k representors you have only 1k
>> >> descriptors.
>> >> 
>> >> The infra allows great flexibility in terms of configuring multiple
>> >> pools of different sizes and assigning queues from representors to
>> >> different pools. So you can have multiple "classes" of representors.
>> >> For example the ones you expect heavy traffic could have a separate pool,
>> >> the rest can share another pool together, etc.  
>> >
>> >Well, it does not extend naturally to the design described in that blog
>> >post. There I only care about a netdev level pool, but every queue can
>> >bind multiple pools.
>> >
>> >It also does not cater naturally to a very interesting application
>> >of such tech to lightweight container interfaces, macvlan-offload style.
>> >As I said at the beginning, why is the pool a devlink thing if the only
>> >objects that connect to it are netdevs?  
>> 
>> Okay. Let's model it differently, no problem. I find devlink device
>> as a good fit for object to contain shared things like pools.
>> But perhaps there could be something else. Something new?
>
>We need something new for more advanced memory providers, anyway.
>The huge page example I posted a year ago needs something to get
>a huge page from CMA and slice it up for the page pools to draw from.
>That's very similar, also not really bound to a netdev. I don't think
>the cross-netdev aspect is the most important aspect of this problem.

Well, in our case, the shared entity is not floating, it is bound to a
device related to netdev.
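
To make the pool model from the quoted text above concrete, here is a
minimal sketch. It is only an illustration, not an existing devlink or
netdev API: the DescriptorPool class, the bind() method, the pool names
and sizes, and the device handle "pci/0000:08:00.0" are all hypothetical.
It shows pools that belong to one device, with representor queues
assigned either to a dedicated pool or to a shared one.

```python
class DescriptorPool:
    """A fixed-size descriptor pool owned by one device (hypothetical model)."""

    def __init__(self, device: str, name: str, size: int):
        self.device = device   # the pool is bound to a device, not floating
        self.name = name
        self.size = size       # number of descriptors in the pool
        self.queues = set()    # (representor, queue index) pairs drawing from it

    def bind(self, representor: str, queue: int) -> None:
        """Attach one representor queue to this pool."""
        self.queues.add((representor, queue))


# Two "classes" of representors on the same (hypothetical) device.
heavy = DescriptorPool("pci/0000:08:00.0", "heavy", size=4096)
shared = DescriptorPool("pci/0000:08:00.0", "shared", size=1024)

# One representor expected to carry heavy traffic gets its own pool,
# the remaining 999 share the other pool.
heavy.bind("rep0", 0)
for i in range(1, 1000):
    shared.bind(f"rep{i}", 0)

# Descriptor memory scales with the pools, not with the representor count.
total = heavy.size + shared.size
print(len(shared.queues), total)
```

The point of the sketch is only that the pool is scoped by the device
field; where that object should live (devlink, netdev genl, something
new) is exactly the open question in this thread.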


>
>> >Another netdev thing where this will be awkward is page pool
>> >integration. It lives in netdev genl, are we going to add devlink pool
>> >reference to indicate which pool a pp is feeding?  
>> 
>> Page pool is per-netdev, isn't it? It could be extended to be bound per
>> devlink-pool as you suggest. It is a bit awkward, I agree.
>> 
>> So instead of devlink, should we add the descriptor-pool object into
>> netdev genl and make it possible for multiple netdevs to use it there?
>> I would still miss the namespace of the pool, as it naturally aligns
>> with devlink device. IDK :/
>
>Maybe the first thing to iron out is the life cycle. Right now we
>throw all configuration requests at the driver which ends really badly
>for those of us who deal with heterogeneous environments. Applications
>which try to do advanced stuff like pinning and XDP break because of
>all the behavior differences between drivers. So I don't think we
>should expose configuration of unstable objects (those which user
>doesn't create explicitly - queues, irqs, page pools etc) to the driver.
>The driver should get or read the config from the core when the object
>is created.

I see. For global objects, I understand. But this is a device-specific
object and configuration. How do you tie it together?


>
>This gets back to the proposed descriptor pool because there's a
>chicken and an egg problem between creating the representors and
>creating the descriptor pool, right? Either:
> - create reprs first with individual queues, reconfigure them to bind
>   them to a pool
> - create pool first bind the reprs which don't exist to them,
>   assuming the driver somehow maintains the mapping, pretty weird
>   to configure objects which don't exist
> - create pool first, add an extra knob elsewhere (*cough* "shared-descs
>   enable") which produces somewhat loosely defined reasonable behavior
>
>Because this is a general problem (again, any queue config needs it)
>I think we'll need to create some sort of a rule engine in netdev :(
>Instead of configuring a page pool you'd add a configuration rule
>which can match on netdev and queue id and gives any related page pool
>some parameters. NAPI is another example of something user can't
>reasonably configure directly. And if we create such a rule engine 
>it should probably be shared...
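
The rule-engine idea in the quoted text above could be sketched roughly
as follows. This is only an illustration under assumptions, not a
proposed or existing netdev interface: ConfigRule, RuleEngine, the
"pool_size" parameter and the interface names are hypothetical. Rules
match on netdev and queue id, a field left as None acts as a wildcard,
and the driver reads the resolved config from the core when the object
is created.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ConfigRule:
    """One rule; match fields left as None act as wildcards."""
    params: dict
    netdev: Optional[str] = None     # hypothetical match key: interface name
    queue_id: Optional[int] = None   # hypothetical match key: queue index

    def matches(self, netdev: str, queue_id: int) -> bool:
        return (self.netdev in (None, netdev)
                and self.queue_id in (None, queue_id))


@dataclass
class RuleEngine:
    """Rules live in the core; drivers only read the resolved config."""
    rules: list = field(default_factory=list)

    def add_rule(self, rule: ConfigRule) -> None:
        self.rules.append(rule)

    def resolve(self, netdev: str, queue_id: int) -> dict:
        """Merge params of all matching rules; later rules override earlier ones."""
        resolved = {}
        for rule in self.rules:
            if rule.matches(netdev, queue_id):
                resolved.update(rule.params)
        return resolved


engine = RuleEngine()
# Rules can be added before the matching netdev or queue exists.
engine.add_rule(ConfigRule(params={"pool_size": 1024}))
engine.add_rule(ConfigRule(params={"pool_size": 4096}, netdev="eth0", queue_id=2))

# At object creation time the driver asks the core for its config.
print(engine.resolve("eth0", 2))  # {'pool_size': 4096}
print(engine.resolve("eth1", 0))  # {'pool_size': 1024}
```

Because the user configures rules rather than objects, this sidesteps
the ordering problem between creating representors and creating the
pool. The "later rule wins" merge order is just one possible choice.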
