Re: [RFC net-next v2 1/2] devlink: add whole device devlink instance

From: Przemek Kitszel <przemyslaw.kitszel@intel.com>
To: Jiri Pirko <jiri@resnulli.us>
Cc: <intel-wired-lan@lists.osuosl.org>,
	Tony Nguyen <anthony.l.nguyen@intel.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Cosmin Ratiu <cratiu@nvidia.com>,
	Tariq Toukan <tariqt@nvidia.com>, <netdev@vger.kernel.org>,
	Konrad Knitter <konrad.knitter@intel.com>,
	"Jacob Keller" <jacob.e.keller@intel.com>, <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>, Andrew Lunn <andrew@lunn.ch>,
	<linux-kernel@vger.kernel.org>,
	ITP Upstream <nxne.cnse.osdt.itp.upstreaming@intel.com>,
	Carolina Jubran <cjubran@nvidia.com>
Subject: Re: [RFC net-next v2 1/2] devlink: add whole device devlink instance
Date: Tue, 25 Feb 2025 16:40:49 +0100
Message-ID: <e027f9e5-ff3a-4bc1-8297-9400a4ff62a6@intel.com>
In-Reply-To: <zzyls3te4he2l5spf4wzfb53imuoemopwl774dzq5t5s22sg7l@37fk7fvgvnrr>

On 2/25/25 15:35, Jiri Pirko wrote:
> Tue, Feb 25, 2025 at 12:30:49PM +0100, przemyslaw.kitszel@intel.com wrote:
>>
>>>> Thanks to Wojciech Drewek for very nice naming of the devlink instance:
>>>> PF0:		pci/0000:00:18.0
>>>> whole-dev:	pci/0000:00:18
>>>> But I made this a param for now (driver is free to pass just "whole-dev").
>>>>
>>>> $ devlink dev # (Interesting part of output only)
>>>> pci/0000:af:00:
>>>>    nested_devlink:
>>>>      pci/0000:af:00.0
>>>>      pci/0000:af:00.1
>>>>      pci/0000:af:00.2
>>>>      pci/0000:af:00.3
>>>>      pci/0000:af:00.4
>>>>      pci/0000:af:00.5
>>>>      pci/0000:af:00.6
>>>>      pci/0000:af:00.7
>>>
>>>
>>> In general, I like this approach. In fact, I have quite similar
>>> patch/set in my sandbox git.
>>>
>>> The problem I didn't figure out how to handle, was a backing entity
>>> for the parent devlink.
>>>
>>> You use part of PCI BDF, which is obviously wrong:
>>> 1) bus_name/dev_name the user expects to be the backing device bus and
>>>      address on it (pci/usb/i2c). With using part of BDF, you break this
>>>      assumption.
>>> 2) 2 PFs can have totally different BDF (in VM for example). Then your
>>>      approach is broken.
>>
>> To make the hard part of it easy, I like to have the name to be provided
>> by what the PF/driver has available (whichever will be the first of
>> given device PFs), as of now, we resolve this issue (and provide ~what
>> your devlink_shared does) via ice_adapter.
> 
> I don't understand. Can you provide some examples please?

Right now we have one struct ice_adapter object per device/card; it is
refcounted and freed after the last PF put()s its reference.
The struct can hold a mutex or spinlock to guard shared state; an
existing example is ptp_gltsyn_time_lock in the ice driver.
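
To illustrate the get/put pattern described above, here is a minimal
user-space sketch. It is not the ice driver code: struct shared_adapter,
shared_adapter_get(), shared_adapter_put(), the fixed-size table, and
the pthread locks are all assumptions made for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-card shared object (not struct ice_adapter). */
struct shared_adapter {
	unsigned long index;      /* identifies the physical card */
	int refcount;             /* number of PFs holding a reference */
	pthread_mutex_t res_lock; /* guards state shared by all PFs */
};

#define MAX_ADAPTERS 8
static struct shared_adapter *adapters[MAX_ADAPTERS];
static pthread_mutex_t adapters_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return the adapter for @index; the first caller allocates it. */
struct shared_adapter *shared_adapter_get(unsigned long index)
{
	struct shared_adapter *a = NULL;
	int i, free_slot = -1;

	pthread_mutex_lock(&adapters_lock);
	for (i = 0; i < MAX_ADAPTERS; i++) {
		if (adapters[i] && adapters[i]->index == index) {
			a = adapters[i];
			a->refcount++;
			goto out;
		}
		if (!adapters[i] && free_slot < 0)
			free_slot = i;
	}
	if (free_slot < 0)
		goto out; /* table full, return NULL */
	a = calloc(1, sizeof(*a));
	if (!a)
		goto out;
	a->index = index;
	a->refcount = 1;
	pthread_mutex_init(&a->res_lock, NULL);
	adapters[free_slot] = a;
out:
	pthread_mutex_unlock(&adapters_lock);
	return a;
}

/* Drop one reference; the last put() frees the object. */
void shared_adapter_put(struct shared_adapter *a)
{
	int i;

	pthread_mutex_lock(&adapters_lock);
	assert(a->refcount > 0);
	if (--a->refcount == 0) {
		for (i = 0; i < MAX_ADAPTERS; i++)
			if (adapters[i] == a)
				adapters[i] = NULL;
		pthread_mutex_destroy(&a->res_lock);
		free(a);
	}
	pthread_mutex_unlock(&adapters_lock);
}
```

Every PF of the same card passes the same index and gets the same
object back; res_lock plays the role that a lock embedded in the shared
struct would play for per-card state.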

> 
> 
>>
>> Making it a devlink instance gives user an easy way to see the whole
>> picture of all resources handled as "shared per device", my current

This part is what is missing in the current devlink implementation and
would likely still be missing after your series. I would still like to
have it :)
(And the rest is sugar coating for me)

>> output, for all PFs and VFs on given device:
>>
>> pci/0000:af:00:
>>   name rss size 8 unit entry size_min 0 size_max 24 size_gran 1
>>     resources:
>>       name lut_512 size 0 unit entry size_min 0 size_max 16 size_gran 1
>>       name lut_2048 size 8 unit entry size_min 0 size_max 8 size_gran 1
>>
>> What makes this hard is that it is not just one instance for all ice
>> PFs, but one per device, which we distinguish via PCI BDF.
> 
> How?

The code is in ice_adapter_index().
Now that I get what DSN is, it looks like it could be used equally well
instead of the PCI BDF.

Still, we need more instances: each card has its own PTP clock, its
own "global RSS LUT" pool, etc.
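
A rough user-space sketch of the two ways to key a per-card instance
that are discussed here. This is not the body of ice_adapter_index():
struct pci_fn, key_from_bdf(), and key_from_dsn() are hypothetical, and
the sketch assumes that all functions of one card report the same DSN.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space view of a PCI function; not a kernel struct. */
struct pci_fn {
	uint16_t domain;
	uint8_t bus;
	uint8_t slot; /* device number, 0..31 */
	uint8_t func; /* function number, 0..7 */
	uint64_t dsn; /* Device Serial Number as reported by the function */
};

/* Key built from domain:bus:slot, deliberately dropping the function number. */
uint64_t key_from_bdf(const struct pci_fn *fn)
{
	assert(fn->slot < 32);
	return ((uint64_t)fn->domain << 16) |
	       ((uint64_t)fn->bus << 8) |
	       (uint64_t)fn->slot;
}

/* Key built from the DSN, independent of where the function is enumerated. */
uint64_t key_from_dsn(const struct pci_fn *fn)
{
	return fn->dsn;
}
```

With the BDF key, 0000:af:00.0 through 0000:af:00.7 collapse to one
instance, but two PFs of one card that show up on different buses (the
VM case raised earlier in the thread) get different keys. Under the
stated assumption, the DSN key stays the same in both cases.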

> 
> 
>>
>>>
>>> I was thinking about having an auxiliary device created for the parent,
>>> but auxiliary assumes it is the child. That is upside-down.
>>>
>>> I was thinking about having some sort of made-up per-driver bus, like
>>> "ice" or "mlx5" with something like DSN that would act as a "dev_name".
>>> I have a patch that introduces:
>>>
>>> struct devlink_shared_inst;
>>>
>>> struct devlink *devlink_shared_alloc(const struct devlink_ops *ops,
>>>                                        size_t priv_size, struct net *net,
>>>                                        struct module *module, u64 per_module_id,
>>>                                        void *inst_priv,
>>>                                        struct devlink_shared_inst **p_inst);
>>> void devlink_shared_free(struct devlink *devlink,
>>>                           struct devlink_shared_inst *inst);
>>>
>>> I took a stab at it here:
>>> https://github.com/jpirko/linux_mlxsw/commits/wip_dl_pfs_parent/
>>> The work is not finished.
>>>
>>>
>>> Also, I was thinking about having some made-up bus, like "pci_ids",
>>> where instead of BDFs as addresses, there would be DSN for example.
>>>
>>> None of these 3 is nice.
>>
>> how one would invent/infer/allocate the DSN?
> 
> Driver knows DSN, it can obtain from pci layer.

Aaach, I got the abbreviation wrong, pci_get_dsn() does the thing, thank
you. BTW, again, by Jake :D

> 
> 
>>
>> faux_bus mentioned by Jake would be about the same level of "fakeness"
>> as simply allocating a new instance of devlink by the first PF, IMO :)
> 
> Hmm, briefly looking at faux, this looks like it fills the gap I missed in
> auxdev. Will try to use it in my patchset.
> 
> Thanks!
> 
>
