From: Jakub Kicinski <kuba@kernel.org>
To: Tariq Toukan <tariqt@nvidia.com>
Cc: Eric Dumazet <edumazet@google.com>,
Paolo Abeni <pabeni@redhat.com>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Donald Hunter <donald.hunter@gmail.com>,
Jiri Pirko <jiri@resnulli.us>, Jonathan Corbet <corbet@lwn.net>,
Saeed Mahameed <saeedm@nvidia.com>,
"Leon Romanovsky" <leon@kernel.org>,
Mark Bloch <mbloch@nvidia.com>, <netdev@vger.kernel.org>,
<linux-kernel@vger.kernel.org>, <linux-doc@vger.kernel.org>,
<linux-rdma@vger.kernel.org>, Gal Pressman <gal@nvidia.com>,
Moshe Shemesh <moshe@nvidia.com>,
Carolina Jubran <cjubran@nvidia.com>,
Cosmin Ratiu <cratiu@nvidia.com>, Jiri Pirko <jiri@nvidia.com>,
Randy Dunlap <rdunlap@infradead.org>,
Simon Horman <horms@kernel.org>,
Krzysztof Kozlowski <krzk@kernel.org>
Subject: Re: [PATCH net-next V7 01/14] documentation: networking: add shared devlink documentation
Date: Mon, 2 Feb 2026 19:40:23 -0800
Message-ID: <20260202194023.412bb454@kernel.org>
In-Reply-To: <20260128112544.1661250-2-tariqt@nvidia.com>
On Wed, 28 Jan 2026 13:25:31 +0200 Tariq Toukan wrote:
> From: Jiri Pirko <jiri@nvidia.com>
>
> Document shared devlink instances for multiple PFs on the same chip.
> diff --git a/Documentation/networking/devlink/devlink-shared.rst b/Documentation/networking/devlink/devlink-shared.rst
> new file mode 100644
> index 000000000000..74655dc671bc
> --- /dev/null
> +++ b/Documentation/networking/devlink/devlink-shared.rst
> @@ -0,0 +1,95 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +============================
> +Devlink Shared Instances
> +============================
Shouldn't the length of the ==== lines match the title length?
> +Overview
> +========
> +
> +Shared devlink instances allow multiple physical functions (PFs) on the same
> +chip to share an additional devlink instance for chip-wide operations. This
> +is implemented within individual drivers alongside the individual PF devlink
> +instances, not replacing them.
> +
> +Multiple PFs may reside on the same physical chip, running a single firmware.
> +Some of the resources and configurations may be shared among these PFs. The
> +shared devlink instance provides an object to pin configuration knobs on.
> +
> +The shared devlink instance is backed by a faux device and provides a common
> +interface for operations that affect the entire chip rather than individual PFs.
> +A faux device is used as a backing device for the 'entire chip' since there's no
> +additional real device instantiated by hardware besides the PF devices.
There needs to be a note here clearly stating that the use of a "shared
devlink instance" is a hack for legacy drivers, and new drivers should
have a single devlink instance for the entire device. The fact that
a single instance is always preferred, and *more correct*, must be made
very clear to the reader. Ideally the single-instance, multiple-function
implementation would leverage the infra added here for collecting the
functions, however.
> +Implementation
> +==============
> +
> +Architecture
> +------------
> +
> +The implementation uses:
> +
> +* **Faux device**: Virtual device backing the shared devlink instance
"backing"? It isn't backing anything, it's just another hack because we
made the mistake of tying devlink instances to $bus/$device as an id.
Now we need a fake device to have an identifier.
> +* **Chip identification**: PFs are grouped by chip using a driver-specific identifier
> +* **Shared instance management**: Global list of shared instances with reference counting
> +
> +API Functions
> +-------------
> +
> +The following functions are provided for managing shared devlink instances:
> +
> +* ``devlink_shd_get()``: Get or create a shared devlink instance identified by a string ID
> +* ``devlink_shd_put()``: Release a reference on a shared devlink instance
> +* ``devlink_shd_get_priv()``: Get private data from shared devlink instance
> +
> +Initialization Flow
> +-------------------
> +
> +1. **PF calls shared devlink init** during driver probe
> +2. **Chip identification** using driver-specific method to determine device identity
This isn't very clear.
> +3. **Get or create shared instance** using ``devlink_shd_get()``:
Just "Call ``devlink_shd_get()`` with the identifier constructed in
step 2" (?) and then have the points below explain that it gets or
creates the instance.
> + * The function looks up existing instance by identifier
> + * If none exists, creates new instance:
> + - Creates faux device with chip identifier as name
> + - Allocates and registers devlink instance
> + - Adds to global shared instances list
> + - Increments reference count
> +
> +4. **Set nested devlink instance** for the PF devlink instance using
> + ``devl_nested_devlink_set()`` before registering the PF devlink instance
> +
> +Cleanup Flow
> +------------
> +
> +1. **Cleanup** when PF is removed
"``.remove()`` callback for a PCIe device is called"
> +2. **Call** ``devlink_shd_put()`` to release reference (decrements reference count)
> +3. **Shared instance is automatically destroyed** when the last PF removes (device list becomes empty)
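For illustration only, the get-or-create and put flows in the two lists
above can be modelled in user space roughly as below. This is a sketch
under assumptions, not the kernel code: ``shd_get()``, ``shd_put()`` and
``struct shd_inst`` are hypothetical stand-ins (the real
``devlink_shd_get()``/``devlink_shd_put()`` signatures are not shown in
this patch), a plain counter replaces ``refcount_t``, and the faux device,
devlink registration and the mutex around the list are left out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical user-space stand-in for a shared instance (not the kernel struct). */
struct shd_inst {
	struct shd_inst *next;
	char *id;          /* chip identifier chosen by the driver */
	unsigned int refs; /* one reference per attached PF */
};

/* Global list of shared instances; the patch guards its list with a mutex. */
static struct shd_inst *shd_list;

/* Look up an instance by id, creating it on first use. NULL on allocation failure. */
static struct shd_inst *shd_get(const char *id)
{
	struct shd_inst *shd;

	for (shd = shd_list; shd; shd = shd->next) {
		if (!strcmp(shd->id, id)) {
			shd->refs++;	/* existing chip: just take a reference */
			return shd;
		}
	}

	shd = calloc(1, sizeof(*shd));
	if (!shd)
		return NULL;
	shd->id = malloc(strlen(id) + 1);
	if (!shd->id) {
		free(shd);
		return NULL;
	}
	strcpy(shd->id, id);
	shd->refs = 1;		/* first PF on this chip */
	shd->next = shd_list;
	shd_list = shd;
	return shd;
}

/* Drop one reference; unlink and free the instance when the last PF goes away. */
static void shd_put(struct shd_inst *shd)
{
	struct shd_inst **pp;

	if (--shd->refs)
		return;
	for (pp = &shd_list; *pp; pp = &(*pp)->next) {
		if (*pp == shd) {
			*pp = shd->next;
			break;
		}
	}
	free(shd->id);
	free(shd);
}
```

Two PFs passing the same identifier string end up with the same object,
and the object is freed only when the second of them calls the put side.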
> +
> +Chip Identification
> +-------------------
> +
> +PFs belonging to the same chip are identified using a driver-specific method.
> +The driver is free to choose any identifier that is suitable for determining
> +whether two PFs are part of the same device. Examples include:
> +
> +* **PCI VPD serial numbers**: Extract from PCI VPD
> +* **Device tree properties**: Read chip identifier from device tree
> +* **Other hardware-specific identifiers**: Any unique identifier that groups PFs by chip
> +
> +Locking
> +-------
> +
> +A global mutex (``shd_mutex``) protects the shared instances list during
> +registration/deregistration.
> +
> +Similarly to other nested devlink instance relationships, devlink lock of
> +the shared instance should be always taken after the devlink lock of PF.
of an instance, not a PF
> +
> +Reference Counting
> +------------------
> +
> +Each shared devlink instance maintains a reference count (``refcount_t refcount``).
> +The reference count is incremented when ``devlink_shd_get()`` is called and
> +decremented when ``devlink_shd_put()`` is called. When the reference count
> +reaches zero, the shared instance is automatically destroyed.
I think AI went too far with the text generation here; this is very
obvious from the previous sections.
--
pw-bot: cr