From: Jiri Pirko <jiri@resnulli.us>
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, horms@kernel.org, donald.hunter@gmail.com,
	corbet@lwn.net, skhan@linuxfoundation.org, saeedm@nvidia.com,
	leon@kernel.org, tariqt@nvidia.com, mbloch@nvidia.com,
	przemyslaw.kitszel@intel.com, mschmidt@redhat.com,
	andrew+netdev@lunn.ch, rostedt@goodmis.org, mhiramat@kernel.org,
	mathieu.desnoyers@efficios.com, chuck.lever@oracle.com,
	matttbe@kernel.org, cjubran@nvidia.com, daniel.zahka@gmail.com,
	linux-doc@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org
Subject: [PATCH net-next v4 00/13] devlink: introduce shared devlink instance for PFs on same chip
Date: Thu, 12 Mar 2026 11:03:54 +0100
Message-ID: <20260312100407.551173-1-jiri@resnulli.us>

From: Jiri Pirko <jiri@nvidia.com>

Multiple PFs on a network adapter often reside on the same physical
chip, running a single firmware. Some resources and configurations
are inherently shared among these PFs - PTP clocks, VF group rates,
firmware parameters, and others. Today there is no good object in
the devlink model to attach these chip-wide configuration knobs to.
Drivers resort to workarounds like pinning shared state to PF0 or
maintaining ad-hoc internal structures (e.g., ice_adapter) that are
invisible to userspace.

This problem was discussed extensively starting with Przemek Kitszel's
"whole device devlink instance" RFC for the ice driver [1]. Several
approaches for representing the parent instance were considered:
using a partial PCI BDF as the dev_name (breaks when PFs have different
BDFs in VMs), creating a per-driver bus, using auxiliary devices, or
using faux devices. All of these required a backing struct device for
the parent devlink instance, which does not naturally exist - there is
no PCI device that represents the chip as a whole.

This patchset takes a different approach: allow devlink instances to
exist without any backing struct device. The instance is identified
purely by its internal index, exposed over devlink netlink. This avoids
fabricating fake devices and keeps the devlink handle semantics clean.

The first ten patches prepare the devlink core for device-less
instances by decoupling the handle from the parent device. The last
three introduce the shared devlink infrastructure and its first user
in the mlx5 driver.

Example output showing the shared instance and nesting:

  pci/0000:08:00.0: index 0
    nested_devlink:
      auxiliary/mlx5_core.eth.0
  devlink_index/1: index 1
    nested_devlink:
      pci/0000:08:00.0
      pci/0000:08:00.1
  auxiliary/mlx5_core.eth.0: index 2
  pci/0000:08:00.1: index 3
    nested_devlink:
      auxiliary/mlx5_core.eth.1
  auxiliary/mlx5_core.eth.1: index 4
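
As a rough illustration of the handle scheme visible in the output
above, here is a toy userspace model in C. It is not the kernel
implementation: struct toy_device, struct toy_devlink, toy_handle() and
toy_lookup_by_index() are hypothetical names invented for this sketch,
and the "devlink_index/<index>" form is taken only from the example
output. The point is that the index is always present, while the
backing device is optional.

```c
/*
 * Toy userspace model, NOT kernel code. Every name below is
 * hypothetical and exists only to illustrate the idea.
 */
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a backing device: just the two handle components. */
struct toy_device {
	const char *bus_name;
	const char *dev_name;
};

struct toy_devlink {
	unsigned int index;           /* always valid, unique per instance */
	const struct toy_device *dev; /* NULL for a device-less instance */
};

/*
 * Format the userspace-visible handle into buf. An instance with a
 * backing device uses "bus_name/dev_name"; a device-less instance
 * falls back to "devlink_index/<index>", mirroring the example output.
 * Returns whatever snprintf() returns.
 */
int toy_handle(const struct toy_devlink *dl, char *buf, size_t len)
{
	if (dl->dev)
		return snprintf(buf, len, "%s/%s",
				dl->dev->bus_name, dl->dev->dev_name);
	return snprintf(buf, len, "devlink_index/%u", dl->index);
}

/* Linear lookup by index; returns NULL when no instance matches. */
const struct toy_devlink *
toy_lookup_by_index(const struct toy_devlink *tbl, size_t n,
		    unsigned int index)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].index == index)
			return &tbl[i];
	return NULL;
}
```

With a table holding { 0, &pf0 } and { 1, NULL }, looking up index 1
yields the device-less entry, and toy_handle() formats it as
"devlink_index/1" without needing any struct device.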

[1] https://lore.kernel.org/netdev/20250219164410.35665-1-przemyslaw.kitszel@intel.com/

---
Split out from the "devlink and mlx5: Support cross-function rate
scheduling" patchset to stay within the 15-patch limit.

See individual patches for changelog.

Jiri Pirko (13):
  devlink: expose devlink instance index over netlink
  devlink: add helpers to get bus_name/dev_name
  devlink: avoid extra iterations when found devlink is not registered
  devlink: allow to use devlink index as a command handle
  devlink: support index-based lookup via bus_name/dev_name handle
  devlink: support index-based notification filtering
  devlink: introduce __devlink_alloc() with dev driver pointer
  devlink: add devlink_dev_driver_name() helper and use it in trace
    events
  devlink: add devl_warn() helper and use it in port warnings
  devlink: allow devlink instance allocation without a backing device
  devlink: introduce shared devlink instance for PFs on same chip
  documentation: networking: add shared devlink documentation
  net/mlx5: Add a shared devlink instance for PFs on same chip

 Documentation/netlink/specs/devlink.yaml      |  58 +++
 .../networking/devlink/devlink-shared.rst     |  97 +++++
 Documentation/networking/devlink/index.rst    |   1 +
 .../net/ethernet/mellanox/mlx5/core/Makefile  |   5 +-
 .../net/ethernet/mellanox/mlx5/core/main.c    |  17 +
 .../ethernet/mellanox/mlx5/core/sh_devlink.c  |  61 +++
 .../ethernet/mellanox/mlx5/core/sh_devlink.h  |  12 +
 include/linux/mlx5/driver.h                   |   1 +
 include/net/devlink.h                         |  10 +
 include/trace/events/devlink.h                |  36 +-
 include/uapi/linux/devlink.h                  |   4 +
 net/devlink/Makefile                          |   2 +-
 net/devlink/core.c                            |  91 ++++-
 net/devlink/dev.c                             |   8 +-
 net/devlink/devl_internal.h                   |  34 +-
 net/devlink/netlink.c                         |  52 ++-
 net/devlink/netlink_gen.c                     | 355 +++++++++++-------
 net/devlink/port.c                            |  19 +-
 net/devlink/sh_dev.c                          | 161 ++++++++
 19 files changed, 815 insertions(+), 209 deletions(-)
 create mode 100644 Documentation/networking/devlink/devlink-shared.rst
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sh_devlink.c
 create mode 100644 drivers/net/ethernet/mellanox/mlx5/core/sh_devlink.h
 create mode 100644 net/devlink/sh_dev.c

-- 
2.51.1

