Netdev List
 help / color / mirror / Atom feed
From: Tariq Toukan <tariqt@nvidia.com>
To: Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, <netdev@vger.kernel.org>,
	Paolo Abeni <pabeni@redhat.com>
Cc: Adithya Jayachandran <ajayachandra@nvidia.com>,
	Bobby Eshleman <bobbyeshleman@meta.com>,
	Carolina Jubran <cjubran@nvidia.com>,
	Cosmin Ratiu <cratiu@nvidia.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Daniel Jurgens <danielj@nvidia.com>,
	Daniel Zahka <daniel.zahka@gmail.com>, David Wei <dw@davidwei.uk>,
	Donald Hunter <donald.hunter@gmail.com>,
	Dragos Tatulea <dtatulea@nvidia.com>,
	Jiri Pirko <jiri@nvidia.com>, Jiri Pirko <jiri@resnulli.us>,
	Joe Damato <joe@dama.to>, Jonathan Corbet <corbet@lwn.net>,
	Kees Cook <kees@kernel.org>, Leon Romanovsky <leon@kernel.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-kselftest@vger.kernel.org>, <linux-rdma@vger.kernel.org>,
	Mark Bloch <mbloch@nvidia.com>, Moshe Shemesh <moshe@nvidia.com>,
	Or Har-Toov <ohartoov@nvidia.com>,
	Parav Pandit <parav@nvidia.com>, Petr Machata <petrm@nvidia.com>,
	Ratheesh Kannoth <rkannoth@marvell.com>,
	Saeed Mahameed <saeedm@nvidia.com>,
	Shahar Shitrit <shshitrit@nvidia.com>,
	Shay Drori <shayd@nvidia.com>, Shuah Khan <shuah@kernel.org>,
	Shuah Khan <skhan@linuxfoundation.org>,
	Simon Horman <horms@kernel.org>,
	Stanislav Fomichev <sdf@fomichev.me>,
	Tariq Toukan <tariqt@nvidia.com>,
	Willem de Bruijn <willemb@google.com>,
	Gal Pressman <gal@nvidia.com>
Subject: [PATCH net-next V10 05/14] devlink: Add parent dev to devlink API
Date: Wed, 1 Jul 2026 10:32:45 +0300	[thread overview]
Message-ID: <20260701073254.754518-6-tariqt@nvidia.com> (raw)
In-Reply-To: <20260701073254.754518-1-tariqt@nvidia.com>

From: Cosmin Ratiu <cratiu@nvidia.com>

Upcoming changes to the rate commands need the parent devlink specified.
This change adds a nested 'parent-dev' attribute to the API and helpers
to obtain and put a reference to the parent devlink instance in
info->ctx.

To avoid deadlocks, the parent devlink is unlocked before obtaining the
main devlink instance that is the target of the request.
A reference to the parent is kept until the end of the request to avoid
it suddenly disappearing.

This means that this reference is of limited use without additional
protection.

Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
 Documentation/netlink/specs/devlink.yaml | 20 +++++++++++++
 include/uapi/linux/devlink.h             |  2 ++
 net/devlink/devl_internal.h              |  3 ++
 net/devlink/netlink.c                    | 36 ++++++++++++++++++++----
 4 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index 52ad1e7805d1..13d960b3abb1 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -895,6 +895,16 @@ attribute-sets:
           resource-dump response. Bit 0 (dev) selects device-level
           resources; bit 1 (port) selects port-level resources.
           When absent all classes are returned.
+      -
+        name: parent-dev
+        type: nest
+        nested-attributes: dl-parent-dev
+        doc: |
+          Identifies the devlink instance which owns the parent rate node.
+          Used with rate-set and rate-new to parent a rate object to a node on
+          a different devlink instance, enabling cross-device rate scheduling.
+          When absent, the parent node is resolved on the same instance.
+
   -
     name: dl-dev-stats
     subset-of: devlink
@@ -1317,6 +1327,16 @@ attribute-sets:
              Specifies the bandwidth share assigned to the Traffic Class.
              The bandwidth for the traffic class is determined
              in proportion to the sum of the shares of all configured classes.
+  -
+    name: dl-parent-dev
+    subset-of: devlink
+    attributes:
+      -
+        name: bus-name
+      -
+        name: dev-name
+      -
+        name: index
 
 operations:
   enum-model: directional
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index ca713bcc47b9..a6801feb7744 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -648,6 +648,8 @@ enum devlink_attr {
 	DEVLINK_ATTR_INDEX,			/* uint */
 	DEVLINK_ATTR_RESOURCE_SCOPE_MASK,	/* u32 */
 
+	DEVLINK_ATTR_PARENT_DEV,		/* nested */
+
 	/* Add new attributes above here, update the spec in
 	 * Documentation/netlink/specs/devlink.yaml and re-generate
 	 * net/devlink/netlink_gen.c.
diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h
index 52c8bf359dd4..cdf894ba5a9d 100644
--- a/net/devlink/devl_internal.h
+++ b/net/devlink/devl_internal.h
@@ -154,6 +154,7 @@ int devlink_rel_devlink_handle_put(struct sk_buff *msg, struct devlink *devlink,
 struct devlink_nl_ctx {
 	struct devlink *devlink;
 	struct devlink_port *devlink_port;
+	struct devlink *parent_devlink;
 };
 
 static inline struct devlink_nl_ctx *
@@ -197,6 +198,8 @@ typedef int devlink_nl_dump_one_func_t(struct sk_buff *msg,
 struct devlink *
 devlink_get_from_attrs_lock(struct net *net, struct nlattr **attrs,
 			    bool dev_lock);
+struct devlink *
+devlink_get_parent_from_attrs_lock(struct net *net, struct nlattr **attrs);
 
 int devlink_nl_dumpit(struct sk_buff *msg, struct netlink_callback *cb,
 		      devlink_nl_dump_one_func_t *dump_one);
diff --git a/net/devlink/netlink.c b/net/devlink/netlink.c
index f0a857e286bc..5a057dc86b0f 100644
--- a/net/devlink/netlink.c
+++ b/net/devlink/netlink.c
@@ -12,6 +12,7 @@
 #define DEVLINK_NL_FLAG_NEED_PORT		BIT(0)
 #define DEVLINK_NL_FLAG_NEED_DEVLINK_OR_PORT	BIT(1)
 #define DEVLINK_NL_FLAG_NEED_DEV_LOCK		BIT(2)
+#define DEVLINK_NL_FLAG_OPTIONAL_PARENT_DEV	BIT(3)
 
 static const struct genl_multicast_group devlink_nl_mcgrps[] = {
 	[DEVLINK_MCGRP_CONFIG] = { .name = DEVLINK_GENL_MCGRP_CONFIG_NAME },
@@ -239,19 +240,39 @@ devlink_get_from_attrs_lock(struct net *net, struct nlattr **attrs,
 	return ERR_PTR(-ENODEV);
 }
 
+struct devlink *
+devlink_get_parent_from_attrs_lock(struct net *net, struct nlattr **attrs)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+
 static int __devlink_nl_pre_doit(struct sk_buff *skb, struct genl_info *info,
 				 u8 flags)
 {
+	bool parent_dev = flags & DEVLINK_NL_FLAG_OPTIONAL_PARENT_DEV;
 	bool dev_lock = flags & DEVLINK_NL_FLAG_NEED_DEV_LOCK;
+	struct devlink *devlink, *parent_devlink = NULL;
+	struct net *net = genl_info_net(info);
+	struct nlattr **attrs = info->attrs;
 	struct devlink_port *devlink_port;
-	struct devlink *devlink;
 	int err;
 
-	devlink = devlink_get_from_attrs_lock(genl_info_net(info), info->attrs,
-					      dev_lock);
-	if (IS_ERR(devlink))
-		return PTR_ERR(devlink);
+	if (parent_dev && attrs[DEVLINK_ATTR_PARENT_DEV]) {
+		parent_devlink = devlink_get_parent_from_attrs_lock(net, attrs);
+		if (IS_ERR(parent_devlink))
+			return PTR_ERR(parent_devlink);
+		devlink_nl_ctx(info)->parent_devlink = parent_devlink;
+		/* Drop the parent devlink lock but don't release the reference.
+		 * This will keep it alive until the end of the request.
+		 */
+		devl_unlock(parent_devlink);
+	}
 
+	devlink = devlink_get_from_attrs_lock(net, attrs, dev_lock);
+	if (IS_ERR(devlink)) {
+		err = PTR_ERR(devlink);
+		goto parent_put;
+	}
 	devlink_nl_ctx(info)->devlink = devlink;
 	if (flags & DEVLINK_NL_FLAG_NEED_PORT) {
 		devlink_port = devlink_port_get_from_info(devlink, info);
@@ -270,6 +291,9 @@ static int __devlink_nl_pre_doit(struct sk_buff *skb, struct genl_info *info,
 unlock:
 	devl_dev_unlock(devlink, dev_lock);
 	devlink_put(devlink);
+parent_put:
+	if (parent_dev && parent_devlink)
+		devlink_put(parent_devlink);
 	return err;
 }
 
@@ -307,6 +331,8 @@ static void __devlink_nl_post_doit(struct sk_buff *skb, struct genl_info *info,
 	devlink = devlink_nl_ctx(info)->devlink;
 	devl_dev_unlock(devlink, dev_lock);
 	devlink_put(devlink);
+	if (devlink_nl_ctx(info)->parent_devlink)
+		devlink_put(devlink_nl_ctx(info)->parent_devlink);
 }
 
 void devlink_nl_post_doit(const struct genl_split_ops *ops,
-- 
2.44.0


  parent reply	other threads:[~2026-07-01  7:34 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-01  7:32 [PATCH net-next V10 00/14] devlink and mlx5: Support cross-function rate scheduling Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 01/14] devlink: Update nested instance locking comment Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 02/14] devlink: Add a helper for getting a nested-in instance Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 03/14] devlink: Migrate from info->user_ptr to info->ctx Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 04/14] devlink: Decouple rate storage from associated devlink object Tariq Toukan
2026-07-01  7:32 ` Tariq Toukan [this message]
2026-07-01  7:32 ` [PATCH net-next V10 06/14] devlink: Allow parent dev for rate-set and rate-new Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 07/14] devlink: Allow rate node parents from other devlinks Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 08/14] net/mlx5: qos: Use mlx5_lag_query_bond_speed to query LAG speed Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 09/14] net/mlx5: qos: Refactor vport QoS cleanup Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 10/14] net/mlx5: qos: Model the root node in the scheduling hierarchy Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 11/14] net/mlx5: qos: Remove qos domains and use shd Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 12/14] net/mlx5: qos: Support cross-device tx scheduling Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 13/14] selftests: drv-net: Add test for cross-esw rate scheduling Tariq Toukan
2026-07-01  7:32 ` [PATCH net-next V10 14/14] net/mlx5: Document devlink rates Tariq Toukan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260701073254.754518-6-tariqt@nvidia.com \
    --to=tariqt@nvidia.com \
    --cc=ajayachandra@nvidia.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=bobbyeshleman@meta.com \
    --cc=cjubran@nvidia.com \
    --cc=corbet@lwn.net \
    --cc=cratiu@nvidia.com \
    --cc=daniel.zahka@gmail.com \
    --cc=daniel@iogearbox.net \
    --cc=danielj@nvidia.com \
    --cc=davem@davemloft.net \
    --cc=donald.hunter@gmail.com \
    --cc=dtatulea@nvidia.com \
    --cc=dw@davidwei.uk \
    --cc=edumazet@google.com \
    --cc=gal@nvidia.com \
    --cc=horms@kernel.org \
    --cc=jiri@nvidia.com \
    --cc=jiri@resnulli.us \
    --cc=joe@dama.to \
    --cc=kees@kernel.org \
    --cc=kuba@kernel.org \
    --cc=leon@kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=mbloch@nvidia.com \
    --cc=moshe@nvidia.com \
    --cc=netdev@vger.kernel.org \
    --cc=ohartoov@nvidia.com \
    --cc=pabeni@redhat.com \
    --cc=parav@nvidia.com \
    --cc=petrm@nvidia.com \
    --cc=rkannoth@marvell.com \
    --cc=saeedm@nvidia.com \
    --cc=sdf@fomichev.me \
    --cc=shayd@nvidia.com \
    --cc=shshitrit@nvidia.com \
    --cc=shuah@kernel.org \
    --cc=skhan@linuxfoundation.org \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox