* [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
@ 2025-04-23 13:50 Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 1/5] " Moshe Shemesh
` (5 more replies)
0 siblings, 6 replies; 25+ messages in thread
From: Moshe Shemesh @ 2025-04-23 13:50 UTC (permalink / raw)
To: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Donald Hunter, Jiri Pirko,
Jonathan Corbet, Andrew Lunn
Cc: Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
Moshe Shemesh
A function unique identifier (UID) is a vendor defined string of
arbitrary length that universally identifies a function. The function
UID can be reported by device drivers via devlink dev info command.
This patch set adds UID attribute to devlink port function that reports
the UID of the function that pertains to the devlink port. Code is also
added to mlx5 as the first user to implement this attribute.
The main purpose of adding this attribute is to allow users to
unambiguously map between a function and the devlink port that manages
it, which might be on another host.
For example, one can retrieve the UID of a function using the "devlink
dev info" command and then search for the same UID in the output of
"devlink port show" command.
The "devlink dev info" support for UID of a function is added by a
separate patchset [1]. This patchset is submitted as an RFC to
illustrate the other side of the solution.
Other existing identifiers such as serial_number or board.serial_number
are not good enough as they don't guarantee uniqueness per function. For
example, in a multi-host NIC all PFs report the same value.
Example output:
$ devlink port show pci/0000:03:00.0/327680 -jp
{
"port": {
"pci/0000:03:00.0/327680": {
"type": "eth",
"netdev": "pf0hpf",
"flavour": "pcipf",
"controller": 1,
"pfnum": 0,
"external": true,
"splittable": false,
"function": {
"hw_addr": "5c:25:73:37:70:5a",
"roce": "enable",
"max_io_eqs": 120,
"uid": "C6A76AD20605BE026D23C14E70B90704F4A5F5B3F304D83B37000732BF861D48MLNXS0D0F0"
}
}
}
}
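The lookup described above (take the UID from "devlink dev info", then
search the "devlink port show" output for it) can be sketched in
userspace C as below. This is only an illustration: struct port_entry
and find_port_by_uid() are hypothetical names, and the sketch assumes
the port handles and UIDs have already been parsed out of the JSON.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: one entry per port from "devlink port show". */
struct port_entry {
	const char *handle;	/* e.g. "pci/0000:03:00.0/327680" */
	const char *uid;	/* function UID, or NULL if not reported */
};

/*
 * Return the handle of the port whose function UID equals @uid,
 * or NULL when no port reports that UID.
 */
const char *find_port_by_uid(const struct port_entry *ports, size_t n,
			     const char *uid)
{
	size_t i;

	assert(uid != NULL);
	for (i = 0; i < n; i++) {
		/* Ports without a UID can never match. */
		if (ports[i].uid && strcmp(ports[i].uid, uid) == 0)
			return ports[i].handle;
	}
	return NULL;
}
```

An exact string comparison is used because the UID is treated as an
opaque vendor string, so no normalization is attempted.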
[1] https://lore.kernel.org/netdev/20250416214133.10582-1-jiri@resnulli.us/
Avihai Horon (5):
devlink: Add unique identifier to devlink port function
net/mlx5: Move mlx5_cmd_query_vuid() from IB to core
net/mlx5: Add vhca_id argument to mlx5_core_query_vuid()
net/mlx5: Add define for max VUID string size
net/mlx5: Expose unique identifier in devlink port function
Documentation/netlink/specs/devlink.yaml | 3 ++
.../networking/devlink/devlink-port.rst | 12 +++++++
drivers/infiniband/hw/mlx5/cmd.c | 21 ------------
drivers/infiniband/hw/mlx5/cmd.h | 2 --
drivers/infiniband/hw/mlx5/main.c | 5 +--
.../mellanox/mlx5/core/esw/devlink_port.c | 2 ++
.../net/ethernet/mellanox/mlx5/core/eswitch.h | 2 ++
.../mellanox/mlx5/core/eswitch_offloads.c | 34 +++++++++++++++++++
drivers/net/ethernet/mellanox/mlx5/core/fw.c | 22 ++++++++++++
include/linux/mlx5/driver.h | 3 ++
include/net/devlink.h | 8 +++++
include/uapi/linux/devlink.h | 1 +
net/devlink/port.c | 32 +++++++++++++++++
13 files changed, 122 insertions(+), 25 deletions(-)
--
2.27.0
* [RFC net-next 1/5] devlink: Add unique identifier to devlink port function
2025-04-23 13:50 [RFC net-next 0/5] devlink: Add unique identifier to devlink port function Moshe Shemesh
@ 2025-04-23 13:50 ` Moshe Shemesh
2025-04-28 12:33 ` Simon Horman
2025-04-23 13:50 ` [RFC net-next 2/5] net/mlx5: Move mlx5_cmd_query_vuid() from IB to core Moshe Shemesh
` (4 subsequent siblings)
5 siblings, 1 reply; 25+ messages in thread
From: Moshe Shemesh @ 2025-04-23 13:50 UTC (permalink / raw)
To: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Donald Hunter, Jiri Pirko,
Jonathan Corbet, Andrew Lunn
Cc: Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
Avihai Horon
From: Avihai Horon <avihaih@nvidia.com>
A function unique identifier (UID) is a vendor defined string of
arbitrary length that universally identifies a function. The function
UID can be reported via devlink dev info.
Add UID attribute to devlink port function that reports the UID of the
function that pertains to the devlink port.
This can be used to unambiguously map between a function and the devlink
port that manages it, and vice versa.
Example output:
$ devlink port show pci/0000:03:00.0/327680 -jp
{
"port": {
"pci/0000:03:00.0/327680": {
"type": "eth",
"netdev": "pf0hpf",
"flavour": "pcipf",
"controller": 1,
"pfnum": 0,
"external": true,
"splittable": false,
"function": {
"hw_addr": "5c:25:73:37:70:5a",
"roce": "enable",
"max_io_eqs": 120,
"uid": "C6A76AD20605BE026D23C14E70B90704F4A5F5B3F304D83B37000732BF861D48MLNXS0D0F0"
}
}
}
}
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
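Note: the fill helper in this patch treats a missing callback and an
-EOPNOTSUPP return the same way, as "nothing to report", and only
propagates other errors. A minimal userspace sketch of that control
flow follows. The names struct uid_ops, uid_fill() and the two demo
callbacks are hypothetical stand-ins, calloc() stands in for kzalloc(),
and the sketch assumes the callback NUL-terminates its output.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the two new devlink_port_ops members. */
struct uid_ops {
	int (*uid_get)(char *buf);	/* writes a NUL-terminated UID */
	size_t uid_max_size;		/* buffer size, including the NUL */
};

/* Demo driver that reports a UID. */
int demo_uid_get_ok(char *buf)
{
	strcpy(buf, "DEMO-UID");
	return 0;
}

/* Demo driver that has the op but not for this port. */
int demo_uid_get_unsupported(char *buf)
{
	(void)buf;
	return -EOPNOTSUPP;
}

/*
 * Copy the UID into @out (@out_len bytes) if one is reported.
 * Returns 0 on success or when there is nothing to report; *updated
 * is set only when a UID was actually written.
 */
int uid_fill(const struct uid_ops *ops, char *out, size_t out_len,
	     bool *updated)
{
	char *buf;
	int err;

	if (!ops->uid_get)
		return 0;		/* op not implemented: skip */

	buf = calloc(1, ops->uid_max_size);	/* zeroed scratch buffer */
	if (!buf)
		return -ENOMEM;

	err = ops->uid_get(buf);
	if (err) {
		free(buf);
		/* Unsupported on this port is not a failure. */
		return err == -EOPNOTSUPP ? 0 : err;
	}

	if (strlen(buf) + 1 > out_len) {
		free(buf);
		return -EMSGSIZE;	/* no room in the output */
	}
	strcpy(out, buf);
	*updated = true;
	free(buf);
	return 0;
}
```

The scratch buffer is freed on every path, which is the main thing to
keep in mind if more exit paths are added to the real helper.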
Documentation/netlink/specs/devlink.yaml | 3 ++
.../networking/devlink/devlink-port.rst | 12 +++++++
include/net/devlink.h | 8 +++++
include/uapi/linux/devlink.h | 1 +
net/devlink/port.c | 32 +++++++++++++++++++
5 files changed, 56 insertions(+)
diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
index bd9726269b4f..f4dade0e3c70 100644
--- a/Documentation/netlink/specs/devlink.yaml
+++ b/Documentation/netlink/specs/devlink.yaml
@@ -894,6 +894,9 @@ attribute-sets:
type: bitfield32
enum: port-fn-attr-cap
enum-as-flags: True
+ -
+ name: uid
+ type: string
-
name: dl-dpipe-tables
diff --git a/Documentation/networking/devlink/devlink-port.rst b/Documentation/networking/devlink/devlink-port.rst
index 9d22d41a7cd1..bb6f0970b322 100644
--- a/Documentation/networking/devlink/devlink-port.rst
+++ b/Documentation/networking/devlink/devlink-port.rst
@@ -328,6 +328,18 @@ interrupt vector.
function:
hw_addr 00:00:00:00:00:00 ipsec_packet disabled max_io_eqs 32
+Function unique identifier
+--------------------------
+A function unique identifier (UID) is a vendor defined string of arbitrary
+length that universally identifies a function. The function UID can be reported
+via devlink dev info.
+
+The devlink port function UID attribute reports the UID of the function that
+pertains to the devlink port.
+
+This can be used to unambiguously map between a function and the devlink port
+that manages it, and vice versa.
+
Subfunction
============
diff --git a/include/net/devlink.h b/include/net/devlink.h
index b8783126c1ed..46fd5b3f3253 100644
--- a/include/net/devlink.h
+++ b/include/net/devlink.h
@@ -1627,6 +1627,11 @@ void devlink_free(struct devlink *devlink);
* of event queues. Should be used by device drivers to
* configure maximum number of event queues
* of a function managed by the devlink port.
+ * @port_fn_uid_get: Callback used to get port function's uid. Should be used by
+ * device drivers to report the uid of the function managed by
+ * the devlink port.
+ * @port_fn_uid_max_size: The maximum size of the port function's uid including
+ * the null terminating byte.
*
* Note: Driver should return -EOPNOTSUPP if it doesn't support
* port function (@port_fn_*) handling for a particular port.
@@ -1682,6 +1687,9 @@ struct devlink_port_ops {
int (*port_fn_max_io_eqs_set)(struct devlink_port *devlink_port,
u32 max_eqs,
struct netlink_ext_ack *extack);
+ int (*port_fn_uid_get)(struct devlink_port *devlink_port, char *fuid,
+ struct netlink_ext_ack *extack);
+ size_t port_fn_uid_max_size;
};
void devlink_port_init(struct devlink *devlink,
diff --git a/include/uapi/linux/devlink.h b/include/uapi/linux/devlink.h
index 9401aa343673..7b9821433a72 100644
--- a/include/uapi/linux/devlink.h
+++ b/include/uapi/linux/devlink.h
@@ -687,6 +687,7 @@ enum devlink_port_function_attr {
DEVLINK_PORT_FN_ATTR_CAPS, /* bitfield32 */
DEVLINK_PORT_FN_ATTR_DEVLINK, /* nested */
DEVLINK_PORT_FN_ATTR_MAX_IO_EQS, /* u32 */
+ DEVLINK_PORT_FN_ATTR_UID, /* string */
__DEVLINK_PORT_FUNCTION_ATTR_MAX,
DEVLINK_PORT_FUNCTION_ATTR_MAX = __DEVLINK_PORT_FUNCTION_ATTR_MAX - 1
diff --git a/net/devlink/port.c b/net/devlink/port.c
index 939081a0e615..4d14d1bfab33 100644
--- a/net/devlink/port.c
+++ b/net/devlink/port.c
@@ -207,6 +207,35 @@ static int devlink_port_fn_max_io_eqs_fill(struct devlink_port *port,
return 0;
}
+static int devlink_port_fn_uid_fill(struct devlink_port *port,
+ struct sk_buff *msg,
+ struct netlink_ext_ack *extack,
+ bool *msg_updated)
+{
+ char *fuid;
+ int err;
+
+ if (!port->ops->port_fn_uid_get)
+ return 0;
+
+ fuid = kzalloc(port->ops->port_fn_uid_max_size, GFP_KERNEL);
+ if (!fuid)
+ return -ENOMEM;
+
+ err = port->ops->port_fn_uid_get(port, fuid, extack);
+ if (err) {
+ kfree(fuid);
+ return err == -EOPNOTSUPP ? 0 : err;
+ }
+
+ err = nla_put_string(msg, DEVLINK_PORT_FN_ATTR_UID, fuid);
+ if (!err)
+ *msg_updated = true;
+
+ kfree(fuid);
+ return err;
+}
+
int devlink_nl_port_handle_fill(struct sk_buff *msg, struct devlink_port *devlink_port)
{
if (devlink_nl_put_handle(msg, devlink_port->devlink))
@@ -468,6 +497,9 @@ devlink_nl_port_function_attrs_put(struct sk_buff *msg, struct devlink_port *por
if (err)
goto out;
err = devlink_port_fn_max_io_eqs_fill(port, msg, extack, &msg_updated);
+ if (err)
+ goto out;
+ err = devlink_port_fn_uid_fill(port, msg, extack, &msg_updated);
if (err)
goto out;
err = devlink_rel_devlink_handle_put(msg, port->devlink,
--
2.27.0
* [RFC net-next 2/5] net/mlx5: Move mlx5_cmd_query_vuid() from IB to core
2025-04-23 13:50 [RFC net-next 0/5] devlink: Add unique identifier to devlink port function Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 1/5] " Moshe Shemesh
@ 2025-04-23 13:50 ` Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 3/5] net/mlx5: Add vhca_id argument to mlx5_core_query_vuid() Moshe Shemesh
` (3 subsequent siblings)
5 siblings, 0 replies; 25+ messages in thread
From: Moshe Shemesh @ 2025-04-23 13:50 UTC (permalink / raw)
To: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Donald Hunter, Jiri Pirko,
Jonathan Corbet, Andrew Lunn
Cc: Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
Avihai Horon
From: Avihai Horon <avihaih@nvidia.com>
Querying of VUID will be needed in the following patches to get the
function unique identifier of a devlink port function.
Move the existing function mlx5_cmd_query_vuid() to fw.c so it can be
used in both core and IB. Rename it to mlx5_core_query_vuid().
No functional changes intended.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
drivers/infiniband/hw/mlx5/cmd.c | 21 -------------------
drivers/infiniband/hw/mlx5/cmd.h | 2 --
drivers/infiniband/hw/mlx5/main.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/fw.c | 22 ++++++++++++++++++++
include/linux/mlx5/driver.h | 2 ++
5 files changed, 25 insertions(+), 24 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
index 7c08e3008927..895b62cc528d 100644
--- a/drivers/infiniband/hw/mlx5/cmd.c
+++ b/drivers/infiniband/hw/mlx5/cmd.c
@@ -245,24 +245,3 @@ int mlx5_cmd_uar_dealloc(struct mlx5_core_dev *dev, u32 uarn, u16 uid)
MLX5_SET(dealloc_uar_in, in, uid, uid);
return mlx5_cmd_exec_in(dev, dealloc_uar, in);
}
-
-int mlx5_cmd_query_vuid(struct mlx5_core_dev *dev, bool data_direct,
- char *out_vuid)
-{
- u8 out[MLX5_ST_SZ_BYTES(query_vuid_out) +
- MLX5_ST_SZ_BYTES(array1024_auto)] = {};
- u8 in[MLX5_ST_SZ_BYTES(query_vuid_in)] = {};
- char *vuid;
- int err;
-
- MLX5_SET(query_vuid_in, in, opcode, MLX5_CMD_OPCODE_QUERY_VUID);
- MLX5_SET(query_vuid_in, in, vhca_id, MLX5_CAP_GEN(dev, vhca_id));
- MLX5_SET(query_vuid_in, in, data_direct, data_direct);
- err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
- if (err)
- return err;
-
- vuid = MLX5_ADDR_OF(query_vuid_out, out, vuid);
- memcpy(out_vuid, vuid, MLX5_ST_SZ_BYTES(array1024_auto));
- return 0;
-}
diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h
index e6c88b6ebd0d..e5cd31270443 100644
--- a/drivers/infiniband/hw/mlx5/cmd.h
+++ b/drivers/infiniband/hw/mlx5/cmd.h
@@ -58,6 +58,4 @@ int mlx5_cmd_mad_ifc(struct mlx5_ib_dev *dev, const void *inb, void *outb,
u16 opmod, u8 port);
int mlx5_cmd_uar_alloc(struct mlx5_core_dev *dev, u32 *uarn, u16 uid);
int mlx5_cmd_uar_dealloc(struct mlx5_core_dev *dev, u32 uarn, u16 uid);
-int mlx5_cmd_query_vuid(struct mlx5_core_dev *dev, bool data_direct,
- char *out_vuid);
#endif /* MLX5_IB_CMD_H */
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index d07cacaa0abd..d051c9d9a07d 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3594,7 +3594,7 @@ static int mlx5_ib_data_direct_init(struct mlx5_ib_dev *dev)
!MLX5_CAP_GEN_2(dev->mdev, query_vuid))
return 0;
- ret = mlx5_cmd_query_vuid(dev->mdev, true, vuid);
+ ret = mlx5_core_query_vuid(dev->mdev, true, vuid);
if (ret)
return ret;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
index 57476487e31f..beef8a279001 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
@@ -118,6 +118,28 @@ int mlx5_core_query_vendor_id(struct mlx5_core_dev *mdev, u32 *vendor_id)
}
EXPORT_SYMBOL(mlx5_core_query_vendor_id);
+int mlx5_core_query_vuid(struct mlx5_core_dev *dev, bool data_direct,
+ char *out_vuid)
+{
+ u8 out[MLX5_ST_SZ_BYTES(query_vuid_out) +
+ MLX5_ST_SZ_BYTES(array1024_auto)] = {};
+ u8 in[MLX5_ST_SZ_BYTES(query_vuid_in)] = {};
+ char *vuid;
+ int err;
+
+ MLX5_SET(query_vuid_in, in, opcode, MLX5_CMD_OPCODE_QUERY_VUID);
+ MLX5_SET(query_vuid_in, in, vhca_id, MLX5_CAP_GEN(dev, vhca_id));
+ MLX5_SET(query_vuid_in, in, data_direct, data_direct);
+ err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+ if (err)
+ return err;
+
+ vuid = MLX5_ADDR_OF(query_vuid_out, out, vuid);
+ memcpy(out_vuid, vuid, MLX5_ST_SZ_BYTES(array1024_auto));
+ return 0;
+}
+EXPORT_SYMBOL(mlx5_core_query_vuid);
+
static int mlx5_get_pcam_reg(struct mlx5_core_dev *dev)
{
return mlx5_query_pcam_reg(dev, dev->caps.pcam,
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index d1dfbad9a447..424090e62917 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1127,6 +1127,8 @@ int mlx5_blocking_notifier_call_chain(struct mlx5_core_dev *dev, unsigned int ev
void *data);
int mlx5_core_query_vendor_id(struct mlx5_core_dev *mdev, u32 *vendor_id);
+int mlx5_core_query_vuid(struct mlx5_core_dev *dev, bool data_direct,
+ char *out_vuid);
int mlx5_cmd_create_vport_lag(struct mlx5_core_dev *dev);
int mlx5_cmd_destroy_vport_lag(struct mlx5_core_dev *dev);
--
2.27.0
* [RFC net-next 3/5] net/mlx5: Add vhca_id argument to mlx5_core_query_vuid()
2025-04-23 13:50 [RFC net-next 0/5] devlink: Add unique identifier to devlink port function Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 1/5] " Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 2/5] net/mlx5: Move mlx5_cmd_query_vuid() from IB to core Moshe Shemesh
@ 2025-04-23 13:50 ` Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 4/5] net/mlx5: Add define for max VUID string size Moshe Shemesh
` (2 subsequent siblings)
5 siblings, 0 replies; 25+ messages in thread
From: Moshe Shemesh @ 2025-04-23 13:50 UTC (permalink / raw)
To: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Donald Hunter, Jiri Pirko,
Jonathan Corbet, Andrew Lunn
Cc: Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
Avihai Horon
From: Avihai Horon <avihaih@nvidia.com>
Querying VUID of a specific vhca_id will be needed in the following
patches. To accommodate it, add vhca_id argument to
mlx5_core_query_vuid().
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
drivers/infiniband/hw/mlx5/main.c | 3 ++-
drivers/net/ethernet/mellanox/mlx5/core/fw.c | 6 +++---
include/linux/mlx5/driver.h | 4 ++--
3 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index d051c9d9a07d..5ebf97475ba9 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3594,7 +3594,8 @@ static int mlx5_ib_data_direct_init(struct mlx5_ib_dev *dev)
!MLX5_CAP_GEN_2(dev->mdev, query_vuid))
return 0;
- ret = mlx5_core_query_vuid(dev->mdev, true, vuid);
+ ret = mlx5_core_query_vuid(dev->mdev, MLX5_CAP_GEN(dev->mdev, vhca_id),
+ true, vuid);
if (ret)
return ret;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
index beef8a279001..a5e56380d3ea 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
@@ -118,8 +118,8 @@ int mlx5_core_query_vendor_id(struct mlx5_core_dev *mdev, u32 *vendor_id)
}
EXPORT_SYMBOL(mlx5_core_query_vendor_id);
-int mlx5_core_query_vuid(struct mlx5_core_dev *dev, bool data_direct,
- char *out_vuid)
+int mlx5_core_query_vuid(struct mlx5_core_dev *dev, u16 vhca_id,
+ bool data_direct, char *out_vuid)
{
u8 out[MLX5_ST_SZ_BYTES(query_vuid_out) +
MLX5_ST_SZ_BYTES(array1024_auto)] = {};
@@ -128,7 +128,7 @@ int mlx5_core_query_vuid(struct mlx5_core_dev *dev, bool data_direct,
int err;
MLX5_SET(query_vuid_in, in, opcode, MLX5_CMD_OPCODE_QUERY_VUID);
- MLX5_SET(query_vuid_in, in, vhca_id, MLX5_CAP_GEN(dev, vhca_id));
+ MLX5_SET(query_vuid_in, in, vhca_id, vhca_id);
MLX5_SET(query_vuid_in, in, data_direct, data_direct);
err = mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
if (err)
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 424090e62917..575b1401c018 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1127,8 +1127,8 @@ int mlx5_blocking_notifier_call_chain(struct mlx5_core_dev *dev, unsigned int ev
void *data);
int mlx5_core_query_vendor_id(struct mlx5_core_dev *mdev, u32 *vendor_id);
-int mlx5_core_query_vuid(struct mlx5_core_dev *dev, bool data_direct,
- char *out_vuid);
+int mlx5_core_query_vuid(struct mlx5_core_dev *dev, u16 vhca_id,
+ bool data_direct, char *out_vuid);
int mlx5_cmd_create_vport_lag(struct mlx5_core_dev *dev);
int mlx5_cmd_destroy_vport_lag(struct mlx5_core_dev *dev);
--
2.27.0
* [RFC net-next 4/5] net/mlx5: Add define for max VUID string size
2025-04-23 13:50 [RFC net-next 0/5] devlink: Add unique identifier to devlink port function Moshe Shemesh
` (2 preceding siblings ...)
2025-04-23 13:50 ` [RFC net-next 3/5] net/mlx5: Add vhca_id argument to mlx5_core_query_vuid() Moshe Shemesh
@ 2025-04-23 13:50 ` Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 5/5] net/mlx5: Expose unique identifier in devlink port function Moshe Shemesh
2025-04-24 23:24 ` [RFC net-next 0/5] devlink: Add unique identifier to " Jakub Kicinski
5 siblings, 0 replies; 25+ messages in thread
From: Moshe Shemesh @ 2025-04-23 13:50 UTC (permalink / raw)
To: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Donald Hunter, Jiri Pirko,
Jonathan Corbet, Andrew Lunn
Cc: Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
Avihai Horon
From: Avihai Horon <avihaih@nvidia.com>
mlx5_core_query_vuid() puts the queried VUID in a user provided buffer
without setting a null terminating byte at the end. Thus, users that use
the VUID as a string calculate the extra byte themselves.
To make it clearer, add a define for the max VUID string size that
includes the null terminating byte.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
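Note: the reason the extra byte matters can be shown with a small
userspace sketch. The query fills a fixed-size field with memcpy() and
never writes a terminator, so the caller's buffer needs one more byte
that is already zero. RAW_ID_BYTES, ID_STR_MAX_SIZE, query_raw_id()
and get_id_string() are hypothetical names with a hypothetical size,
chosen only to mirror the shape of the define added here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RAW_ID_BYTES	8			/* hypothetical field size */
#define ID_STR_MAX_SIZE	(RAW_ID_BYTES + 1)	/* raw bytes plus the NUL */

/*
 * Fills all RAW_ID_BYTES of @out and writes no terminator, like a
 * query that memcpy()s a fixed-size firmware field.
 */
void query_raw_id(char *out)
{
	memcpy(out, "ABCDEFGH", RAW_ID_BYTES);
}

/*
 * Place the ID in @buf as a C string and return its length.
 * @buf must hold ID_STR_MAX_SIZE bytes.
 */
size_t get_id_string(char *buf)
{
	/* Zero first so the byte after the raw field is the NUL. */
	memset(buf, 0, ID_STR_MAX_SIZE);
	query_raw_id(buf);
	return strlen(buf);
}
```

With a buffer of only RAW_ID_BYTES, the strlen() call would read past
the end whenever the field is completely filled.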
drivers/infiniband/hw/mlx5/main.c | 2 +-
include/linux/mlx5/driver.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 5ebf97475ba9..b759707f5218 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -3587,7 +3587,7 @@ static bool mlx5_ib_bind_slave_port(struct mlx5_ib_dev *ibdev,
static int mlx5_ib_data_direct_init(struct mlx5_ib_dev *dev)
{
- char vuid[MLX5_ST_SZ_BYTES(array1024_auto) + 1] = {};
+ char vuid[MLX5_VUID_STR_MAX_SIZE] = {};
int ret;
if (!MLX5_CAP_GEN(dev->mdev, data_direct) ||
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 575b1401c018..e636f60a6392 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -1127,6 +1127,7 @@ int mlx5_blocking_notifier_call_chain(struct mlx5_core_dev *dev, unsigned int ev
void *data);
int mlx5_core_query_vendor_id(struct mlx5_core_dev *mdev, u32 *vendor_id);
+#define MLX5_VUID_STR_MAX_SIZE (MLX5_ST_SZ_BYTES(array1024_auto) + 1)
int mlx5_core_query_vuid(struct mlx5_core_dev *dev, u16 vhca_id,
bool data_direct, char *out_vuid);
--
2.27.0
* [RFC net-next 5/5] net/mlx5: Expose unique identifier in devlink port function
2025-04-23 13:50 [RFC net-next 0/5] devlink: Add unique identifier to devlink port function Moshe Shemesh
` (3 preceding siblings ...)
2025-04-23 13:50 ` [RFC net-next 4/5] net/mlx5: Add define for max VUID string size Moshe Shemesh
@ 2025-04-23 13:50 ` Moshe Shemesh
2025-04-24 23:24 ` [RFC net-next 0/5] devlink: Add unique identifier to " Jakub Kicinski
5 siblings, 0 replies; 25+ messages in thread
From: Moshe Shemesh @ 2025-04-23 13:50 UTC (permalink / raw)
To: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, Donald Hunter, Jiri Pirko,
Jonathan Corbet, Andrew Lunn
Cc: Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch,
Avihai Horon
From: Avihai Horon <avihaih@nvidia.com>
The devlink port function unique identifier (UID) attribute reports the
UID of the function that pertains to the devlink port.
Get the port function's VUID and report it as its unique identifier.
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
---
.../mellanox/mlx5/core/esw/devlink_port.c | 2 ++
.../net/ethernet/mellanox/mlx5/core/eswitch.h | 2 ++
.../mellanox/mlx5/core/eswitch_offloads.c | 34 +++++++++++++++++++
3 files changed, 38 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c
index b7102e14d23d..0f86c7d3df5f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/devlink_port.c
@@ -100,6 +100,8 @@ static const struct devlink_port_ops mlx5_esw_pf_vf_dl_port_ops = {
#endif /* CONFIG_XFRM_OFFLOAD */
.port_fn_max_io_eqs_get = mlx5_devlink_port_fn_max_io_eqs_get,
.port_fn_max_io_eqs_set = mlx5_devlink_port_fn_max_io_eqs_set,
+ .port_fn_uid_get = mlx5_devlink_port_fn_uid_get,
+ .port_fn_uid_max_size = MLX5_VUID_STR_MAX_SIZE,
};
static void mlx5_esw_offloads_sf_devlink_port_attrs_set(struct mlx5_eswitch *esw,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
index 8573d36785f4..bedbb4d12903 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.h
@@ -580,6 +580,8 @@ int mlx5_devlink_port_fn_max_io_eqs_set(struct devlink_port *port,
struct netlink_ext_ack *extack);
int mlx5_devlink_port_fn_max_io_eqs_set_sf_default(struct devlink_port *port,
struct netlink_ext_ack *extack);
+int mlx5_devlink_port_fn_uid_get(struct devlink_port *port, char *fuid,
+ struct netlink_ext_ack *extack);
void *mlx5_eswitch_get_uplink_priv(struct mlx5_eswitch *esw, u8 rep_type);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index a6a8eea5980c..749e2f379eb4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -4701,3 +4701,37 @@ mlx5_devlink_port_fn_max_io_eqs_set_sf_default(struct devlink_port *port,
MLX5_ESW_DEFAULT_SF_COMP_EQS,
extack);
}
+
+int mlx5_devlink_port_fn_uid_get(struct devlink_port *port, char *fuid,
+ struct netlink_ext_ack *extack)
+{
+ struct mlx5_vport *vport = mlx5_devlink_port_vport_get(port);
+ u16 vport_num = vport->vport;
+ struct mlx5_eswitch *esw;
+ u16 vhca_id;
+ int err;
+
+ if (vport_num != MLX5_VPORT_PF)
+ return -EOPNOTSUPP;
+
+ esw = mlx5_devlink_eswitch_nocheck_get(port->devlink);
+ if (!MLX5_CAP_GEN(esw->dev, vhca_resource_manager))
+ return -EOPNOTSUPP;
+
+ if (!MLX5_CAP_GEN_2(esw->dev, query_vuid))
+ return -EOPNOTSUPP;
+
+ err = mlx5_vport_get_vhca_id(esw->dev, vport_num, &vhca_id);
+ if (err) {
+ NL_SET_ERR_MSG_MOD(extack, "Failed getting vhca_id of vport");
+ return err;
+ }
+
+ err = mlx5_core_query_vuid(esw->dev, vhca_id, false, fuid);
+ if (err) {
+ NL_SET_ERR_MSG_MOD(extack, "Failed querying vuid");
+ return err;
+ }
+
+ return 0;
+}
--
2.27.0
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-04-23 13:50 [RFC net-next 0/5] devlink: Add unique identifier to devlink port function Moshe Shemesh
` (4 preceding siblings ...)
2025-04-23 13:50 ` [RFC net-next 5/5] net/mlx5: Expose unique identifier in devlink port function Moshe Shemesh
@ 2025-04-24 23:24 ` Jakub Kicinski
2025-04-25 11:26 ` Jiri Pirko
2025-04-28 12:11 ` Moshe Shemesh
5 siblings, 2 replies; 25+ messages in thread
From: Jakub Kicinski @ 2025-04-24 23:24 UTC (permalink / raw)
To: Moshe Shemesh
Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Donald Hunter, Jiri Pirko, Jonathan Corbet, Andrew Lunn,
Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch
On Wed, 23 Apr 2025 16:50:37 +0300 Moshe Shemesh wrote:
> A function unique identifier (UID) is a vendor defined string of
> arbitrary length that universally identifies a function. The function
> UID can be reported by device drivers via devlink dev info command.
>
> This patch set adds UID attribute to devlink port function that reports
> the UID of the function that pertains to the devlink port. Code is also
> added to mlx5 as the first user to implement this attribute.
>
> The main purpose of adding this attribute is to allow users to
> unambiguously map between a function and the devlink port that manages
> it, which might be on another host.
>
> For example, one can retrieve the UID of a function using the "devlink
> dev info" command and then search for the same UID in the output of
> "devlink port show" command.
>
> The "devlink dev info" support for UID of a function is added by a
> separate patchset [1]. This patchset is submitted as an RFC to
> illustrate the other side of the solution.
>
> Other existing identifiers such as serial_number or board.serial_number
> are not good enough as they don't guarantee uniqueness per function. For
> example, in a multi-host NIC all PFs report the same value.
Makes sense, tho, could you please use UUID?
Let's use industry standards when possible, not "arbitrary strings".
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-04-24 23:24 ` [RFC net-next 0/5] devlink: Add unique identifier to " Jakub Kicinski
@ 2025-04-25 11:26 ` Jiri Pirko
2025-04-25 17:51 ` Jakub Kicinski
2025-04-28 12:11 ` Moshe Shemesh
1 sibling, 1 reply; 25+ messages in thread
From: Jiri Pirko @ 2025-04-25 11:26 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jonathan Corbet, Andrew Lunn,
Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch
Fri, Apr 25, 2025 at 01:24:25AM +0200, kuba@kernel.org wrote:
>On Wed, 23 Apr 2025 16:50:37 +0300 Moshe Shemesh wrote:
>> A function unique identifier (UID) is a vendor defined string of
>> arbitrary length that universally identifies a function. The function
>> UID can be reported by device drivers via devlink dev info command.
>>
>> This patch set adds UID attribute to devlink port function that reports
>> the UID of the function that pertains to the devlink port. Code is also
>> added to mlx5 as the first user to implement this attribute.
>>
>> The main purpose of adding this attribute is to allow users to
>> unambiguously map between a function and the devlink port that manages
>> it, which might be on another host.
>>
>> For example, one can retrieve the UID of a function using the "devlink
>> dev info" command and then search for the same UID in the output of
>> "devlink port show" command.
>>
>> The "devlink dev info" support for UID of a function is added by a
>> separate patchset [1]. This patchset is submitted as an RFC to
>> illustrate the other side of the solution.
>>
>> Other existing identifiers such as serial_number or board.serial_number
>> are not good enough as they don't guarantee uniqueness per function. For
>> example, in a multi-host NIC all PFs report the same value.
>
>Makes sense, tho, could you please use UUID?
>Let's use industry standards when possible, not "arbitrary strings".
Well, you could make the same request for serial number of asic and board.
Could be uuids too, but they aren't. I mean, it makes sense to have all
uids as uuid, but since the fw already exposes couple of uids as
arbitrary strings, why this one should be treated differently all of the
sudden?
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-04-25 11:26 ` Jiri Pirko
@ 2025-04-25 17:51 ` Jakub Kicinski
2025-04-28 16:30 ` Jiri Pirko
0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2025-04-25 17:51 UTC (permalink / raw)
To: Jiri Pirko
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jonathan Corbet, Andrew Lunn,
Tariq Toukan, Saeed Mahameed, Mark Bloch
On Fri, 25 Apr 2025 13:26:01 +0200 Jiri Pirko wrote:
>> Makes sense, tho, could you please use UUID?
>> Let's use industry standards when possible, not "arbitrary strings".
>
> Well, you could make the same request for serial number of asic and board.
> Could be uuids too, but they aren't. I mean, it makes sense to have all
> uids as uuid, but since the fw already exposes couple of uids as
> arbitrary strings, why this one should be treated differently all of the
> sudden?
Are you asking me what the difference is here, or you're just telling
me that I'm wrong and inconsistent?
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-04-24 23:24 ` [RFC net-next 0/5] devlink: Add unique identifier to " Jakub Kicinski
2025-04-25 11:26 ` Jiri Pirko
@ 2025-04-28 12:11 ` Moshe Shemesh
2025-04-28 18:19 ` Jakub Kicinski
1 sibling, 1 reply; 25+ messages in thread
From: Moshe Shemesh @ 2025-04-28 12:11 UTC (permalink / raw)
To: Jakub Kicinski
Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Donald Hunter, Jiri Pirko, Jonathan Corbet, Andrew Lunn,
Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch
On 4/25/2025 2:24 AM, Jakub Kicinski wrote:
> On Wed, 23 Apr 2025 16:50:37 +0300 Moshe Shemesh wrote:
>> A function unique identifier (UID) is a vendor defined string of
>> arbitrary length that universally identifies a function. The function
>> UID can be reported by device drivers via devlink dev info command.
>>
>> This patch set adds UID attribute to devlink port function that reports
>> the UID of the function that pertains to the devlink port. Code is also
>> added to mlx5 as the first user to implement this attribute.
>>
>> The main purpose of adding this attribute is to allow users to
>> unambiguously map between a function and the devlink port that manages
>> it, which might be on another host.
>>
>> For example, one can retrieve the UID of a function using the "devlink
>> dev info" command and then search for the same UID in the output of
>> "devlink port show" command.
>>
>> The "devlink dev info" support for UID of a function is added by a
>> separate patchset [1]. This patchset is submitted as an RFC to
>> illustrate the other side of the solution.
>>
>> Other existing identifiers such as serial_number or board.serial_number
>> are not good enough as they don't guarantee uniqueness per function. For
>> example, in a multi-host NIC all PFs report the same value.
>
> Makes sense, tho, could you please use UUID?
> Let's use industry standards when possible, not "arbitrary strings".
UUID is limited, like it has to be 128 bits, while here it is variable
length up to the vendor.
We would like to keep it flexible per vendor. If vendor wants to use
UUID here, it will work too.
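
To make the size argument concrete, here is a minimal sketch (not part of the patch set) that compares a standard UUID with the vendor-defined string from the example output in the cover letter. The helper name as_uuid is hypothetical; only Python's standard uuid module is used.

```python
import uuid

# UID string copied from the example "devlink port show" output in this thread.
EXAMPLE_UID = "C6A76AD20605BE026D23C14E70B90704F4A5F5B3F304D83B37000732BF861D48MLNXS0D0F0"


def as_uuid(uid: str):
    """Return a uuid.UUID if the string is a well-formed UUID, else None."""
    try:
        return uuid.UUID(uid)
    except ValueError:
        return None


# A UUID always carries exactly 128 bits (16 bytes).
fixed = uuid.uuid4()
assert len(fixed.bytes) * 8 == 128

# The vendor-defined string from the example is longer and not purely hex,
# so it does not parse as a UUID.
assert as_uuid(EXAMPLE_UID) is None

# A vendor that chooses to report a UUID can still carry it in a string attribute.
assert as_uuid(str(fixed)) == fixed
```

The point of the sketch is only that a string attribute can hold either form, while a fixed 128-bit field cannot hold the example value.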
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 1/5] devlink: Add unique identifier to devlink port function
2025-04-23 13:50 ` [RFC net-next 1/5] " Moshe Shemesh
@ 2025-04-28 12:33 ` Simon Horman
2025-04-29 9:33 ` Avihai Horon
0 siblings, 1 reply; 25+ messages in thread
From: Simon Horman @ 2025-04-28 12:33 UTC (permalink / raw)
To: Moshe Shemesh
Cc: netdev, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Donald Hunter, Jiri Pirko, Jonathan Corbet,
Andrew Lunn, Tariq Toukan, Saeed Mahameed, Leon Romanovsky,
Mark Bloch, Avihai Horon
On Wed, Apr 23, 2025 at 04:50:38PM +0300, Moshe Shemesh wrote:
> From: Avihai Horon <avihaih@nvidia.com>
>
> A function unique identifier (UID) is a vendor defined string of
> arbitrary length that universally identifies a function. The function
> UID can be reported via devlink dev info.
>
> Add UID attribute to devlink port function that reports the UID of the
> function that pertains to the devlink port.
>
> This can be used to unambiguously map between a function and the devlink
> port that manages it, and vice versa.
>
> Example output:
>
> $ devlink port show pci/0000:03:00.0/327680 -jp
> {
> "port": {
> "pci/0000:03:00.0/327680": {
> "type": "eth",
> "netdev": "pf0hpf",
> "flavour": "pcipf",
> "controller": 1,
> "pfnum": 0,
> "external": true,
> "splittable": false,
> "function": {
> "hw_addr": "5c:25:73:37:70:5a",
> "roce": "enable",
> "max_io_eqs": 120,
> "uid": "C6A76AD20605BE026D23C14E70B90704F4A5F5B3F304D83B37000732BF861D48MLNXS0D0F0"
> }
> }
> }
> }
>
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> ---
> Documentation/netlink/specs/devlink.yaml | 3 ++
> .../networking/devlink/devlink-port.rst | 12 +++++++
> include/net/devlink.h | 8 +++++
> include/uapi/linux/devlink.h | 1 +
> net/devlink/port.c | 32 +++++++++++++++++++
> 5 files changed, 56 insertions(+)
>
> diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
> index bd9726269b4f..f4dade0e3c70 100644
> --- a/Documentation/netlink/specs/devlink.yaml
> +++ b/Documentation/netlink/specs/devlink.yaml
> @@ -894,6 +894,9 @@ attribute-sets:
> type: bitfield32
> enum: port-fn-attr-cap
> enum-as-flags: True
> + -
> + name: uid
> + type: string
>
> -
> name: dl-dpipe-tables
Hi Avihai,
With this patch, after running tools/net/ynl/ynl-regen.sh -f, I see the
following when I run git diff. So I think this patch needs these changes
too.
diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
index f9786d51f..1dc90bde8 100644
--- a/net/devlink/netlink_gen.c
+++ b/net/devlink/netlink_gen.c
@@ -11,11 +11,12 @@
#include <uapi/linux/devlink.h>
/* Common nested types */
-const struct nla_policy devlink_dl_port_function_nl_policy[DEVLINK_PORT_FN_ATTR_CAPS + 1] = {
+const struct nla_policy devlink_dl_port_function_nl_policy[DEVLINK_PORT_FN_ATTR_UID + 1] = {
[DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR] = { .type = NLA_BINARY, },
[DEVLINK_PORT_FN_ATTR_STATE] = NLA_POLICY_MAX(NLA_U8, 1),
[DEVLINK_PORT_FN_ATTR_OPSTATE] = NLA_POLICY_MAX(NLA_U8, 1),
[DEVLINK_PORT_FN_ATTR_CAPS] = NLA_POLICY_BITFIELD32(15),
+ [DEVLINK_PORT_FN_ATTR_UID] = { .type = NLA_NUL_STRING, },
};
const struct nla_policy devlink_dl_selftest_id_nl_policy[DEVLINK_ATTR_SELFTEST_ID_FLASH + 1] = {
diff --git a/net/devlink/netlink_gen.h b/net/devlink/netlink_gen.h
index 8f2bd50dd..3a12c18c6 100644
--- a/net/devlink/netlink_gen.h
+++ b/net/devlink/netlink_gen.h
@@ -12,7 +12,7 @@
#include <uapi/linux/devlink.h>
/* Common nested types */
-extern const struct nla_policy devlink_dl_port_function_nl_policy[DEVLINK_PORT_FN_ATTR_CAPS + 1];
+extern const struct nla_policy devlink_dl_port_function_nl_policy[DEVLINK_PORT_FN_ATTR_UID + 1];
extern const struct nla_policy devlink_dl_selftest_id_nl_policy[DEVLINK_ATTR_SELFTEST_ID_FLASH + 1];
/* Ops table for devlink */
...
^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-04-25 17:51 ` Jakub Kicinski
@ 2025-04-28 16:30 ` Jiri Pirko
0 siblings, 0 replies; 25+ messages in thread
From: Jiri Pirko @ 2025-04-28 16:30 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jonathan Corbet, Andrew Lunn,
Tariq Toukan, Saeed Mahameed, Mark Bloch
Fri, Apr 25, 2025 at 07:51:45PM +0200, kuba@kernel.org wrote:
>On Fri, 25 Apr 2025 13:26:01 +0200 Jiri Pirko wrote:
>>> Makes sense, tho, could you please use UUID?
>>> Let's use industry standards when possible, not "arbitrary strings".
>>
>> Well, you could make the same request for serial number of asic and board.
>> Could be uuids too, but they aren't. I mean, it makes sense to have all
>> uids as uuid, but since the fw already exposes couple of uids as
>> arbitrary strings, why this one should be treated differently all of the
>> sudden?
>
>Are you asking me what the difference is here, or you're just telling
>me that I'm wrong and inconsistent?
Both I guess? I'm just trying to understand the rationale behind the
request, that's all.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-04-28 12:11 ` Moshe Shemesh
@ 2025-04-28 18:19 ` Jakub Kicinski
2025-04-29 8:37 ` Moshe Shemesh
0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2025-04-28 18:19 UTC (permalink / raw)
To: Moshe Shemesh
Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Donald Hunter, Jiri Pirko, Jonathan Corbet, Andrew Lunn,
Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch
On Mon, 28 Apr 2025 15:11:04 +0300 Moshe Shemesh wrote:
> > Makes sense, tho, could you please use UUID?
> > Let's use industry standards when possible, not "arbitrary strings".
>
> UUID is limited, like it has to be 128 bits, while here it is variable
> length up to the vendor.
> We would like to keep it flexible per vendor. If vendor wants to use
> UUID here, it will work too.
Could you please provide at least one clear user scenario for
the discussion? Matching up the ports to function is presumably
a means to an end for the user.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-04-28 18:19 ` Jakub Kicinski
@ 2025-04-29 8:37 ` Moshe Shemesh
2025-05-02 0:39 ` Jakub Kicinski
0 siblings, 1 reply; 25+ messages in thread
From: Moshe Shemesh @ 2025-04-29 8:37 UTC (permalink / raw)
To: Jakub Kicinski
Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Donald Hunter, Jiri Pirko, Jonathan Corbet, Andrew Lunn,
Tariq Toukan, Saeed Mahameed, Leon Romanovsky, Mark Bloch
On 4/28/2025 9:19 PM, Jakub Kicinski wrote:
>
> On Mon, 28 Apr 2025 15:11:04 +0300 Moshe Shemesh wrote:
>>> Makes sense, tho, could you please use UUID?
>>> Let's use industry standards when possible, not "arbitrary strings".
>>
>> UUID is limited, like it has to be 128 bits, while here it is variable
>> length up to the vendor.
>> We would like to keep it flexible per vendor. If vendor wants to use
>> UUID here, it will work too.
>
> Could you please provide at least one clear user scenario for
> the discussion? Matching up the ports to function is presumably
> a means to an end for the user.
Sure. Multi-host system with smart-NIC, on the smart-NIC internal host
we will see a representor for each PF on each of the external hosts.
However, we can't tell which representor belongs to which host.
Actually, each host doesn't know about the others or where it is in the
topology. The function uid can help the user match the host PF to the
representor on the smart-NIC internal host and use the right representor
to config the required host function.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 1/5] devlink: Add unique identifier to devlink port function
2025-04-28 12:33 ` Simon Horman
@ 2025-04-29 9:33 ` Avihai Horon
0 siblings, 0 replies; 25+ messages in thread
From: Avihai Horon @ 2025-04-29 9:33 UTC (permalink / raw)
To: Simon Horman, Moshe Shemesh
Cc: netdev@vger.kernel.org, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Donald Hunter, Jiri Pirko,
Jonathan Corbet, Andrew Lunn, Tariq Toukan, Saeed Mahameed,
Leon Romanovsky, Mark Bloch
On 28/04/2025 15:33, Simon Horman wrote:
> On Wed, Apr 23, 2025 at 04:50:38PM +0300, Moshe Shemesh wrote:
>> From: Avihai Horon <avihaih@nvidia.com>
>>
>> A function unique identifier (UID) is a vendor defined string of
>> arbitrary length that universally identifies a function. The function
>> UID can be reported via devlink dev info.
>>
>> Add UID attribute to devlink port function that reports the UID of the
>> function that pertains to the devlink port.
>>
>> This can be used to unambiguously map between a function and the devlink
>> port that manages it, and vice versa.
>>
>> Example output:
>>
>> $ devlink port show pci/0000:03:00.0/327680 -jp
>> {
>> "port": {
>> "pci/0000:03:00.0/327680": {
>> "type": "eth",
>> "netdev": "pf0hpf",
>> "flavour": "pcipf",
>> "controller": 1,
>> "pfnum": 0,
>> "external": true,
>> "splittable": false,
>> "function": {
>> "hw_addr": "5c:25:73:37:70:5a",
>> "roce": "enable",
>> "max_io_eqs": 120,
>> "uid": "C6A76AD20605BE026D23C14E70B90704F4A5F5B3F304D83B37000732BF861D48MLNXS0D0F0"
>> }
>> }
>> }
>> }
>>
>> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
>> ---
>> Documentation/netlink/specs/devlink.yaml | 3 ++
>> .../networking/devlink/devlink-port.rst | 12 +++++++
>> include/net/devlink.h | 8 +++++
>> include/uapi/linux/devlink.h | 1 +
>> net/devlink/port.c | 32 +++++++++++++++++++
>> 5 files changed, 56 insertions(+)
>>
>> diff --git a/Documentation/netlink/specs/devlink.yaml b/Documentation/netlink/specs/devlink.yaml
>> index bd9726269b4f..f4dade0e3c70 100644
>> --- a/Documentation/netlink/specs/devlink.yaml
>> +++ b/Documentation/netlink/specs/devlink.yaml
>> @@ -894,6 +894,9 @@ attribute-sets:
>> type: bitfield32
>> enum: port-fn-attr-cap
>> enum-as-flags: True
>> + -
>> + name: uid
>> + type: string
>>
>> -
>> name: dl-dpipe-tables
> Hi Avihai,
>
> With this patch, after running tools/net/ynl/ynl-regen.sh -f, I see the
> following when I run git diff. So I think this patch needs these changes
> too.
Oh, right, I will add these in next version.
Thanks!
>
> diff --git a/net/devlink/netlink_gen.c b/net/devlink/netlink_gen.c
> index f9786d51f..1dc90bde8 100644
> --- a/net/devlink/netlink_gen.c
> +++ b/net/devlink/netlink_gen.c
> @@ -11,11 +11,12 @@
> #include <uapi/linux/devlink.h>
>
> /* Common nested types */
> -const struct nla_policy devlink_dl_port_function_nl_policy[DEVLINK_PORT_FN_ATTR_CAPS + 1] = {
> +const struct nla_policy devlink_dl_port_function_nl_policy[DEVLINK_PORT_FN_ATTR_UID + 1] = {
> [DEVLINK_PORT_FUNCTION_ATTR_HW_ADDR] = { .type = NLA_BINARY, },
> [DEVLINK_PORT_FN_ATTR_STATE] = NLA_POLICY_MAX(NLA_U8, 1),
> [DEVLINK_PORT_FN_ATTR_OPSTATE] = NLA_POLICY_MAX(NLA_U8, 1),
> [DEVLINK_PORT_FN_ATTR_CAPS] = NLA_POLICY_BITFIELD32(15),
> + [DEVLINK_PORT_FN_ATTR_UID] = { .type = NLA_NUL_STRING, },
> };
>
> const struct nla_policy devlink_dl_selftest_id_nl_policy[DEVLINK_ATTR_SELFTEST_ID_FLASH + 1] = {
> diff --git a/net/devlink/netlink_gen.h b/net/devlink/netlink_gen.h
> index 8f2bd50dd..3a12c18c6 100644
> --- a/net/devlink/netlink_gen.h
> +++ b/net/devlink/netlink_gen.h
> @@ -12,7 +12,7 @@
> #include <uapi/linux/devlink.h>
>
> /* Common nested types */
> -extern const struct nla_policy devlink_dl_port_function_nl_policy[DEVLINK_PORT_FN_ATTR_CAPS + 1];
> +extern const struct nla_policy devlink_dl_port_function_nl_policy[DEVLINK_PORT_FN_ATTR_UID + 1];
> extern const struct nla_policy devlink_dl_selftest_id_nl_policy[DEVLINK_ATTR_SELFTEST_ID_FLASH + 1];
>
> /* Ops table for devlink */
>
> ...
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-04-29 8:37 ` Moshe Shemesh
@ 2025-05-02 0:39 ` Jakub Kicinski
2025-05-04 17:46 ` Mark Bloch
0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2025-05-02 0:39 UTC (permalink / raw)
To: Moshe Shemesh
Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Donald Hunter, Jiri Pirko, Jonathan Corbet, Andrew Lunn,
Tariq Toukan, Mark Bloch
On Tue, 29 Apr 2025 11:37:51 +0300 Moshe Shemesh wrote:
> >> UUID is limited, like it has to be 128 bits, while here it is variable
> >> length up to the vendor.
> >> We would like to keep it flexible per vendor. If vendor wants to use
> >> UUID here, it will work too.
> >
> > Could you please provide at least one clear user scenario for
> > the discussion? Matching up the ports to function is presumably
> > a means to an end for the user.
>
> Sure. Multi-host system with smart-NIC, on the smart-NIC internal host
> we will see a representor for each PF on each of the external hosts.
> However, we can't tell which representor belongs to which host.
> Actually, each host doesn't know about the others or where it is in the
> topology. The function uid can help the user match the host PF to the
> representor on the smart-NIC internal host and use the right representor
> to config the required host function.
Insufficient information. There are many many hosts deployed with
multi-host NICs which do not need this sort of matching. I'm not
saying you don't have a use case. I'm saying you haven't explained it.
We exchanged so many emails on this topic, counting the emails with
Jiri. And you still haven't explained to me the use case. This is
ridiculous.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-05-02 0:39 ` Jakub Kicinski
@ 2025-05-04 17:46 ` Mark Bloch
2025-05-05 18:55 ` Jakub Kicinski
0 siblings, 1 reply; 25+ messages in thread
From: Mark Bloch @ 2025-05-04 17:46 UTC (permalink / raw)
To: Jakub Kicinski, Moshe Shemesh
Cc: netdev, David S. Miller, Eric Dumazet, Paolo Abeni, Simon Horman,
Donald Hunter, Jiri Pirko, Jonathan Corbet, Andrew Lunn,
Tariq Toukan
On 02/05/2025 3:39, Jakub Kicinski wrote:
> On Tue, 29 Apr 2025 11:37:51 +0300 Moshe Shemesh wrote:
>>>> UUID is limited, like it has to be 128 bits, while here it is variable
>>>> length up to the vendor.
>>>> We would like to keep it flexible per vendor. If vendor wants to use
>>>> UUID here, it will work too.
>>>
>>> Could you please provide at least one clear user scenario for
>>> the discussion? Matching up the ports to function is presumably
>>> a means to an end for the user.
>>
>> Sure. Multi-host system with smart-NIC, on the smart-NIC internal host
>> we will see a representor for each PF on each of the external hosts.
>> However, we can't tell which representor belongs to which host.
>> Actually, each host doesn't know about the others or where it is in the
>> topology. The function uid can help the user match the host PF to the
>> representor on the smart-NIC internal host and use the right representor
>> to config the required host function.
>
> Insufficient information. There are many many hosts deployed with
> multi-host NICs which do not need this sort of matching. I'm not
> saying you don't have a use case. I'm saying you haven't explained it.
>
> We exchanged so many emails on this topic, counting the emails with
> Jiri. And you still haven't explained to me the use case. This is
> ridiculous.
Hi Jakub,
I'll try to explain the use case more clearly. I realize that some
internal context at NVIDIA may not be obvious externally, and we sometimes
take that for granted.
We're dealing with a multi-host system using a DPU (smart-NIC). In such a
system, each external (x86) host has its own PFs/VFs/SFs, but the E-Switch
manager for each PF resides on the DPU's ARM core (the internal host).
To illustrate, consider a system with two external hosts:
Host 1: PF0
VF0 on PF0
Host 2: PF0
VF0 on PF0
Each host is unaware that it's part of a multi-host system. Internally,
each sees its PF simply as PF0, with no notion of the global topology.
On the DPU (ARM), we see representors for each BDF. For simplicity,
assume each BDF corresponds to a single devlink port. So the ARM would
expose:
PF0_HOST0_REP
UPLINK0_REP
PF0_HOST1_REP
UPLINK1_REP
In devlink terms, we're referring to the c argument in phys_port_name,
which represents the controller, effectively indicating which host
the BDF belongs to.
The problem we're addressing is matching the PF seen on a host to its
corresponding representor on the DPU. From the ARM side, we know that
this rep X belongs to pf0 on host y, but we don't know which host is which.
From within each host, you can't tell which host you are, because all
see their PF as PF0.
With the proposed feature (along with Jiri's changes), this becomes
trivial, you just match the function UID and you're done.
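
As a rough illustration of that matching step, here is a sketch of what a user-space tool on the DPU might do. It assumes the port JSON keeps the layout from the example in the cover letter (a "function" object with a "uid" key). The UID values, the second port handle and the helper name find_port_by_uid are made up for illustration, and how the host obtains its own UID is left out.

```python
import json

# Trimmed "devlink port show -jp" style output as seen on the DPU. The layout
# and the "uid" key follow the example in the cover letter; the UID values and
# the second port handle are hypothetical.
DPU_PORTS_JSON = """
{
  "port": {
    "pci/0000:03:00.0/327680": {
      "flavour": "pcipf", "controller": 1, "pfnum": 0, "external": true,
      "function": {"uid": "UID-OF-HOST-A-PF0"}
    },
    "pci/0000:03:00.0/393216": {
      "flavour": "pcipf", "controller": 2, "pfnum": 0, "external": true,
      "function": {"uid": "UID-OF-HOST-B-PF0"}
    }
  }
}
"""


def find_port_by_uid(ports_json: str, uid: str):
    """Return (port handle, attributes) of the port whose function UID equals uid, else None."""
    ports = json.loads(ports_json)["port"]
    for handle, attrs in ports.items():
        if attrs.get("function", {}).get("uid") == uid:
            return handle, attrs
    return None


# The host only knows its own function UID; both hosts see their PF as PF0.
handle, attrs = find_port_by_uid(DPU_PORTS_JSON, "UID-OF-HOST-B-PF0")
print(handle, "controller", attrs["controller"])
```

Both ports report pfnum 0, so the UID is the only field in this sketch that tells the two hosts apart.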
As a side note, I believe this feature has merit even beyond this
specific use case. It makes the mapping between representors and what
they represent more explicit and straightforward, which is always a
good thing from a usability and clarity standpoint.
Mark
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-05-04 17:46 ` Mark Bloch
@ 2025-05-05 18:55 ` Jakub Kicinski
2025-05-06 11:25 ` Mark Bloch
0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2025-05-05 18:55 UTC (permalink / raw)
To: Mark Bloch
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jiri Pirko, Jonathan Corbet,
Andrew Lunn, Tariq Toukan
On Sun, 4 May 2025 20:46:51 +0300 Mark Bloch wrote:
> On the DPU (ARM), we see representors for each BDF. For simplicity,
> assume each BDF corresponds to a single devlink port. So the ARM would
> expose:
>
> PF0_HOST0_REP
> UPLINK0_REP
> PF0_HOST1_REP
> UPLINK1_REP
>
> In devlink terms, we're referring to the c argument in phys_port_name,
> which represents the controller, effectively indicating which host
> the BDF belongs to.
>
> The problem we're addressing is matching the PF seen on a host to its
> corresponding representor on the DPU. From the ARM side, we know that
> this rep X belongs to pf0 on host y, but we don't know which host is which.
> From within each host, you can't tell which host you are, because all
> see their PF as PF0.
>
> With the proposed feature (along with Jiri's changes), this becomes
> trivial, you just match the function UID and you're done.
Thanks for explaining the setup. Could you please explain the user
scenario now? Perhaps thinking of it as a sequence diagram would
be helpful, but whatever is easiest, just make it concrete.
> As a side note, I believe this feature has merit even beyond this
> specific use case.
I also had that belief when I implemented something similar for the NFP
a long time ago. Jiri didn't like the solution / understand the problem
at the time. But it turned out not to matter in practice.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-05-05 18:55 ` Jakub Kicinski
@ 2025-05-06 11:25 ` Mark Bloch
2025-05-06 15:20 ` Jakub Kicinski
0 siblings, 1 reply; 25+ messages in thread
From: Mark Bloch @ 2025-05-06 11:25 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jiri Pirko, Jonathan Corbet,
Andrew Lunn, Tariq Toukan
On 05/05/2025 21:55, Jakub Kicinski wrote:
> On Sun, 4 May 2025 20:46:51 +0300 Mark Bloch wrote:
>> On the DPU (ARM), we see representors for each BDF. For simplicity,
>> assume each BDF corresponds to a single devlink port. So the ARM would
>> expose:
>>
>> PF0_HOST0_REP
>> UPLINK0_REP
>> PF0_HOST1_REP
>> UPLINK1_REP
>>
>> In devlink terms, we're referring to the c argument in phys_port_name,
>> which represents the controller, effectively indicating which host
>> the BDF belongs to.
>>
>> The problem we're addressing is matching the PF seen on a host to its
>> corresponding representor on the DPU. From the ARM side, we know that
>> this rep X belongs to pf0 on host y, but we don't know which host is which.
>> From within each host, you can't tell which host you are, because all
>> see their PF as PF0.
>>
>> With the proposed feature (along with Jiri's changes), this becomes
>> trivial, you just match the function UID and you're done.
>
> Thanks for explaining the setup. Could you please explain the user
> scenario now? Perhaps thinking of it as a sequence diagram would
> be helpful, but whatever is easiest, just make it concrete.
>
It's a rough flow, but I believe it clearly illustrates the use case
we're targeting:
Some system configuration info:
- A static mapping file exists that defines the relationship between
a host and the corresponding ARM/DPU host that manages it.
- OVN, OVS and Kubernetes are used to manage network connectivity and
resource allocation.
Flow:
1. A user requests a container with networking connectivity.
2. Kubernetes allocates a VF on host X. An agent on the host handles VF
configuration and sends the PF number and VF index to the central
management software.
3. An agent on the DPU side detects the changes made on host X. Using
the PF number and VF index, it identifies the corresponding
representor, attaches it to an OVS bridge, and allows OVN to program
the relevant steering rules.
This setup works well when the mapping file defines a one-to-one
relationship between a host and a single ARM/DPU host.
It's already supported in upstream today [1]
However, in a slightly more generic scenario like:
Control Host A: External host X
External host Y
A single ARM/DPU host manages multiple external hosts. In this case, step
2, where only the PF number and VF index are sent, is insufficient. During
step 3, the agent on the DPU reads the data but cannot determine which
external host created the VF. As a result, it cannot correctly associate
the representor with the appropriate OVS bridge.
To resolve this, we plan to modify step 2 to include the VUID along with
the PF number and VF index. The DPU-side agent will use the VUID to match
it with the FUID, identify the correct PF representor, and then use
standard devlink mechanisms to locate the corresponding VF representor.
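
The following sketch shows the difference between the two flows from the point of view of a DPU-side agent. It is an illustration only: the port records, netdev names and UID values are hypothetical, the keys mirror devlink port attributes (flavour, controller, pfnum, vfnum, function uid), and the real agent logic in ovn-kubernetes is not reproduced here.

```python
# Port records as a DPU-side agent might hold them after parsing
# "devlink port show -j". All values are hypothetical.
PORTS = [
    {"netdev": "pf0hpf_c1", "flavour": "pcipf", "controller": 1, "pfnum": 0, "uid": "UID-HOST-X"},
    {"netdev": "pf0vf0_c1", "flavour": "pcivf", "controller": 1, "pfnum": 0, "vfnum": 0},
    {"netdev": "pf0hpf_c2", "flavour": "pcipf", "controller": 2, "pfnum": 0, "uid": "UID-HOST-Y"},
    {"netdev": "pf0vf0_c2", "flavour": "pcivf", "controller": 2, "pfnum": 0, "vfnum": 0},
]


def vf_candidates(ports, pfnum, vfnum):
    """Current step 2: only the PF number and VF index are known."""
    return [p for p in ports
            if p["flavour"] == "pcivf" and p["pfnum"] == pfnum and p["vfnum"] == vfnum]


def vf_representor(ports, uid, pfnum, vfnum):
    """Proposed flow: the UID selects the PF representor, which pins the controller."""
    pf = next((p for p in ports
               if p["flavour"] == "pcipf" and p.get("uid") == uid and p["pfnum"] == pfnum),
              None)
    if pf is None:
        return None
    matches = [p for p in vf_candidates(ports, pfnum, vfnum)
               if p["controller"] == pf["controller"]]
    return matches[0] if len(matches) == 1 else None


# PF number and VF index alone match one representor per external host.
assert len(vf_candidates(PORTS, 0, 0)) == 2
# Adding the UID narrows the result to a single representor.
assert vf_representor(PORTS, "UID-HOST-Y", 0, 0)["netdev"] == "pf0vf0_c2"
```

In this sketch the UID is used only to find the PF representor and its controller number; the VF representor is then found through the existing controller/pfnum/vfnum attributes.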
1: https://github.com/ovn-kubernetes/ovn-kubernetes
You can look at: go-controller/pkg/util/dpu_annotations.go for more info.
Mark
>> As a side note, I believe this feature has merit even beyond this
>> specific use case.
>
> I also had that belief when I implemented something similar for the NFP
> a long time ago. Jiri didn't like the solution / understand the problem
> at the time. But it turned out not to matter in practice.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-05-06 11:25 ` Mark Bloch
@ 2025-05-06 15:20 ` Jakub Kicinski
2025-05-06 15:34 ` Mark Bloch
0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2025-05-06 15:20 UTC (permalink / raw)
To: Mark Bloch
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jiri Pirko, Jonathan Corbet,
Andrew Lunn, Tariq Toukan
On Tue, 6 May 2025 14:25:10 +0300 Mark Bloch wrote:
> > Thanks for explaining the setup. Could you please explain the user
> > scenario now? Perhaps thinking of it as a sequence diagram would
> > be helpful, but whatever is easiest, just make it concrete.
> >
>
> It's a rough flow, but I believe it clearly illustrates the use case
> we're targeting:
>
> Some system configuration info:
>
> - A static mapping file exists that defines the relationship between
> a host and the corresponding ARM/DPU host that manages it.
>
> - OVN, OVS and Kubernetes are used to manage network connectivity and
> resource allocation.
>
> Flow:
> 1. A user requests a container with networking connectivity.
> 2. Kubernetes allocates a VF on host X. An agent on the host handles VF
> configuration and sends the PF number and VF index to the central
> management software.
What is "central management software" here? Deployment specific or
some part of k8s?
> 3. An agent on the DPU side detects the changes made on host X. Using
> the PF number and VF index, it identifies the corresponding
> representor, attaches it to an OVS bridge, and allows OVN to program
> the relevant steering rules.
What does it mean that DPU "detects it", what's the source and
mechanism of the notification?
Is it communicating with the central SW during the process?
> This setup works well when the mapping file defines a one-to-one
> relationship between a host and a single ARM/DPU host.
> It's already supported in upstream today [1]
>
> However, in a slightly more generic scenario like:
>
> Control Host A: External host X
> External host Y
>
> A single ARM/DPU host manages multiple external hosts. In this case, step
> 2, where only the PF number and VF index are sent, is insufficient. During
> step 3, the agent on the DPU reads the data but cannot determine which
> external host created the VF. As a result, it cannot correctly associate
> the representor with the appropriate OVS bridge.
>
> To resolve this, we plan to modify step 2 to include the VUID along with
> the PF number and VF index. The DPU-side agent will use the VUID to match
> it with the FUID, identify the correct PF representor, and then use
> standard devlink mechanisms to locate the corresponding VF representor.
>
> 1: https://github.com/ovn-kubernetes/ovn-kubernetes
> You can look at: go-controller/pkg/util/dpu_annotations.go for more info.
A link to the actual file / relevant code would be more helpful :(
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-05-06 15:20 ` Jakub Kicinski
@ 2025-05-06 15:34 ` Mark Bloch
2025-05-08 0:43 ` Jakub Kicinski
0 siblings, 1 reply; 25+ messages in thread
From: Mark Bloch @ 2025-05-06 15:34 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jiri Pirko, Jonathan Corbet,
Andrew Lunn, Tariq Toukan
On 06/05/2025 18:20, Jakub Kicinski wrote:
> On Tue, 6 May 2025 14:25:10 +0300 Mark Bloch wrote:
>>> Thanks for explaining the setup. Could you please explain the user
>>> scenario now? Perhaps thinking of it as a sequence diagram would
>>> be helpful, but whatever is easiest, just make it concrete.
>>>
>>
>> It's a rough flow, but I believe it clearly illustrates the use case
>> we're targeting:
>>
>> Some system configuration info:
>>
>> - A static mapping file exists that defines the relationship between
>> a host and the corresponding ARM/DPU host that manages it.
>>
>> - OVN, OVS and Kubernetes are used to manage network connectivity and
>> resource allocation.
>>
>> Flow:
>> 1. A user requests a container with networking connectivity.
>> 2. Kubernetes allocates a VF on host X. An agent on the host handles VF
>> configuration and sends the PF number and VF index to the central
>> management software.
>
> What is "central management software" here? Deployment specific or
> some part of k8s?
It's the k8s API server.
>
>> 3. An agent on the DPU side detects the changes made on host X. Using
>> the PF number and VF index, it identifies the corresponding
>> representor, attaches it to an OVS bridge, and allows OVN to program
>> the relevant steering rules.
>
> What does it mean that DPU "detects it", what's the source and
> mechanism of the notification?
> Is it communicating with the central SW during the process?
The agent (running in the ARM/DPU) listens for events from the k8s API server.
>
>> This setup works well when the mapping file defines a one-to-one
>> relationship between a host and a single ARM/DPU host.
>> It's already supported in upstream today [1]
>>
>> However, in a slightly more generic scenario like:
>>
>> Control Host A: External host X
>> External host Y
>>
>> A single ARM/DPU host manages multiple external hosts. In this case, step
>> 2, where only the PF number and VF index are sent, is insufficient. During
>> step 3, the agent on the DPU reads the data but cannot determine which
>> external host created the VF. As a result, it cannot correctly associate
>> the representor with the appropriate OVS bridge.
>>
>> To resolve this, we plan to modify step 2 to include the VUID along with
>> the PF number and VF index. The DPU-side agent will use the VUID to match
>> it with the FUID, identify the correct PF representor, and then use
>> standard devlink mechanisms to locate the corresponding VF representor.
>>
>> 1: https://github.com/ovn-kubernetes/ovn-kubernetes
>> You can look at: go-controller/pkg/util/dpu_annotations.go for more info.
>
> A link to the actual file / relevant code would be more helpful :(
This code listens for events on the ARM/DPU from the Kubernetes API server:
https://github.com/ovn-kubernetes/ovn-kubernetes/blob/39e94d80c286f69c5416166d8acda5d1d2f9add5/go-controller/pkg/node/base_node_network_controller_dpu.go#L100
I’m not very familiar with this part of the code, so I asked our
k8s team to help me identify the relevant function. Hopefully, this
is what you were looking for.
Mark
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-05-06 15:34 ` Mark Bloch
@ 2025-05-08 0:43 ` Jakub Kicinski
2025-05-08 9:04 ` Mark Bloch
0 siblings, 1 reply; 25+ messages in thread
From: Jakub Kicinski @ 2025-05-08 0:43 UTC (permalink / raw)
To: Mark Bloch
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jiri Pirko, Jonathan Corbet,
Andrew Lunn, Tariq Toukan
On Tue, 6 May 2025 18:34:22 +0300 Mark Bloch wrote:
> >> Flow:
> >> 1. A user requests a container with networking connectivity.
> >> 2. Kubernetes allocates a VF on host X. An agent on the host handles VF
> >> configuration and sends the PF number and VF index to the central
> >> management software.
> >
> > What is "central management software" here? Deployment specific or
> > some part of k8s?
>
> It's the k8s API server.
>
> >
> >> 3. An agent on the DPU side detects the changes made on host X. Using
> >> the PF number and VF index, it identifies the corresponding
> >> representor, attaches it to an OVS bridge, and allows OVN to program
> >> the relevant steering rules.
> >
> > What does it mean that DPU "detects it", what's the source and
> > mechanism of the notification?
> > Is it communicating with the central SW during the process?
>
> The agent (running in the ARM/DPU) listens for events from the k8s API server.

Interesting. So a deployment with no security boundaries. The internals
of the IPU and the k8s on the host are in the same domain of control.

So how does the user remotely power cycle the hosts?

What I'm getting at is that your mental model seems to be missing any
sort of HW inventory database, which lists all the hosts and how they
plug into the DC. The administrator of the system must already know
where each machine is exactly in the chassis for basic DC ops. And
that HW DB is normally queried in what you describe. If there is any
security domain crossing in the picture it will require cross checking
against that HW DB.

I don't think this is sufficiently well established to warrant new uAPI.
You can use a UUID and pass it via ndo_get_phys_port_id.
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-05-08 0:43 ` Jakub Kicinski
@ 2025-05-08 9:04 ` Mark Bloch
2025-05-14 12:01 ` Mark Bloch
0 siblings, 1 reply; 25+ messages in thread
From: Mark Bloch @ 2025-05-08 9:04 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jiri Pirko, Jonathan Corbet,
Andrew Lunn, Tariq Toukan
On 08/05/2025 3:43, Jakub Kicinski wrote:
> On Tue, 6 May 2025 18:34:22 +0300 Mark Bloch wrote:
>>>> Flow:
>>>> 1. A user requests a container with networking connectivity.
>>>> 2. Kubernetes allocates a VF on host X. An agent on the host handles VF
>>>> configuration and sends the PF number and VF index to the central
>>>> management software.
>>>
>>> What is "central management software" here? Deployment specific or
>>> some part of k8s?
>>
>> It's the k8s API server.
>>
>>>
>>>> 3. An agent on the DPU side detects the changes made on host X. Using
>>>> the PF number and VF index, it identifies the corresponding
>>>> representor, attaches it to an OVS bridge, and allows OVN to program
>>>> the relevant steering rules.
>>>
>>> What does it mean that DPU "detects it", what's the source and
>>> mechanism of the notification?
>>> Is it communicating with the central SW during the process?
>>
>> The agent (running in the ARM/DPU) listens for events from the k8s API server.
>
> Interesting. So a deployment with no security boundaries. The internals
> of the IPU and the k8s on the host are in the same domain of control.
The VF is created on host X, but the corresponding representor appears
on a different host, the IPU. Naturally, they need to be able to
synchronize and exchange information for everything to work correctly.
>
> So how does the user remotely power cycle the hosts?
Why should a user be able to power cycle the hosts?
Are you asking about the administrator?
>
> What I'm getting at is that your mental model seems to be missing any
> sort of HW inventory database, which lists all the hosts and how they
> plug into the DC. The administrator of the system must already know
> where each machine is exactly in the chassis for basic DC ops. And
> that HW DB is normally queried in what you describe. If there is any
> security domain crossing in the picture it will require cross checking
> against that HW DB.
You're assuming that external host numbering and PCI enumeration are
stable. Also, users can determine the mapping only after creating
VFs, and even then the mapping is indirect, e.g.: “I created a VF on
this PF, and I see a single representor appear on the IPU, so they
must be linked.” That approach is fragile and error-prone.
Also, keep in mind: the external hosts and their kernels shouldn’t
be aware they’re part of a multi-host system. With our current
approach, you just need to provide a host-to-IPU mapping
upfront, no guesswork involved.
Just thinking out loud, once this feature is in place, we might
not even need a static mapping between external hosts and IPU hosts.
If VUID and FUID are globally unique, the following workflow
becomes possible:
- A user requests a container with network connectivity.
- k8s allocates and configures a VF on one of the hosts.
  It then sends the VUID, PF number, and VF index for the new VF
  to the k8s API server.
- Somewhere in the network, a representor appears. An agent detects
  this and notifies the k8s API server, including its FUID,
  PF number, and VF index.
- The API server matches the VF and representor data based on the
  globally unique identifiers and sends the relevant information
  back to the agent that reported the representor creation.
- The agent attaches the representor to the OVS bridge and, with
  OVN, configures the appropriate steering rules.
This would remove the need for predefined host-to-IPU mappings
and allow for a more dynamic and flexible setup.
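
As a rough illustration of the matching step in the workflow above, the
following Python sketch pairs a VF report with a representor report once
both carry the same UID string. It assumes the VUID reported for the VF
and the FUID reported for the representor are the same globally unique
value; the FunctionReport and UidMatcher names, the report fields, and
the reporter labels are hypothetical and do not correspond to any k8s or
devlink API.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class FunctionReport:
    """One report sent to the API server (hypothetical message shape)."""
    uid: str       # globally unique function identifier (VUID or FUID)
    pf_num: int    # PF number as seen by the reporter
    vf_index: int  # VF index as seen by the reporter
    reporter: str  # label of the host or IPU agent that sent the report


Pair = Tuple[FunctionReport, FunctionReport]


class UidMatcher:
    """Pairs a VF report with its representor report by UID equality."""

    def __init__(self) -> None:
        # Reports still waiting for their counterpart, keyed by UID.
        self._pending_vfs: Dict[str, FunctionReport] = {}
        self._pending_reps: Dict[str, FunctionReport] = {}

    def add_vf(self, report: FunctionReport) -> Optional[Pair]:
        """Record a VF report; return (vf, representor) if now matched."""
        self._pending_vfs[report.uid] = report
        return self._try_pair(report.uid)

    def add_representor(self, report: FunctionReport) -> Optional[Pair]:
        """Record a representor report; return (vf, representor) if matched."""
        self._pending_reps[report.uid] = report
        return self._try_pair(report.uid)

    def _try_pair(self, uid: str) -> Optional[Pair]:
        # Match only when both sides have reported the same UID.
        if uid not in self._pending_vfs or uid not in self._pending_reps:
            return None
        return self._pending_vfs.pop(uid), self._pending_reps.pop(uid)


if __name__ == "__main__":
    matcher = UidMatcher()
    vf = FunctionReport("EXAMPLE-UID-A", 0, 3, "host-x")
    rep = FunctionReport("EXAMPLE-UID-A", 0, 3, "ipu-1")
    print(matcher.add_vf(vf))            # no representor reported yet
    print(matcher.add_representor(rep))  # both sides known, pair returned
```

Because the lookup key is the UID alone, the sketch needs no static
host-to-IPU table; the order in which the two reports arrive does not
matter.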
>
> I don't think this is sufficiently well established to warrant new uAPI.
> You can use a UUID and pass it via ndo_get_phys_port_id.
phys_port_id only applies to netdev interfaces, whereas this use case is
broader and more aligned with devlink. We believe devlink is a more
appropriate place for this functionality.
Mark
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-05-08 9:04 ` Mark Bloch
@ 2025-05-14 12:01 ` Mark Bloch
2025-05-14 14:52 ` Jakub Kicinski
0 siblings, 1 reply; 25+ messages in thread
From: Mark Bloch @ 2025-05-14 12:01 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jiri Pirko, Jonathan Corbet,
Andrew Lunn, Tariq Toukan
On 08/05/2025 12:04, Mark Bloch wrote:
>
>
> On 08/05/2025 3:43, Jakub Kicinski wrote:
>> On Tue, 6 May 2025 18:34:22 +0300 Mark Bloch wrote:
>>>>> Flow:
>>>>> 1. A user requests a container with networking connectivity.
>>>>> 2. Kubernetes allocates a VF on host X. An agent on the host handles VF
>>>>> configuration and sends the PF number and VF index to the central
>>>>> management software.
>>>>
>>>> What is "central management software" here? Deployment specific or
>>>> some part of k8s?
>>>
>>> It's the k8s API server.
>>>
>>>>
>>>>> 3. An agent on the DPU side detects the changes made on host X. Using
>>>>> the PF number and VF index, it identifies the corresponding
>>>>> representor, attaches it to an OVS bridge, and allows OVN to program
>>>>> the relevant steering rules.
>>>>
>>>> What does it mean that DPU "detects it", what's the source and
>>>> mechanism of the notification?
>>>> Is it communicating with the central SW during the process?
>>>
>>> The agent (running in the ARM/DPU) listens for events from the k8s API server.
>>
>> Interesting. So a deployment with no security boundaries. The internals
>> of the IPU and the k8s on the host are in the same domain of control.
>
> The VF is created on host X, but the corresponding representor appears
> on a different host, the IPU. Naturally, they need to be able to
> synchronize and exchange information for everything to work correctly.
>
>>
>> So how does the user remotely power cycle the hosts?
>
> Why should a user be able to power cycle the hosts?
> Are you asking about the administrator?
>
>>
>> What I'm getting at is that your mental model seems to be missing any
>> sort of HW inventory database, which lists all the hosts and how they
>> plug into the DC. The administrator of the system must already know
>> where each machine is exactly in the chassis for basic DC ops. And
>> that HW DB is normally queried in what you describe. If there is any
>> security domain crossing in the picture it will require cross checking
>> against that HW DB.
>
> You're assuming that external host numbering and PCI enumeration are
> stable. Also, users can determine the mapping only after creating
> VFs, and even then the mapping is indirect, e.g.: “I created a VF on
> this PF, and I see a single representor appear on the IPU, so they
> must be linked.” That approach is fragile and error-prone.
>
> Also, keep in mind: the external hosts and their kernels shouldn’t
> be aware they’re part of a multi-host system. With our current
> approach, you just need to provide a host-to-IPU mapping
> upfront, no guesswork involved.
>
> Just thinking out loud, once this feature is in place, we might
> not even need a static mapping between external hosts and IPU hosts.
>
> If VUID and FUID are globally unique, the following workflow
> becomes possible:
>
> - A user requests a container with network connectivity.
> - k8s allocates and configures a VF on one of the hosts.
>   It then sends the VUID, PF number, and VF index for the new VF
>   to the k8s API server.
> - Somewhere in the network, a representor appears. An agent detects
>   this and notifies the k8s API server, including its FUID,
>   PF number, and VF index.
> - The API server matches the VF and representor data based on the
>   globally unique identifiers and sends the relevant information
>   back to the agent that reported the representor creation.
> - The agent attaches the representor to the OVS bridge and, with
>   OVN, configures the appropriate steering rules.
>
> This would remove the need for predefined host-to-IPU mappings
> and allow for a more dynamic and flexible setup.
>
>>
>> I don't think this is sufficiently well established to warrant new uAPI.
>> You can use a UUID and pass it via ndo_get_phys_port_id.
>
> phys_port_id only applies to netdev interfaces, whereas this use case is
> broader and more aligned with devlink. We believe devlink is a more
> appropriate place for this functionality.
>
> Mark
>
Hi Jakub,
Just checking in, have you had a chance to review my earlier email?
Would appreciate your thoughts or guidance on the right path forward.
Mark
* Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function
2025-05-14 12:01 ` Mark Bloch
@ 2025-05-14 14:52 ` Jakub Kicinski
0 siblings, 0 replies; 25+ messages in thread
From: Jakub Kicinski @ 2025-05-14 14:52 UTC (permalink / raw)
To: Mark Bloch
Cc: Moshe Shemesh, netdev, David S. Miller, Eric Dumazet, Paolo Abeni,
Simon Horman, Donald Hunter, Jiri Pirko, Jonathan Corbet,
Andrew Lunn, Tariq Toukan
On Wed, 14 May 2025 15:01:40 +0300 Mark Bloch wrote:
> Just checking in, have you had a chance to review my earlier email?
> Would appreciate your thoughts or guidance on the right path forward.
Based on your previous reply I'm afraid you don't have sufficient
understanding of real life deployments to be extending this uAPI.
Or you're not telling me something, but I'll go with Hanlon's razor.
Thread overview: 25+ messages
2025-04-23 13:50 [RFC net-next 0/5] devlink: Add unique identifier to devlink port function Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 1/5] " Moshe Shemesh
2025-04-28 12:33 ` Simon Horman
2025-04-29 9:33 ` Avihai Horon
2025-04-23 13:50 ` [RFC net-next 2/5] net/mlx5: Move mlx5_cmd_query_vuid() from IB to core Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 3/5] net/mlx5: Add vhca_id argument to mlx5_core_query_vuid() Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 4/5] net/mlx5: Add define for max VUID string size Moshe Shemesh
2025-04-23 13:50 ` [RFC net-next 5/5] net/mlx5: Expose unique identifier in devlink port function Moshe Shemesh
2025-04-24 23:24 ` [RFC net-next 0/5] devlink: Add unique identifier to " Jakub Kicinski
2025-04-25 11:26 ` Jiri Pirko
2025-04-25 17:51 ` Jakub Kicinski
2025-04-28 16:30 ` Jiri Pirko
2025-04-28 12:11 ` Moshe Shemesh
2025-04-28 18:19 ` Jakub Kicinski
2025-04-29 8:37 ` Moshe Shemesh
2025-05-02 0:39 ` Jakub Kicinski
2025-05-04 17:46 ` Mark Bloch
2025-05-05 18:55 ` Jakub Kicinski
2025-05-06 11:25 ` Mark Bloch
2025-05-06 15:20 ` Jakub Kicinski
2025-05-06 15:34 ` Mark Bloch
2025-05-08 0:43 ` Jakub Kicinski
2025-05-08 9:04 ` Mark Bloch
2025-05-14 12:01 ` Mark Bloch
2025-05-14 14:52 ` Jakub Kicinski