[PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support

* [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support
@ 2025-09-23 14:19 Daniel Jurgens
  2025-09-23 14:19 ` [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations Daniel Jurgens
                   ` (11 more replies)
  0 siblings, 12 replies; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

This series implements ethtool flow rules support for virtio_net using the
virtio flow filter (FF) specification. The implementation allows users to
configure packet filtering rules through ethtool commands, steering
packets to specific receive queues or dropping them, based on various
header fields.

The series starts with infrastructure changes to expose virtio PCI admin
capabilities and object management APIs. It then creates the virtio_net
directory structure and implements the flow filter functionality with support
for:

- Layer 2 (Ethernet) flow rules
- IPv4 and IPv6 flow rules
- TCP and UDP flow rules (both IPv4 and IPv6)
- Rule querying and management operations

Setting, deleting and viewing flow filters. An action of -1 drops the
packet; a non-negative integer steers it to that RQ:

$ ethtool -u ens9
4 RX rings available
Total 0 rules

$ ethtool -U ens9 flow-type ether src 1c:34:da:4a:33:dd action 0
Added rule with ID 0
$ ethtool -U ens9 flow-type udp4 dst-port 5001 action 3
Added rule with ID 1
$ ethtool -U ens9 flow-type tcp6 src-ip fc00::2 dst-port 5001 action 2
Added rule with ID 2
$ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action 1
Added rule with ID 3
$ ethtool -U ens9 flow-type ip6 dst-ip fc00::1 action -1
Added rule with ID 4
$ ethtool -U ens9 flow-type ip6 src-ip fc00::2 action -1
Added rule with ID 5
$ ethtool -U ens9 delete 4
$ ethtool -u ens9
4 RX rings available
Total 5 rules

Filter: 0
        Flow Type: Raw Ethernet
        Src MAC addr: 1C:34:DA:4A:33:DD mask: 00:00:00:00:00:00
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0x0 mask: 0xFFFF
        Action: Direct to queue 0

Filter: 1
        Rule Type: UDP over IPv4
        Src IP addr: 0.0.0.0 mask: 255.255.255.255
        Dest IP addr: 0.0.0.0 mask: 255.255.255.255
        TOS: 0x0 mask: 0xff
        Src port: 0 mask: 0xffff
        Dest port: 5001 mask: 0x0
        Action: Direct to queue 3

Filter: 2
        Rule Type: TCP over IPv6
        Src IP addr: fc00::2 mask: ::
        Dest IP addr: :: mask: ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
        Traffic Class: 0x0 mask: 0xff
        Src port: 0 mask: 0xffff
        Dest port: 5001 mask: 0x0
        Action: Direct to queue 2

Filter: 3
        Rule Type: Raw IPv4
        Src IP addr: 192.168.51.101 mask: 0.0.0.0
        Dest IP addr: 0.0.0.0 mask: 255.255.255.255
        TOS: 0x0 mask: 0xff
        Protocol: 0 mask: 0xff
        L4 bytes: 0x0 mask: 0xffffffff
        Action: Direct to queue 1

Filter: 5
        Rule Type: Raw IPv6
        Src IP addr: fc00::2 mask: ::
        Dest IP addr: :: mask: ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
        Traffic Class: 0x0 mask: 0xff
        Protocol: 0 mask: 0xff
        L4 bytes: 0x0 mask: 0xffffffff
        Action: Drop
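
For readers less familiar with the ethtool side, the action handling above
can be sketched roughly as follows. This is a userspace illustration only,
not code from this series: sketch_decode_action() and struct sketch_action
are hypothetical names, and SKETCH_FLOW_DISC is assumed to mirror the value
of RX_CLS_FLOW_DISC from <linux/ethtool.h>, which is what "action -1"
becomes in the rule's ring_cookie.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to match RX_CLS_FLOW_DISC in <linux/ethtool.h>; redefined here so
 * the sketch builds without kernel headers.
 */
#define SKETCH_FLOW_DISC 0xffffffffffffffffULL

/* Hypothetical decoded form of an ethtool rule action. */
struct sketch_action {
	bool drop;
	uint32_t queue;	/* only meaningful when drop is false */
};

/* Translate a ring_cookie into drop or steer-to-queue.
 * Returns 0 on success, -1 if the cookie names a queue that does not exist.
 */
static int sketch_decode_action(uint64_t ring_cookie, uint32_t num_queues,
				struct sketch_action *out)
{
	assert(out != NULL);

	if (ring_cookie == SKETCH_FLOW_DISC) {
		out->drop = true;
		out->queue = 0;
		return 0;
	}

	if (ring_cookie >= num_queues)
		return -1;

	out->drop = false;
	out->queue = (uint32_t)ring_cookie;
	return 0;
}
```

With 4 RX rings as in the session above, cookies 0 through 3 steer to a
queue, the all-ones cookie drops, and anything else is rejected. The real
ring_cookie can also carry a VF index in its upper bits, which this sketch
ignores.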

v2:
  - Fix sparse warnings
  - Fix memory leak on subsequent failure to allocate
  - Fix some typos

v3:
  - Rebased
  - Added back get|set_rxnfc to virtio_net
  - Added admin_ops to virtio_device kdoc

Daniel Jurgens (11):
  virtio-pci: Expose generic device capability operations
  virtio-pci: Expose object create and destroy API
  virtio_net: Create virtio_net directory
  virtio_net: Query and set flow filter caps
  virtio_net: Create a FF group for ethtool steering
  virtio_net: Implement layer 2 ethtool flow rules
  virtio_net: Use existing classifier if possible
  virtio_net: Implement IPv4 ethtool flow rules
  virtio_net: Add support for IPv6 ethtool steering
  virtio_net: Add support for TCP and UDP ethtool rules
  virtio_net: Add get ethtool flow rules ops

 MAINTAINERS                                   |    2 +-
 drivers/net/Makefile                          |    2 +-
 drivers/net/virtio_net/Makefile               |    8 +
 drivers/net/virtio_net/virtio_net_ff.c        | 1029 +++++++++++++++++
 drivers/net/virtio_net/virtio_net_ff.h        |   42 +
 .../virtio_net_main.c}                        |   46 +
 drivers/vfio/pci/virtio/migrate.c             |    8 +-
 drivers/virtio/virtio.c                       |  141 +++
 drivers/virtio/virtio_pci_common.h            |    1 -
 drivers/virtio/virtio_pci_modern.c            |  320 ++---
 include/linux/virtio.h                        |   22 +
 include/linux/virtio_admin.h                  |  101 ++
 include/linux/virtio_pci_admin.h              |    7 +-
 include/uapi/linux/virtio_net_ff.h            |   82 ++
 include/uapi/linux/virtio_pci.h               |    7 +-
 15 files changed, 1677 insertions(+), 141 deletions(-)
 create mode 100644 drivers/net/virtio_net/Makefile
 create mode 100644 drivers/net/virtio_net/virtio_net_ff.c
 create mode 100644 drivers/net/virtio_net/virtio_net_ff.h
 rename drivers/net/{virtio_net.c => virtio_net/virtio_net_main.c} (99%)
 create mode 100644 include/linux/virtio_admin.h
 create mode 100644 include/uapi/linux/virtio_net_ff.h

-- 
2.45.0


* [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-24  1:16   ` Jason Wang
  2025-09-24  6:16   ` Michael S. Tsirkin
  2025-09-23 14:19 ` [PATCH net-next v3 02/11] virtio-pci: Expose object create and destroy API Daniel Jurgens
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens, Yishai Hadas

Currently, querying and setting capabilities is restricted to a single
capability and contained within the virtio PCI driver. However, each
device type has generic and device-specific capabilities that may be
queried and set. In subsequent patches virtio_net will query and set
flow filter capabilities.

Move the admin related definitions to a new header file. They need to be
abstracted away from the PCI specifics so that upper layer drivers can
use them.
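
As a rough illustration of the pattern described above, the following
userspace sketch models an optional ops table with an -EOPNOTSUPP fallback,
plus a bitmap test for a capability ID. Every sketch_* name is hypothetical
and exists only for this example. Unlike the kernel structures, the bitmap
here is kept in host byte order, so no le64_to_cpu() conversion appears.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_CAP		0x0fff
#define SKETCH_CAP_WORDS	((SKETCH_MAX_CAP + 63) / 64)

/* Hypothetical host-endian stand-in for the capability ID list result. */
struct sketch_cap_list {
	uint64_t supported_caps[SKETCH_CAP_WORDS];
};

/* Bit (cap % 64) of word (cap / 64) says whether cap is supported. */
static int sketch_cap_in_list(const struct sketch_cap_list *list,
			      unsigned int cap)
{
	assert(list != NULL);

	if (cap / 64 >= SKETCH_CAP_WORDS)
		return 0;

	return (int)((list->supported_caps[cap / 64] >> (cap % 64)) & 1);
}

struct sketch_dev;

/* Transport-provided operations; a transport may leave any of them NULL. */
struct sketch_admin_ops {
	int (*cap_id_list_query)(struct sketch_dev *dev,
				 struct sketch_cap_list *out);
};

struct sketch_dev {
	const struct sketch_admin_ops *admin_ops;
};

/* Generic entry point: dispatch through the ops table if one is present. */
static int sketch_cap_id_list_query(struct sketch_dev *dev,
				    struct sketch_cap_list *out)
{
	const struct sketch_admin_ops *ops = dev->admin_ops;

	if (!ops || !ops->cap_id_list_query)
		return -EOPNOTSUPP;

	return ops->cap_id_list_query(dev, out);
}

/* Fake transport that reports capability IDs 0 and 65 as supported. */
static int sketch_fake_query(struct sketch_dev *dev,
			     struct sketch_cap_list *out)
{
	(void)dev;
	memset(out, 0, sizeof(*out));
	out->supported_caps[0] = 1ULL << 0;
	out->supported_caps[1] = 1ULL << 1;
	return 0;
}

static const struct sketch_admin_ops sketch_fake_ops = {
	.cap_id_list_query = sketch_fake_query,
};
```

A caller would point a struct sketch_dev at sketch_fake_ops, run the query,
and then test individual IDs with sketch_cap_in_list(). A device with no
ops table simply gets -EOPNOTSUPP back.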

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/virtio/virtio.c            |  82 ++++++++++++++++
 drivers/virtio/virtio_pci_common.h |   1 -
 drivers/virtio/virtio_pci_modern.c | 145 +++++++++++++++++------------
 include/linux/virtio.h             |  14 +++
 include/linux/virtio_admin.h       |  68 ++++++++++++++
 include/uapi/linux/virtio_pci.h    |   7 +-
 6 files changed, 255 insertions(+), 62 deletions(-)
 create mode 100644 include/linux/virtio_admin.h

diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index a09eb4d62f82..6bc268c11100 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -706,6 +706,88 @@ int virtio_device_reset_done(struct virtio_device *dev)
 }
 EXPORT_SYMBOL_GPL(virtio_device_reset_done);
 
+/**
+ * virtio_device_cap_id_list_query - Query the list of available capability IDs
+ * @vdev: the virtio device
+ * @data: pointer to store the capability ID list result
+ *
+ * This function queries the virtio device for the list of available capability
+ * IDs that can be used with virtio_device_cap_get() and virtio_device_cap_set().
+ * The result is stored in the provided data structure.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or capability queries, or a negative error code on other failures.
+ */
+int
+virtio_device_cap_id_list_query(struct virtio_device *vdev,
+				struct virtio_admin_cmd_query_cap_id_result *data)
+{
+	const struct virtio_admin_ops *admin = vdev->admin_ops;
+
+	if (!admin || !admin->cap_id_list_query)
+		return -EOPNOTSUPP;
+
+	return admin->cap_id_list_query(vdev, data);
+}
+EXPORT_SYMBOL_GPL(virtio_device_cap_id_list_query);
+
+/**
+ * virtio_device_cap_get - Get a capability from a virtio device
+ * @vdev: the virtio device
+ * @id: capability ID to retrieve
+ * @caps: buffer to store the capability data
+ * @cap_size: size of the capability buffer in bytes
+ *
+ * This function retrieves a specific capability from the virtio device.
+ * The capability data is stored in the provided buffer. The caller must
+ * ensure the buffer is large enough to hold the capability data.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or capability retrieval, or a negative error code on other failures.
+ */
+int virtio_device_cap_get(struct virtio_device *vdev,
+			  u16 id,
+			  void *caps,
+			  size_t cap_size)
+{
+	const struct virtio_admin_ops *admin = vdev->admin_ops;
+
+	if (!admin || !admin->cap_get)
+		return -EOPNOTSUPP;
+
+	return admin->cap_get(vdev, id, caps, cap_size);
+}
+EXPORT_SYMBOL_GPL(virtio_device_cap_get);
+
+/**
+ * virtio_device_cap_set - Set a capability on a virtio device
+ * @vdev: the virtio device
+ * @id: capability ID to set
+ * @caps: buffer containing the capability data to set
+ * @cap_size: size of the capability data in bytes
+ *
+ * This function sets a specific capability on the virtio device.
+ * The capability data is read from the provided buffer and applied
+ * to the device. The device may validate the capability data before
+ * applying it.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or capability setting, or a negative error code on other failures.
+ */
+int virtio_device_cap_set(struct virtio_device *vdev,
+			  u16 id,
+			  const void *caps,
+			  size_t cap_size)
+{
+	const struct virtio_admin_ops *admin = vdev->admin_ops;
+
+	if (!admin || !admin->cap_set)
+		return -EOPNOTSUPP;
+
+	return admin->cap_set(vdev, id, caps, cap_size);
+}
+EXPORT_SYMBOL_GPL(virtio_device_cap_set);
+
 static int virtio_init(void)
 {
 	BUILD_BUG_ON(offsetof(struct virtio_device, features) !=
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index 8cd01de27baf..fc26e035e7a6 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -48,7 +48,6 @@ struct virtio_pci_admin_vq {
 	/* Protects virtqueue access. */
 	spinlock_t lock;
 	u64 supported_cmds;
-	u64 supported_caps;
 	u8 max_dev_parts_objects;
 	struct ida dev_parts_ida;
 	/* Name of the admin queue: avq.$vq_index. */
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index dd0e65f71d41..c8bbd807371d 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -19,6 +19,7 @@
 #define VIRTIO_PCI_NO_LEGACY
 #define VIRTIO_RING_NO_LEGACY
 #include "virtio_pci_common.h"
+#include <linux/virtio_admin.h>
 
 #define VIRTIO_AVQ_SGS_MAX	4
 
@@ -232,103 +233,123 @@ static void virtio_pci_admin_cmd_list_init(struct virtio_device *virtio_dev)
 	kfree(data);
 }
 
-static void
-virtio_pci_admin_cmd_dev_parts_objects_enable(struct virtio_device *virtio_dev)
+static int vp_modern_admin_cmd_cap_get(struct virtio_device *virtio_dev,
+				       u16 id,
+				       void *caps,
+				       size_t cap_size)
 {
-	struct virtio_pci_device *vp_dev = to_vp_device(virtio_dev);
-	struct virtio_admin_cmd_cap_get_data *get_data;
-	struct virtio_admin_cmd_cap_set_data *set_data;
-	struct virtio_dev_parts_cap *result;
+	struct virtio_admin_cmd_cap_get_data *data __free(kfree) = NULL;
 	struct virtio_admin_cmd cmd = {};
 	struct scatterlist result_sg;
 	struct scatterlist data_sg;
-	u8 resource_objects_limit;
-	u16 set_data_size;
-	int ret;
 
-	get_data = kzalloc(sizeof(*get_data), GFP_KERNEL);
-	if (!get_data)
-		return;
-
-	result = kzalloc(sizeof(*result), GFP_KERNEL);
-	if (!result)
-		goto end;
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
 
-	get_data->id = cpu_to_le16(VIRTIO_DEV_PARTS_CAP);
-	sg_init_one(&data_sg, get_data, sizeof(*get_data));
-	sg_init_one(&result_sg, result, sizeof(*result));
+	data->id = cpu_to_le16(id);
+	sg_init_one(&data_sg, data, sizeof(*data));
+	sg_init_one(&result_sg, caps, cap_size);
 	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DEVICE_CAP_GET);
 	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
 	cmd.data_sg = &data_sg;
 	cmd.result_sg = &result_sg;
-	ret = vp_modern_admin_cmd_exec(virtio_dev, &cmd);
-	if (ret)
-		goto err_get;
 
-	set_data_size = sizeof(*set_data) + sizeof(*result);
-	set_data = kzalloc(set_data_size, GFP_KERNEL);
-	if (!set_data)
-		goto err_get;
+	return vp_modern_admin_cmd_exec(virtio_dev, &cmd);
+}
 
-	set_data->id = cpu_to_le16(VIRTIO_DEV_PARTS_CAP);
+static int vp_modern_admin_cmd_cap_set(struct virtio_device *virtio_dev,
+				       u16 id,
+				       const void *caps,
+				       size_t cap_size)
+{
+	struct virtio_admin_cmd_cap_set_data *data __free(kfree) = NULL;
+	struct virtio_admin_cmd cmd = {};
+	struct scatterlist data_sg;
+	size_t data_size;
+
+	data_size = sizeof(*data) + cap_size;
+	data = kzalloc(data_size, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->id = cpu_to_le16(id);
+	memcpy(data->cap_specific_data, caps, cap_size);
+	sg_init_one(&data_sg, data, data_size);
+	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DRIVER_CAP_SET);
+	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
+	cmd.data_sg = &data_sg;
+	cmd.result_sg = NULL;
+
+	return vp_modern_admin_cmd_exec(virtio_dev, &cmd);
+}
+
+static void
+virtio_pci_admin_cmd_dev_parts_objects_enable(struct virtio_device *virtio_dev)
+{
+	struct virtio_pci_device *vp_dev = to_vp_device(virtio_dev);
+	struct virtio_dev_parts_cap *dev_parts;
+	u8 resource_objects_limit;
+	int ret;
+
+	dev_parts = kzalloc(sizeof(*dev_parts), GFP_KERNEL);
+	if (!dev_parts)
+		return;
+
+	ret = vp_modern_admin_cmd_cap_get(virtio_dev, VIRTIO_DEV_PARTS_CAP,
+					  dev_parts, sizeof(*dev_parts));
+	if (ret)
+		goto err;
 
 	/* Set the limit to the minimum value between the GET and SET values
 	 * supported by the device. Since the obj_id for VIRTIO_DEV_PARTS_CAP
 	 * is a globally unique value per PF, there is no possibility of
 	 * overlap between GET and SET operations.
 	 */
-	resource_objects_limit = min(result->get_parts_resource_objects_limit,
-				     result->set_parts_resource_objects_limit);
-	result->get_parts_resource_objects_limit = resource_objects_limit;
-	result->set_parts_resource_objects_limit = resource_objects_limit;
-	memcpy(set_data->cap_specific_data, result, sizeof(*result));
-	sg_init_one(&data_sg, set_data, set_data_size);
-	cmd.data_sg = &data_sg;
-	cmd.result_sg = NULL;
-	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DRIVER_CAP_SET);
-	ret = vp_modern_admin_cmd_exec(virtio_dev, &cmd);
+	resource_objects_limit = min(dev_parts->get_parts_resource_objects_limit,
+				     dev_parts->set_parts_resource_objects_limit);
+	dev_parts->get_parts_resource_objects_limit = resource_objects_limit;
+	dev_parts->set_parts_resource_objects_limit = resource_objects_limit;
+
+	ret = vp_modern_admin_cmd_cap_set(virtio_dev, VIRTIO_DEV_PARTS_CAP,
+					  dev_parts, sizeof(*dev_parts));
 	if (ret)
-		goto err_set;
+		goto err;
 
 	/* Allocate IDR to manage the dev caps objects */
 	ida_init(&vp_dev->admin_vq.dev_parts_ida);
 	vp_dev->admin_vq.max_dev_parts_objects = resource_objects_limit;
 
-err_set:
-	kfree(set_data);
-err_get:
-	kfree(result);
-end:
-	kfree(get_data);
+err:
+	kfree(dev_parts);
 }
 
-static void virtio_pci_admin_cmd_cap_init(struct virtio_device *virtio_dev)
+static int vp_modern_admin_cap_id_list_query(struct virtio_device *virtio_dev,
+					     struct virtio_admin_cmd_query_cap_id_result *data)
 {
-	struct virtio_pci_device *vp_dev = to_vp_device(virtio_dev);
-	struct virtio_admin_cmd_query_cap_id_result *data;
 	struct virtio_admin_cmd cmd = {};
 	struct scatterlist result_sg;
-	int ret;
-
-	data = kzalloc(sizeof(*data), GFP_KERNEL);
-	if (!data)
-		return;
 
 	sg_init_one(&result_sg, data, sizeof(*data));
 	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_CAP_ID_LIST_QUERY);
 	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
 	cmd.result_sg = &result_sg;
 
-	ret = vp_modern_admin_cmd_exec(virtio_dev, &cmd);
-	if (ret)
-		goto end;
+	return vp_modern_admin_cmd_exec(virtio_dev, &cmd);
+}
 
-	/* Max number of caps fits into a single u64 */
-	BUILD_BUG_ON(sizeof(data->supported_caps) > sizeof(u64));
+static void virtio_pci_admin_cmd_cap_init(struct virtio_device *virtio_dev)
+{
+	struct virtio_admin_cmd_query_cap_id_result *data;
 
-	vp_dev->admin_vq.supported_caps = le64_to_cpu(data->supported_caps[0]);
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return;
+
+	if (vp_modern_admin_cap_id_list_query(virtio_dev, data))
+		goto end;
 
-	if (!(vp_dev->admin_vq.supported_caps & (1 << VIRTIO_DEV_PARTS_CAP)))
+	if (!(VIRTIO_CAP_IN_LIST(data, VIRTIO_DEV_PARTS_CAP)))
 		goto end;
 
 	virtio_pci_admin_cmd_dev_parts_objects_enable(virtio_dev);
@@ -1264,6 +1285,11 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
 	.enable_vq_after_reset = vp_modern_enable_vq_after_reset,
 };
 
+static const struct virtio_admin_ops virtio_pci_admin_ops = {
+	.cap_id_list_query = vp_modern_admin_cap_id_list_query,
+	.cap_get = vp_modern_admin_cmd_cap_get,
+	.cap_set = vp_modern_admin_cmd_cap_set,
+};
 /* the PCI probing function */
 int virtio_pci_modern_probe(struct virtio_pci_device *vp_dev)
 {
@@ -1282,6 +1308,7 @@ int virtio_pci_modern_probe(struct virtio_pci_device *vp_dev)
 	else
 		vp_dev->vdev.config = &virtio_pci_config_nodev_ops;
 
+	vp_dev->vdev.admin_ops = &virtio_pci_admin_ops;
 	vp_dev->config_vector = vp_config_vector;
 	vp_dev->setup_vq = setup_vq;
 	vp_dev->del_vq = del_vq;
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index db31fc6f4f1f..7ab4ea75ad44 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -12,6 +12,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/completion.h>
 #include <linux/virtio_features.h>
+#include <linux/virtio_admin.h>
 
 /**
  * struct virtqueue - a queue to register buffers for sending or receiving.
@@ -141,6 +142,7 @@ struct virtio_admin_cmd {
  * @id: the device type identification (used to match it with a driver).
  * @config: the configuration ops for this device.
  * @vringh_config: configuration ops for host vrings.
+ * @admin_ops: administration operations for this device.
  * @vqs: the list of virtqueues for this device.
  * @features: the 64 lower features supported by both driver and device.
  * @features_array: the full features space supported by both driver and
@@ -161,6 +163,7 @@ struct virtio_device {
 	struct virtio_device_id id;
 	const struct virtio_config_ops *config;
 	const struct vringh_config_ops *vringh_config;
+	const struct virtio_admin_ops *admin_ops;
 	struct list_head vqs;
 	VIRTIO_DECLARE_FEATURES(features);
 	void *priv;
@@ -195,6 +198,17 @@ int virtio_device_restore(struct virtio_device *dev);
 void virtio_reset_device(struct virtio_device *dev);
 int virtio_device_reset_prepare(struct virtio_device *dev);
 int virtio_device_reset_done(struct virtio_device *dev);
+int
+virtio_device_cap_id_list_query(struct virtio_device *vdev,
+				struct virtio_admin_cmd_query_cap_id_result *data);
+int virtio_device_cap_get(struct virtio_device *vdev,
+			  u16 id,
+			  void *caps,
+			  size_t cap_size);
+int virtio_device_cap_set(struct virtio_device *vdev,
+			  u16 id,
+			  const void *caps,
+			  size_t cap_size);
 
 size_t virtio_max_dma_size(const struct virtio_device *vdev);
 
diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
new file mode 100644
index 000000000000..bbf543d20be4
--- /dev/null
+++ b/include/linux/virtio_admin.h
@@ -0,0 +1,68 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ *
+ * Header file for virtio admin operations
+ */
+#include <uapi/linux/virtio_pci.h>
+
+#ifndef _LINUX_VIRTIO_ADMIN_H
+#define _LINUX_VIRTIO_ADMIN_H
+
+struct virtio_device;
+
+/**
+ * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
+ * @cap_list: Pointer to capability list structure containing supported_caps array
+ * @cap: Capability ID to check
+ *
+ * The cap_list contains a supported_caps array of little-endian 64-bit integers
+ * where each bit represents a capability. Bit 0 of the first element represents
+ * capability ID 0, bit 1 represents capability ID 1, and so on.
+ *
+ * Return: 1 if capability is supported, 0 otherwise
+ */
+#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
+	(!!(1 & (le64_to_cpu((cap_list)->supported_caps[(cap) / 64]) >> ((cap) % 64))))
+
+/**
+ * struct virtio_admin_ops - Operations for virtio admin functionality
+ *
+ * This structure contains function pointers for performing administrative
+ * operations on virtio devices. All data and caps pointers must be allocated
+ * on the heap by the caller.
+ */
+struct virtio_admin_ops {
+	/**
+	 * @cap_id_list_query: Query the list of supported capability IDs
+	 * @vdev: The virtio device to query
+	 * @data: Pointer to result structure (must be heap allocated)
+	 * Return: 0 on success, negative error code on failure
+	 */
+	int (*cap_id_list_query)(struct virtio_device *vdev,
+				 struct virtio_admin_cmd_query_cap_id_result *data);
+	/**
+	 * @cap_get: Get capability data for a specific capability ID
+	 * @vdev: The virtio device
+	 * @id: Capability ID to retrieve
+	 * @caps: Pointer to capability data structure (must be heap allocated)
+	 * @cap_size: Size of the capability data structure
+	 * Return: 0 on success, negative error code on failure
+	 */
+	int (*cap_get)(struct virtio_device *vdev,
+		       u16 id,
+		       void *caps,
+		       size_t cap_size);
+	/**
+	 * @cap_set: Set capability data for a specific capability ID
+	 * @vdev: The virtio device
+	 * @id: Capability ID to set
+	 * @caps: Pointer to capability data structure (must be heap allocated)
+	 * @cap_size: Size of the capability data structure
+	 * Return: 0 on success, negative error code on failure
+	 */
+	int (*cap_set)(struct virtio_device *vdev,
+		       u16 id,
+		       const void *caps,
+		       size_t cap_size);
+};
+
+#endif /* _LINUX_VIRTIO_ADMIN_H */
diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index c691ac210ce2..0d5ca0cff629 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -315,15 +315,18 @@ struct virtio_admin_cmd_notify_info_result {
 
 #define VIRTIO_DEV_PARTS_CAP 0x0000
 
+/* Update this value to largest implemented cap number. */
+#define VIRTIO_ADMIN_MAX_CAP 0x0fff
+
 struct virtio_dev_parts_cap {
 	__u8 get_parts_resource_objects_limit;
 	__u8 set_parts_resource_objects_limit;
 };
 
-#define MAX_CAP_ID __KERNEL_DIV_ROUND_UP(VIRTIO_DEV_PARTS_CAP + 1, 64)
+#define VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE __KERNEL_DIV_ROUND_UP(VIRTIO_ADMIN_MAX_CAP, 64)
 
 struct virtio_admin_cmd_query_cap_id_result {
-	__le64 supported_caps[MAX_CAP_ID];
+	__le64 supported_caps[VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE];
 };
 
 struct virtio_admin_cmd_cap_get_data {
-- 
2.45.0


* [PATCH net-next v3 02/11] virtio-pci: Expose object create and destroy API
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
  2025-09-23 14:19 ` [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-23 14:19 ` [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory Daniel Jurgens
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens, Yishai Hadas

Object create and destroy were implemented specifically for dev parts
device objects. Create general purpose APIs for use by upper layer
drivers.
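
To illustrate what a general purpose create call has to assemble, here is a
userspace sketch of a command buffer made of a fixed header followed by
caller-supplied, object-specific bytes. The sketch_* structures are
hypothetical simplifications and do not reproduce the uapi layout; the
kernel code additionally converts the header fields to little-endian, which
is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified header; not the real uapi structure. */
struct sketch_obj_hdr {
	uint16_t type;
	uint32_t id;
};

/* Header followed by an opaque, object-specific payload. */
struct sketch_obj_create {
	struct sketch_obj_hdr hdr;
	uint8_t specific_data[];
};

/* Allocate and fill a create command for any object type.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct sketch_obj_create *
sketch_build_create(uint16_t type, uint32_t id, const void *specific,
		    size_t specific_size, size_t *total_size)
{
	size_t size = sizeof(struct sketch_obj_create) + specific_size;
	struct sketch_obj_create *cmd;

	assert(total_size != NULL);

	cmd = calloc(1, size);
	if (!cmd)
		return NULL;

	cmd->hdr.type = type;
	cmd->hdr.id = id;
	if (specific_size)
		memcpy(cmd->specific_data, specific, specific_size);

	*total_size = size;
	return cmd;
}
```

Because the payload is opaque to the builder, the same helper serves dev
parts objects and any object type an upper layer driver defines later; only
the type value and the payload bytes differ.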

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
---
 drivers/vfio/pci/virtio/migrate.c  |   8 +-
 drivers/virtio/virtio.c            |  59 ++++++++++
 drivers/virtio/virtio_pci_modern.c | 175 +++++++++++++++++------------
 include/linux/virtio.h             |   8 ++
 include/linux/virtio_admin.h       |  32 ++++++
 include/linux/virtio_pci_admin.h   |   7 +-
 6 files changed, 212 insertions(+), 77 deletions(-)

diff --git a/drivers/vfio/pci/virtio/migrate.c b/drivers/vfio/pci/virtio/migrate.c
index ba92bb4e9af9..a2aa0e32f593 100644
--- a/drivers/vfio/pci/virtio/migrate.c
+++ b/drivers/vfio/pci/virtio/migrate.c
@@ -152,15 +152,15 @@ static int
 virtiovf_pci_alloc_obj_id(struct virtiovf_pci_core_device *virtvdev, u8 type,
 			  u32 *obj_id)
 {
-	return virtio_pci_admin_obj_create(virtvdev->core_device.pdev,
-					   VIRTIO_RESOURCE_OBJ_DEV_PARTS, type, obj_id);
+	return virtio_pci_admin_dev_parts_obj_create(virtvdev->core_device.pdev,
+						     type, obj_id);
 }
 
 static void
 virtiovf_pci_free_obj_id(struct virtiovf_pci_core_device *virtvdev, u32 obj_id)
 {
-	virtio_pci_admin_obj_destroy(virtvdev->core_device.pdev,
-			VIRTIO_RESOURCE_OBJ_DEV_PARTS, obj_id);
+	virtio_pci_admin_dev_parts_obj_destroy(virtvdev->core_device.pdev,
+					       obj_id);
 }
 
 static struct virtiovf_data_buffer *
diff --git a/drivers/virtio/virtio.c b/drivers/virtio/virtio.c
index 6bc268c11100..62233ab4501b 100644
--- a/drivers/virtio/virtio.c
+++ b/drivers/virtio/virtio.c
@@ -788,6 +788,65 @@ int virtio_device_cap_set(struct virtio_device *vdev,
 }
 EXPORT_SYMBOL_GPL(virtio_device_cap_set);
 
+/**
+ * virtio_device_object_create - Create an object on a virtio device
+ * @vdev: the virtio device
+ * @obj_type: type of object to create
+ * @obj_id: ID for the new object
+ * @obj_specific_data: object-specific data for creation
+ * @obj_specific_data_size: size of the object-specific data in bytes
+ *
+ * Creates a new object on the virtio device with the specified type and ID.
+ * The object may require object-specific data for proper initialization.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or object creation, or a negative error code on other failures.
+ */
+int virtio_device_object_create(struct virtio_device *vdev,
+				u16 obj_type,
+				u32 obj_id,
+				const void *obj_specific_data,
+				size_t obj_specific_data_size)
+{
+	const struct virtio_admin_ops *admin = vdev->admin_ops;
+
+	if (!admin || !admin->object_create)
+		return -EOPNOTSUPP;
+
+	/* All users of this interface use the self group with member id 0 */
+	return admin->object_create(vdev, obj_type, obj_id,
+				    VIRTIO_ADMIN_GROUP_TYPE_SELF, 0,
+				    obj_specific_data, obj_specific_data_size);
+}
+EXPORT_SYMBOL_GPL(virtio_device_object_create);
+
+/**
+ * virtio_device_object_destroy - Destroy an object on a virtio device
+ * @vdev: the virtio device
+ * @obj_type: type of object to destroy
+ * @obj_id: ID of the object to destroy
+ *
+ * Destroys an existing object on the virtio device with the specified type
+ * and ID.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or object destruction, or a negative error code on other failures.
+ */
+int virtio_device_object_destroy(struct virtio_device *vdev,
+				 u16 obj_type,
+				 u32 obj_id)
+{
+	const struct virtio_admin_ops *admin = vdev->admin_ops;
+
+	if (!admin || !admin->object_destroy)
+		return -EOPNOTSUPP;
+
+	/* All users of this interface use the self group with member id 0 */
+	return admin->object_destroy(vdev, obj_type, obj_id,
+				     VIRTIO_ADMIN_GROUP_TYPE_SELF, 0);
+}
+EXPORT_SYMBOL_GPL(virtio_device_object_destroy);
+
 static int virtio_init(void)
 {
 	BUILD_BUG_ON(offsetof(struct virtio_device, features) !=
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index c8bbd807371d..ef787a6334c8 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -967,28 +967,61 @@ int virtio_pci_admin_mode_set(struct pci_dev *pdev, u8 flags)
 }
 EXPORT_SYMBOL_GPL(virtio_pci_admin_mode_set);
 
-/*
- * virtio_pci_admin_obj_create - Creates an object for a given type and operation,
- * following the max objects that can be created for that request.
- * @pdev: VF pci_dev
- * @obj_type: Object type
- * @operation_type: Operation type
- * @obj_id: Output unique object id
+static int vp_modern_admin_cmd_obj_create(struct virtio_device *virtio_dev,
+					  u16 obj_type,
+					  u32 obj_id,
+					  u16 group_type,
+					  u64 group_member_id,
+					  const void *obj_specific_data,
+					  size_t obj_specific_data_size)
+{
+	size_t data_size = sizeof(struct virtio_admin_cmd_resource_obj_create_data);
+	struct virtio_admin_cmd_resource_obj_create_data *obj_create_data;
+	struct virtio_admin_cmd cmd = {};
+	void *data __free(kfree) = NULL;
+	struct scatterlist data_sg;
+
+	data_size += (obj_specific_data_size);
+	data = kzalloc(data_size, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	obj_create_data = data;
+	obj_create_data->hdr.type = cpu_to_le16(obj_type);
+	obj_create_data->hdr.id = cpu_to_le32(obj_id);
+	memcpy(obj_create_data->resource_obj_specific_data, obj_specific_data,
+	       obj_specific_data_size);
+	sg_init_one(&data_sg, data, data_size);
+
+	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_CREATE);
+	cmd.group_type = cpu_to_le16(group_type);
+	cmd.group_member_id = cpu_to_le64(group_member_id);
+	cmd.data_sg = &data_sg;
+
+	return vp_modern_admin_cmd_exec(virtio_dev, &cmd);
+}
+
+/**
+ * virtio_pci_admin_dev_parts_obj_create - Create a device parts object
+ * @pdev: VF PCI device
+ * @operation_type: operation type (GET or SET)
+ * @obj_id: pointer to store the output unique object ID
  *
- * Note: caller must serialize access for the given device.
- * Returns 0 on success, or negative on failure.
+ * This function creates a device parts object for the specified VF PCI device.
+ * The object is associated with the SRIOV group and can be used for GET or SET
+ * operations. The caller must serialize access for the given device.
+ *
+ * Return: 0 on success, -ENODEV if the virtio device is not found,
+ * -EINVAL if the operation type is invalid, -EOPNOTSUPP if device parts
+ * objects are not supported, or a negative error code on other failures.
  */
-int virtio_pci_admin_obj_create(struct pci_dev *pdev, u16 obj_type, u8 operation_type,
-				u32 *obj_id)
+int virtio_pci_admin_dev_parts_obj_create(struct pci_dev *pdev,
+					  u8 operation_type,
+					  u32 *obj_id)
 {
 	struct virtio_device *virtio_dev = virtio_pci_vf_get_pf_dev(pdev);
-	u16 data_size = sizeof(struct virtio_admin_cmd_resource_obj_create_data);
-	struct virtio_admin_cmd_resource_obj_create_data *obj_create_data;
 	struct virtio_resource_obj_dev_parts obj_dev_parts = {};
 	struct virtio_pci_admin_vq *avq;
-	struct virtio_admin_cmd cmd = {};
-	struct scatterlist data_sg;
-	void *data;
 	int id = -1;
 	int vf_id;
 	int ret;
@@ -1000,9 +1033,6 @@ int virtio_pci_admin_obj_create(struct pci_dev *pdev, u16 obj_type, u8 operation
 	if (vf_id < 0)
 		return vf_id;
 
-	if (obj_type != VIRTIO_RESOURCE_OBJ_DEV_PARTS)
-		return -EOPNOTSUPP;
-
 	if (operation_type != VIRTIO_RESOURCE_OBJ_DEV_PARTS_TYPE_GET &&
 	    operation_type != VIRTIO_RESOURCE_OBJ_DEV_PARTS_TYPE_SET)
 		return -EINVAL;
@@ -1016,52 +1046,66 @@ int virtio_pci_admin_obj_create(struct pci_dev *pdev, u16 obj_type, u8 operation
 	if (id < 0)
 		return id;
 
-	*obj_id = id;
-	data_size += sizeof(obj_dev_parts);
-	data = kzalloc(data_size, GFP_KERNEL);
-	if (!data) {
-		ret = -ENOMEM;
-		goto end;
-	}
-
-	obj_create_data = data;
-	obj_create_data->hdr.type = cpu_to_le16(obj_type);
-	obj_create_data->hdr.id = cpu_to_le32(*obj_id);
 	obj_dev_parts.type = operation_type;
-	memcpy(obj_create_data->resource_obj_specific_data, &obj_dev_parts,
-	       sizeof(obj_dev_parts));
-	sg_init_one(&data_sg, data, data_size);
-	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_CREATE);
-	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SRIOV);
-	cmd.group_member_id = cpu_to_le64(vf_id + 1);
-	cmd.data_sg = &data_sg;
-	ret = vp_modern_admin_cmd_exec(virtio_dev, &cmd);
+	ret = vp_modern_admin_cmd_obj_create(virtio_dev,
+					     VIRTIO_RESOURCE_OBJ_DEV_PARTS,
+					     id,
+					     VIRTIO_ADMIN_GROUP_TYPE_SRIOV,
+					     vf_id + 1,
+					     &obj_dev_parts,
+					     sizeof(obj_dev_parts));
 
-	kfree(data);
-end:
 	if (ret)
 		ida_free(&avq->dev_parts_ida, id);
+	else
+		*obj_id = id;
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(virtio_pci_admin_obj_create);
+EXPORT_SYMBOL_GPL(virtio_pci_admin_dev_parts_obj_create);
 
-/*
- * virtio_pci_admin_obj_destroy - Destroys an object of a given type and id
- * @pdev: VF pci_dev
- * @obj_type: Object type
- * @id: Object id
+static int vp_modern_admin_cmd_obj_destroy(struct virtio_device *virtio_dev,
+					   u16 obj_type,
+					   u32 obj_id,
+					   u16 group_type,
+					   u64 group_member_id)
+{
+	struct virtio_admin_cmd_resource_obj_cmd_hdr *data __free(kfree) = NULL;
+	struct virtio_admin_cmd cmd = {};
+	struct scatterlist data_sg;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->type = cpu_to_le16(obj_type);
+	data->id = cpu_to_le32(obj_id);
+	sg_init_one(&data_sg, data, sizeof(*data));
+	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_DESTROY);
+	cmd.group_type = cpu_to_le16(group_type);
+	cmd.group_member_id = cpu_to_le64(group_member_id);
+	cmd.data_sg = &data_sg;
+
+	return vp_modern_admin_cmd_exec(virtio_dev, &cmd);
+}
+
+/**
+ * virtio_pci_admin_dev_parts_obj_destroy - Destroy a device parts object
+ * @pdev: VF PCI device
+ * @obj_id: ID of the object to destroy
  *
- * Note: caller must serialize access for the given device.
- * Returns 0 on success, or negative on failure.
+ * This function destroys a device parts object with the specified ID for the
+ * given VF PCI device. The object must have been previously created using
+ * virtio_pci_admin_dev_parts_obj_create(). The caller must serialize access
+ * for the given device.
+ *
+ * Return: 0 on success, -ENODEV if the virtio device is not found,
+ * or a negative error code on other failures.
  */
-int virtio_pci_admin_obj_destroy(struct pci_dev *pdev, u16 obj_type, u32 id)
+int virtio_pci_admin_dev_parts_obj_destroy(struct pci_dev *pdev, u32 obj_id)
 {
 	struct virtio_device *virtio_dev = virtio_pci_vf_get_pf_dev(pdev);
-	struct virtio_admin_cmd_resource_obj_cmd_hdr *data;
 	struct virtio_pci_device *vp_dev;
-	struct virtio_admin_cmd cmd = {};
-	struct scatterlist data_sg;
 	int vf_id;
 	int ret;
 
@@ -1072,30 +1116,19 @@ int virtio_pci_admin_obj_destroy(struct pci_dev *pdev, u16 obj_type, u32 id)
 	if (vf_id < 0)
 		return vf_id;
 
-	if (obj_type != VIRTIO_RESOURCE_OBJ_DEV_PARTS)
-		return -EINVAL;
-
-	data = kzalloc(sizeof(*data), GFP_KERNEL);
-	if (!data)
-		return -ENOMEM;
-
-	data->type = cpu_to_le16(obj_type);
-	data->id = cpu_to_le32(id);
-	sg_init_one(&data_sg, data, sizeof(*data));
-	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_DESTROY);
-	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SRIOV);
-	cmd.group_member_id = cpu_to_le64(vf_id + 1);
-	cmd.data_sg = &data_sg;
-	ret = vp_modern_admin_cmd_exec(virtio_dev, &cmd);
+	ret = vp_modern_admin_cmd_obj_destroy(virtio_dev,
+					      VIRTIO_RESOURCE_OBJ_DEV_PARTS,
+					      obj_id,
+					      VIRTIO_ADMIN_GROUP_TYPE_SRIOV,
+					      vf_id + 1);
 	if (!ret) {
 		vp_dev = to_vp_device(virtio_dev);
-		ida_free(&vp_dev->admin_vq.dev_parts_ida, id);
+		ida_free(&vp_dev->admin_vq.dev_parts_ida, obj_id);
 	}
 
-	kfree(data);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(virtio_pci_admin_obj_destroy);
+EXPORT_SYMBOL_GPL(virtio_pci_admin_dev_parts_obj_destroy);
 
 /*
  * virtio_pci_admin_dev_parts_metadata_get - Gets the metadata of the device parts
@@ -1289,6 +1322,8 @@ static const struct virtio_admin_ops virtio_pci_admin_ops = {
 	.cap_id_list_query = vp_modern_admin_cap_id_list_query,
 	.cap_get = vp_modern_admin_cmd_cap_get,
 	.cap_set = vp_modern_admin_cmd_cap_set,
+	.object_create = vp_modern_admin_cmd_obj_create,
+	.object_destroy = vp_modern_admin_cmd_obj_destroy,
 };
 /* the PCI probing function */
 int virtio_pci_modern_probe(struct virtio_pci_device *vp_dev)
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index 7ab4ea75ad44..543ba266d24c 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -209,6 +209,14 @@ int virtio_device_cap_set(struct virtio_device *vdev,
 			  u16 id,
 			  const void *caps,
 			  size_t cap_size);
+int virtio_device_object_create(struct virtio_device *virtio_dev,
+				u16 obj_type,
+				u32 obj_id,
+				const void *obj_specific_data,
+				size_t obj_specific_data_size);
+int virtio_device_object_destroy(struct virtio_device *virtio_dev,
+				 u16 obj_type,
+				 u32 obj_id);
 
 size_t virtio_max_dma_size(const struct virtio_device *vdev);
 
diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
index bbf543d20be4..cc6b82461c9f 100644
--- a/include/linux/virtio_admin.h
+++ b/include/linux/virtio_admin.h
@@ -63,6 +63,38 @@ struct virtio_admin_ops {
 		       u16 id,
 		       const void *caps,
 		       size_t cap_size);
+	/**
+	 * @object_create: Create a new object of specified type
+	 * @virtio_dev: The virtio device
+	 * @obj_type: Type of object to create
+	 * @obj_id: ID to assign to the created object
+	 * @group_type: Type of group the object belongs to
+	 * @group_member_id: Member ID within the group
+	 * @obj_specific_data: Object-specific data (must be heap allocated)
+	 * @obj_specific_data_size: Size of the object-specific data
+	 * Returns: 0 on success, negative error code on failure
+	 */
+	int (*object_create)(struct virtio_device *virtio_dev,
+			     u16 obj_type,
+			     u32 obj_id,
+			     u16 group_type,
+			     u64 group_member_id,
+			     const void *obj_specific_data,
+			     size_t obj_specific_data_size);
+	/**
+	 * @object_destroy: Destroy an existing object
+	 * @virtio_dev: The virtio device
+	 * @obj_type: Type of object to destroy
+	 * @obj_id: ID of the object to destroy
+	 * @group_type: Type of group the object belongs to
+	 * @group_member_id: Member ID within the group
+	 * Returns: 0 on success, negative error code on failure
+	 */
+	int (*object_destroy)(struct virtio_device *virtio_dev,
+			      u16 obj_type,
+			      u32 obj_id,
+			      u16 group_type,
+			      u64 group_member_id);
 };
 
 #endif /* _LINUX_VIRTIO_ADMIN_H */
diff --git a/include/linux/virtio_pci_admin.h b/include/linux/virtio_pci_admin.h
index dffc92c17ad2..da9b8495bce4 100644
--- a/include/linux/virtio_pci_admin.h
+++ b/include/linux/virtio_pci_admin.h
@@ -22,9 +22,10 @@ int virtio_pci_admin_legacy_io_notify_info(struct pci_dev *pdev,
 
 bool virtio_pci_admin_has_dev_parts(struct pci_dev *pdev);
 int virtio_pci_admin_mode_set(struct pci_dev *pdev, u8 mode);
-int virtio_pci_admin_obj_create(struct pci_dev *pdev, u16 obj_type, u8 operation_type,
-				u32 *obj_id);
-int virtio_pci_admin_obj_destroy(struct pci_dev *pdev, u16 obj_type, u32 id);
+int virtio_pci_admin_dev_parts_obj_create(struct pci_dev *pdev,
+					  u8 operation_type,
+					  u32 *obj_id);
+int virtio_pci_admin_dev_parts_obj_destroy(struct pci_dev *pdev, u32 obj_id);
 int virtio_pci_admin_dev_parts_metadata_get(struct pci_dev *pdev, u16 obj_type,
 					    u32 id, u8 metadata_type, u32 *out);
 int virtio_pci_admin_dev_parts_get(struct pci_dev *pdev, u16 obj_type, u32 id,
-- 
2.45.0



* [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
  2025-09-23 14:19 ` [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations Daniel Jurgens
  2025-09-23 14:19 ` [PATCH net-next v3 02/11] virtio-pci: Expose object create and destroy API Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-25  3:56   ` Xuan Zhuo
  2025-09-25 21:17   ` Michael S. Tsirkin
  2025-09-23 14:19 ` [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps Daniel Jurgens
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

The flow filter implementation requires minimal changes to the
existing virtio_net implementation, so it is cleaner to keep it in a
separate file. To do so, move virtio_net.c into a new virtio_net
directory and create a Makefile for it. The file is renamed to
virtio_net_main.c so that the module can retain the name virtio_net.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
 MAINTAINERS                                               | 2 +-
 drivers/net/Makefile                                      | 2 +-
 drivers/net/virtio_net/Makefile                           | 8 ++++++++
 .../net/{virtio_net.c => virtio_net/virtio_net_main.c}    | 0
 4 files changed, 10 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/virtio_net/Makefile
 rename drivers/net/{virtio_net.c => virtio_net/virtio_net_main.c} (100%)

diff --git a/MAINTAINERS b/MAINTAINERS
index a8a770714101..09d26c4225a9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -26685,7 +26685,7 @@ F:	Documentation/devicetree/bindings/virtio/
 F:	Documentation/driver-api/virtio/
 F:	drivers/block/virtio_blk.c
 F:	drivers/crypto/virtio/
-F:	drivers/net/virtio_net.c
+F:	drivers/net/virtio_net/
 F:	drivers/vdpa/
 F:	drivers/virtio/
 F:	include/linux/vdpa.h
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 73bc63ecd65f..cf28992658a6 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -33,7 +33,7 @@ obj-$(CONFIG_NET_TEAM) += team/
 obj-$(CONFIG_TUN) += tun.o
 obj-$(CONFIG_TAP) += tap.o
 obj-$(CONFIG_VETH) += veth.o
-obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
+obj-$(CONFIG_VIRTIO_NET) += virtio_net/
 obj-$(CONFIG_VXLAN) += vxlan/
 obj-$(CONFIG_GENEVE) += geneve.o
 obj-$(CONFIG_BAREUDP) += bareudp.o
diff --git a/drivers/net/virtio_net/Makefile b/drivers/net/virtio_net/Makefile
new file mode 100644
index 000000000000..c0a4725ddd69
--- /dev/null
+++ b/drivers/net/virtio_net/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Makefile for the VirtIO Net driver
+#
+
+obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
+
+virtio_net-objs := virtio_net_main.o
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net/virtio_net_main.c
similarity index 100%
rename from drivers/net/virtio_net.c
rename to drivers/net/virtio_net/virtio_net_main.c
-- 
2.45.0



* [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (2 preceding siblings ...)
  2025-09-23 14:19 ` [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-25 21:01   ` Michael S. Tsirkin
                     ` (3 more replies)
  2025-09-23 14:19 ` [PATCH net-next v3 05/11] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
                   ` (7 subsequent siblings)
  11 siblings, 4 replies; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

When probing a virtnet device, attempt to read the flow filter
capabilities. To use the feature, the caps must also be set. For now,
setting what was read is sufficient.
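
As an illustration of the read-then-set step: the selector capability
is a small header followed by variable-length selectors, so the number
of bytes to write back has to be found by walking the list. The
following is a minimal user-space sketch, not the driver code. The
ex_-prefixed names are hypothetical, the structs only mirror the layout
of the UAPI structs added by this patch, and the bounds checks are an
addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct virtio_net_ff_selector: 8-byte header, then the mask. */
struct ex_ff_selector {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;		/* number of mask bytes that follow */
	uint8_t reserved1[3];
	uint8_t mask[];
};

/* Mirrors struct virtio_net_ff_cap_mask_data: 8-byte header, then selectors. */
struct ex_ff_cap_mask_data {
	uint8_t count;		/* number of selectors that follow */
	uint8_t reserved[7];
	uint8_t selectors[];
};

/*
 * Walk the selectors in buf and return the number of bytes actually
 * used by the capability, or 0 if the buffer is too small to hold what
 * the headers claim (assumption: the caller treats 0 as an error).
 */
size_t ex_ff_mask_used_size(const uint8_t *buf, size_t buf_size)
{
	size_t off = sizeof(struct ex_ff_cap_mask_data);
	uint8_t count;
	uint8_t i;

	if (buf_size < off)
		return 0;

	count = buf[offsetof(struct ex_ff_cap_mask_data, count)];

	for (i = 0; i < count; i++) {
		uint8_t len;

		/* The selector header must fit in the remaining bytes. */
		if (buf_size - off < sizeof(struct ex_ff_selector))
			return 0;
		len = buf[off + offsetof(struct ex_ff_selector, length)];
		off += sizeof(struct ex_ff_selector);

		/* The mask bytes must fit as well. */
		if (buf_size - off < len)
			return 0;
		off += len;
	}

	return off;
}

/* Usage: two selectors, a 14-byte Ethernet mask and a 20-byte IPv4 mask. */
void ex_ff_mask_example(void)
{
	uint8_t buf[64] = {0};

	buf[0] = 2;			/* count */
	buf[8] = 1;  buf[12] = 14;	/* selector 0: type ETH, length 14 */
	buf[30] = 2; buf[34] = 20;	/* selector 1: type IPv4, length 20 */

	assert(ex_ff_mask_used_size(buf, sizeof(buf)) == 58);
	assert(ex_ff_mask_used_size(buf, 40) == 0);	/* truncated buffer */
}
```

The computed size is the length a caller would pass when setting the
selector capability back to the device.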

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
 drivers/net/virtio_net/Makefile          |   2 +-
 drivers/net/virtio_net/virtio_net_ff.c   | 145 +++++++++++++++++++++++
 drivers/net/virtio_net/virtio_net_ff.h   |  22 ++++
 drivers/net/virtio_net/virtio_net_main.c |   7 ++
 include/linux/virtio_admin.h             |   1 +
 include/uapi/linux/virtio_net_ff.h       |  55 +++++++++
 6 files changed, 231 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/virtio_net/virtio_net_ff.c
 create mode 100644 drivers/net/virtio_net/virtio_net_ff.h
 create mode 100644 include/uapi/linux/virtio_net_ff.h

diff --git a/drivers/net/virtio_net/Makefile b/drivers/net/virtio_net/Makefile
index c0a4725ddd69..c41a587ffb5b 100644
--- a/drivers/net/virtio_net/Makefile
+++ b/drivers/net/virtio_net/Makefile
@@ -5,4 +5,4 @@
 
 obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
 
-virtio_net-objs := virtio_net_main.o
+virtio_net-objs := virtio_net_main.o virtio_net_ff.o
diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
new file mode 100644
index 000000000000..61cb45331c97
--- /dev/null
+++ b/drivers/net/virtio_net/virtio_net_ff.c
@@ -0,0 +1,145 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/virtio_admin.h>
+#include <linux/virtio.h>
+#include <net/ipv6.h>
+#include <net/ip.h>
+#include "virtio_net_ff.h"
+
+static size_t get_mask_size(u16 type)
+{
+	switch (type) {
+	case VIRTIO_NET_FF_MASK_TYPE_ETH:
+		return sizeof(struct ethhdr);
+	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
+		return sizeof(struct iphdr);
+	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
+		return sizeof(struct ipv6hdr);
+	case VIRTIO_NET_FF_MASK_TYPE_TCP:
+		return sizeof(struct tcphdr);
+	case VIRTIO_NET_FF_MASK_TYPE_UDP:
+		return sizeof(struct udphdr);
+	}
+
+	return 0;
+}
+
+void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
+{
+	struct virtio_admin_cmd_query_cap_id_result *cap_id_list __free(kfree) = NULL;
+	size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
+			      sizeof(struct virtio_net_ff_selector) *
+			      VIRTIO_NET_FF_MASK_TYPE_MAX;
+	struct virtio_net_ff_selector *sel;
+	int err;
+	int i;
+
+	cap_id_list = kzalloc(sizeof(*cap_id_list), GFP_KERNEL);
+	if (!cap_id_list)
+		return;
+
+	err = virtio_device_cap_id_list_query(vdev, cap_id_list);
+	if (err)
+		return;
+
+	if (!(VIRTIO_CAP_IN_LIST(cap_id_list,
+				 VIRTIO_NET_FF_RESOURCE_CAP) &&
+	      VIRTIO_CAP_IN_LIST(cap_id_list,
+				 VIRTIO_NET_FF_SELECTOR_CAP) &&
+	      VIRTIO_CAP_IN_LIST(cap_id_list,
+				 VIRTIO_NET_FF_ACTION_CAP)))
+		return;
+
+	ff->ff_caps = kzalloc(sizeof(*ff->ff_caps), GFP_KERNEL);
+	if (!ff->ff_caps)
+		return;
+
+	err = virtio_device_cap_get(vdev,
+				    VIRTIO_NET_FF_RESOURCE_CAP,
+				    ff->ff_caps,
+				    sizeof(*ff->ff_caps));
+
+	if (err)
+		goto err_ff;
+
+	/* VIRTIO_NET_FF_MASK_TYPE start at 1 */
+	for (i = 1; i <= VIRTIO_NET_FF_MASK_TYPE_MAX; i++)
+		ff_mask_size += get_mask_size(i);
+
+	ff->ff_mask = kzalloc(ff_mask_size, GFP_KERNEL);
+	if (!ff->ff_mask)
+		goto err_ff;
+
+	err = virtio_device_cap_get(vdev,
+				    VIRTIO_NET_FF_SELECTOR_CAP,
+				    ff->ff_mask,
+				    ff_mask_size);
+
+	if (err)
+		goto err_ff_mask;
+
+	ff->ff_actions = kzalloc(sizeof(*ff->ff_actions) +
+					VIRTIO_NET_FF_ACTION_MAX,
+					GFP_KERNEL);
+	if (!ff->ff_actions)
+		goto err_ff_mask;
+
+	err = virtio_device_cap_get(vdev,
+				    VIRTIO_NET_FF_ACTION_CAP,
+				    ff->ff_actions,
+				    sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
+
+	if (err)
+		goto err_ff_action;
+
+	err = virtio_device_cap_set(vdev,
+				    VIRTIO_NET_FF_RESOURCE_CAP,
+				    ff->ff_caps,
+				    sizeof(*ff->ff_caps));
+	if (err)
+		goto err_ff_action;
+
+	ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
+	sel = &ff->ff_mask->selectors[0];
+
+	for (int i = 0; i < ff->ff_mask->count; i++) {
+		ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
+		sel = (struct virtio_net_ff_selector *)((u8 *)sel + sizeof(*sel) + sel->length);
+	}
+
+	err = virtio_device_cap_set(vdev,
+				    VIRTIO_NET_FF_SELECTOR_CAP,
+				    ff->ff_mask,
+				    ff_mask_size);
+	if (err)
+		goto err_ff_action;
+
+	err = virtio_device_cap_set(vdev,
+				    VIRTIO_NET_FF_ACTION_CAP,
+				    ff->ff_actions,
+				    sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
+	if (err)
+		goto err_ff_action;
+
+	ff->vdev = vdev;
+	ff->ff_supported = true;
+
+	return;
+
+err_ff_action:
+	kfree(ff->ff_actions);
+err_ff_mask:
+	kfree(ff->ff_mask);
+err_ff:
+	kfree(ff->ff_caps);
+}
+
+void virtnet_ff_cleanup(struct virtnet_ff *ff)
+{
+	if (!ff->ff_supported)
+		return;
+
+	kfree(ff->ff_actions);
+	kfree(ff->ff_mask);
+	kfree(ff->ff_caps);
+}
diff --git a/drivers/net/virtio_net/virtio_net_ff.h b/drivers/net/virtio_net/virtio_net_ff.h
new file mode 100644
index 000000000000..4aac0bd08b63
--- /dev/null
+++ b/drivers/net/virtio_net/virtio_net_ff.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ *
+ * Header file for virtio_net flow filters
+ */
+#include <linux/virtio_admin.h>
+
+#ifndef _VIRTIO_NET_FF_H
+#define _VIRTIO_NET_FF_H
+
+struct virtnet_ff {
+	struct virtio_device *vdev;
+	bool ff_supported;
+	struct virtio_net_ff_cap_data *ff_caps;
+	struct virtio_net_ff_cap_mask_data *ff_mask;
+	struct virtio_net_ff_actions *ff_actions;
+};
+
+void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev);
+
+void virtnet_ff_cleanup(struct virtnet_ff *ff);
+
+#endif /* _VIRTIO_NET_FF_H */
diff --git a/drivers/net/virtio_net/virtio_net_main.c b/drivers/net/virtio_net/virtio_net_main.c
index 7da5a37917e9..ebf3e5db0d64 100644
--- a/drivers/net/virtio_net/virtio_net_main.c
+++ b/drivers/net/virtio_net/virtio_net_main.c
@@ -26,6 +26,7 @@
 #include <net/netdev_rx_queue.h>
 #include <net/netdev_queues.h>
 #include <net/xdp_sock_drv.h>
+#include "virtio_net_ff.h"
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -493,6 +494,8 @@ struct virtnet_info {
 	struct failover *failover;
 
 	u64 device_stats_cap;
+
+	struct virtnet_ff ff;
 };
 
 struct padded_vnet_hdr {
@@ -7116,6 +7119,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 	}
 	vi->guest_offloads_capable = vi->guest_offloads;
 
+	virtnet_ff_init(&vi->ff, vi->vdev);
+
 	rtnl_unlock();
 
 	err = virtnet_cpu_notif_add(vi);
@@ -7131,6 +7136,7 @@ static int virtnet_probe(struct virtio_device *vdev)
 
 free_unregister_netdev:
 	unregister_netdev(dev);
+	virtnet_ff_cleanup(&vi->ff);
 free_failover:
 	net_failover_destroy(vi->failover);
 free_vqs:
@@ -7180,6 +7186,7 @@ static void virtnet_remove(struct virtio_device *vdev)
 	virtnet_free_irq_moder(vi);
 
 	unregister_netdev(vi->dev);
+	virtnet_ff_cleanup(&vi->ff);
 
 	net_failover_destroy(vi->failover);
 
diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
index cc6b82461c9f..f8f1369d1175 100644
--- a/include/linux/virtio_admin.h
+++ b/include/linux/virtio_admin.h
@@ -3,6 +3,7 @@
  * Header file for virtio admin operations
  */
 #include <uapi/linux/virtio_pci.h>
+#include <uapi/linux/virtio_net_ff.h>
 
 #ifndef _LINUX_VIRTIO_ADMIN_H
 #define _LINUX_VIRTIO_ADMIN_H
diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
new file mode 100644
index 000000000000..a35533bf8377
--- /dev/null
+++ b/include/uapi/linux/virtio_net_ff.h
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+ *
+ * Header file for virtio_net flow filters
+ */
+#ifndef _LINUX_VIRTIO_NET_FF_H
+#define _LINUX_VIRTIO_NET_FF_H
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+
+#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
+#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
+#define VIRTIO_NET_FF_ACTION_CAP 0x802
+
+struct virtio_net_ff_cap_data {
+	__le32 groups_limit;
+	__le32 classifiers_limit;
+	__le32 rules_limit;
+	__le32 rules_per_group_limit;
+	__u8 last_rule_priority;
+	__u8 selectors_per_classifier_limit;
+};
+
+struct virtio_net_ff_selector {
+	__u8 type;
+	__u8 flags;
+	__u8 reserved[2];
+	__u8 length;
+	__u8 reserved1[3];
+	__u8 mask[];
+};
+
+#define VIRTIO_NET_FF_MASK_TYPE_ETH  1
+#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
+#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
+#define VIRTIO_NET_FF_MASK_TYPE_TCP  4
+#define VIRTIO_NET_FF_MASK_TYPE_UDP  5
+#define VIRTIO_NET_FF_MASK_TYPE_MAX  VIRTIO_NET_FF_MASK_TYPE_UDP
+
+struct virtio_net_ff_cap_mask_data {
+	__u8 count;
+	__u8 reserved[7];
+	struct virtio_net_ff_selector selectors[];
+};
+#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
+
+#define VIRTIO_NET_FF_ACTION_DROP 1
+#define VIRTIO_NET_FF_ACTION_RX_VQ 2
+#define VIRTIO_NET_FF_ACTION_MAX  VIRTIO_NET_FF_ACTION_RX_VQ
+struct virtio_net_ff_actions {
+	__u8 count;
+	__u8 reserved[7];
+	__u8 actions[];
+};
+#endif
-- 
2.45.0



* [PATCH net-next v3 05/11] virtio_net: Create a FF group for ethtool steering
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (3 preceding siblings ...)
  2025-09-23 14:19 ` [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-25 21:13   ` Michael S. Tsirkin
  2025-09-23 14:19 ` [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

All ethtool steering rules go in one group. Create it during
initialization.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
 drivers/net/virtio_net/virtio_net_ff.c | 25 +++++++++++++++++++++++++
 include/uapi/linux/virtio_net_ff.h     |  7 +++++++
 2 files changed, 32 insertions(+)

diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
index 61cb45331c97..0036c2db9f77 100644
--- a/drivers/net/virtio_net/virtio_net_ff.c
+++ b/drivers/net/virtio_net/virtio_net_ff.c
@@ -6,6 +6,9 @@
 #include <net/ip.h>
 #include "virtio_net_ff.h"
 
+#define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
+#define VIRTNET_FF_MAX_GROUPS 1
+
 static size_t get_mask_size(u16 type)
 {
 	switch (type) {
@@ -30,6 +33,7 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 	size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
 			      sizeof(struct virtio_net_ff_selector) *
 			      VIRTIO_NET_FF_MASK_TYPE_MAX;
+	struct virtio_net_resource_obj_ff_group ethtool_group = {};
 	struct virtio_net_ff_selector *sel;
 	int err;
 	int i;
@@ -92,6 +96,12 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 	if (err)
 		goto err_ff_action;
 
+	if (le32_to_cpu(ff->ff_caps->groups_limit) < VIRTNET_FF_MAX_GROUPS) {
+		err = -ENOSPC;
+		goto err_ff_action;
+	}
+	ff->ff_caps->groups_limit = cpu_to_le32(VIRTNET_FF_MAX_GROUPS);
+
 	err = virtio_device_cap_set(vdev,
 				    VIRTIO_NET_FF_RESOURCE_CAP,
 				    ff->ff_caps,
@@ -121,6 +131,17 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 	if (err)
 		goto err_ff_action;
 
+	ethtool_group.group_priority = cpu_to_le16(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
+
+	/* Use priority for the object ID. */
+	err = virtio_device_object_create(vdev,
+					  VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
+					  VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
+					  &ethtool_group,
+					  sizeof(ethtool_group));
+	if (err)
+		goto err_ff_action;
+
 	ff->vdev = vdev;
 	ff->ff_supported = true;
 
@@ -139,6 +160,10 @@ void virtnet_ff_cleanup(struct virtnet_ff *ff)
 	if (!ff->ff_supported)
 		return;
 
+	virtio_device_object_destroy(ff->vdev,
+				     VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
+				     VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
+
 	kfree(ff->ff_actions);
 	kfree(ff->ff_mask);
 	kfree(ff->ff_caps);
diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
index a35533bf8377..662693e1fefd 100644
--- a/include/uapi/linux/virtio_net_ff.h
+++ b/include/uapi/linux/virtio_net_ff.h
@@ -12,6 +12,8 @@
 #define VIRTIO_NET_FF_SELECTOR_CAP 0x801
 #define VIRTIO_NET_FF_ACTION_CAP 0x802
 
+#define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
+
 struct virtio_net_ff_cap_data {
 	__le32 groups_limit;
 	__le32 classifiers_limit;
@@ -52,4 +54,9 @@ struct virtio_net_ff_actions {
 	__u8 reserved[7];
 	__u8 actions[];
 };
+
+struct virtio_net_resource_obj_ff_group {
+	__le16 group_priority;
+};
+
 #endif
-- 
2.45.0



* [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (4 preceding siblings ...)
  2025-09-23 14:19 ` [PATCH net-next v3 05/11] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-25 20:58   ` Michael S. Tsirkin
                     ` (2 more replies)
  2025-09-23 14:19 ` [PATCH net-next v3 07/11] virtio_net: Use existing classifier if possible Daniel Jurgens
                   ` (5 subsequent siblings)
  11 siblings, 3 replies; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

Filtering a flow requires a classifier to match the packets, and a rule
to filter on the matches.

A classifier consists of one or more selectors. There is one selector
per header type. A selector must only use fields set in the selector
capability. If partial matching is supported, the classifier mask for a
particular field can be a subset of the mask for that field in the
capability.
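
The mask rule above can be sketched in user-space C as follows. This is
an illustration only: ex_mask_allowed() is a hypothetical name, and the
sketch assumes that "subset" means every bit set in the requested mask
is also set in the capability mask, while without partial matching the
two masks must be identical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if the requested field mask is acceptable given the
 * capability mask for the same field.
 */
bool ex_mask_allowed(const uint8_t *mask, const uint8_t *cap,
		     size_t len, bool partial)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (partial) {
			/* Every requested bit must be offered by the cap. */
			if ((mask[i] & cap[i]) != mask[i])
				return false;
		} else if (mask[i] != cap[i]) {
			/* Without partial matching only an exact mask is valid. */
			return false;
		}
	}

	return true;
}

/* Usage: a full 6-byte MAC capability and a request for the first 3 bytes. */
void ex_mask_allowed_example(void)
{
	const uint8_t cap[6] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
	const uint8_t oui[6] = {0xff, 0xff, 0xff, 0x00, 0x00, 0x00};

	assert(ex_mask_allowed(cap, cap, sizeof(cap), false));
	assert(ex_mask_allowed(oui, cap, sizeof(cap), true));
	assert(!ex_mask_allowed(oui, cap, sizeof(cap), false));
}
```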

The rule consists of a priority, an action and a key. The key is a byte
array containing headers corresponding to the selectors in the
classifier.
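
To make the selector mask and rule key relationship concrete, here is a
small user-space sketch for the Ethernet case. The ex_-prefixed names
are hypothetical, and the matching function assumes the usual masked
comparison (a header matches when it equals the key on every masked
bit). It is not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_ETH_ALEN 6

/* Wire layout of an Ethernet header: dst MAC, src MAC, EtherType. */
struct ex_eth_hdr {
	uint8_t dst[EX_ETH_ALEN];
	uint8_t src[EX_ETH_ALEN];
	uint8_t proto[2];	/* big endian on the wire */
};

/* Build a selector mask and a rule key that match only on the dst MAC. */
void ex_build_eth_dst_match(const uint8_t dst[EX_ETH_ALEN],
			    struct ex_eth_hdr *mask,
			    struct ex_eth_hdr *key)
{
	memset(mask, 0, sizeof(*mask));
	memset(key, 0, sizeof(*key));
	memset(mask->dst, 0xff, EX_ETH_ALEN);
	memcpy(key->dst, dst, EX_ETH_ALEN);
}

/* Compare a packet header against the key, looking only at masked bits. */
bool ex_eth_matches(const uint8_t *hdr,
		    const struct ex_eth_hdr *mask,
		    const struct ex_eth_hdr *key)
{
	const uint8_t *m = (const uint8_t *)mask;
	const uint8_t *k = (const uint8_t *)key;
	size_t i;

	for (i = 0; i < sizeof(*key); i++) {
		if ((hdr[i] & m[i]) != (k[i] & m[i]))
			return false;
	}

	return true;
}

/* Usage: match packets sent to 08:11:22:33:44:54, whatever the source. */
void ex_eth_match_example(void)
{
	const uint8_t dst[EX_ETH_ALEN] = {0x08, 0x11, 0x22, 0x33, 0x44, 0x54};
	uint8_t pkt[sizeof(struct ex_eth_hdr)] = {
		0x08, 0x11, 0x22, 0x33, 0x44, 0x54,	/* dst */
		0x1c, 0x34, 0xda, 0x4a, 0x33, 0xdd,	/* src, ignored */
		0x08, 0x00,				/* EtherType, ignored */
	};
	struct ex_eth_hdr mask, key;

	ex_build_eth_dst_match(dst, &mask, &key);
	assert(ex_eth_matches(pkt, &mask, &key));

	pkt[5] = 0x55;	/* different destination MAC */
	assert(!ex_eth_matches(pkt, &mask, &key));
}
```

The mask and the key have the same length as the header they describe,
which is why the key can be carried as a plain byte array.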

This patch implements ethtool rules for ethernet headers.

Example:
$ ethtool -U ens9 flow-type ether dst 08:11:22:33:44:54 action 30
Added rule with ID 1

The rule in the example directs received packets with the specified
destination MAC address to receive queue 30.
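
The way the ethtool "action" value ends up in the rule can be sketched
as below. This is a user-space illustration that mirrors the mapping
used in this version of the patch. The ex_-prefixed names are
hypothetical, the values are kept in host byte order, and no range
check against the number of queues is done here.

```c
#include <assert.h>
#include <stdint.h>

/* Value of RX_CLS_FLOW_DISC in the ethtool UAPI ("action -1"). */
#define EX_RX_CLS_FLOW_DISC	0xffffffffffffffffULL
/* Same values as VIRTIO_NET_FF_ACTION_DROP and VIRTIO_NET_FF_ACTION_RX_VQ. */
#define EX_FF_ACTION_DROP	1
#define EX_FF_ACTION_RX_VQ	2

struct ex_rule_action {
	uint8_t action;
	uint16_t vq_index;	/* only meaningful for EX_FF_ACTION_RX_VQ */
};

/* Translate an ethtool ring_cookie into a flow filter action. */
struct ex_rule_action ex_action_from_cookie(uint64_t ring_cookie)
{
	struct ex_rule_action a = { .action = EX_FF_ACTION_RX_VQ, .vq_index = 0 };

	if (ring_cookie == EX_RX_CLS_FLOW_DISC) {
		a.action = EX_FF_ACTION_DROP;
		return a;
	}

	a.vq_index = (uint16_t)ring_cookie;
	return a;
}

/* Usage: "action 30" steers to a queue, "action -1" drops. */
void ex_action_example(void)
{
	struct ex_rule_action steer = ex_action_from_cookie(30);
	struct ex_rule_action drop = ex_action_from_cookie(EX_RX_CLS_FLOW_DISC);

	assert(steer.action == EX_FF_ACTION_RX_VQ && steer.vq_index == 30);
	assert(drop.action == EX_FF_ACTION_DROP && drop.vq_index == 0);
}
```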

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
 drivers/net/virtio_net/virtio_net_ff.c   | 423 +++++++++++++++++++++++
 drivers/net/virtio_net/virtio_net_ff.h   |  14 +
 drivers/net/virtio_net/virtio_net_main.c |  16 +
 include/uapi/linux/virtio_net_ff.h       |  20 ++
 4 files changed, 473 insertions(+)

diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
index 0036c2db9f77..e3c34bfd1d55 100644
--- a/drivers/net/virtio_net/virtio_net_ff.c
+++ b/drivers/net/virtio_net/virtio_net_ff.c
@@ -9,6 +9,418 @@
 #define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
 #define VIRTNET_FF_MAX_GROUPS 1
 
+struct virtnet_ethtool_rule {
+	struct ethtool_rx_flow_spec flow_spec;
+	u32 classifier_id;
+};
+
+/* New fields must be added before the classifier struct */
+struct virtnet_classifier {
+	size_t size;
+	u32 id;
+	struct virtio_net_resource_obj_ff_classifier classifier;
+};
+
+static bool check_mask_vs_cap(const void *m, const void *c,
+			      u16 len, bool partial)
+{
+	const u8 *mask = m;
+	const u8 *cap = c;
+	int i;
+
+	for (i = 0; i < len; i++) {
+		if (partial && ((mask[i] & cap[i]) != mask[i]))
+			return false;
+		if (!partial && mask[i] != cap[i])
+			return false;
+	}
+
+	return true;
+}
+
+static
+struct virtio_net_ff_selector *get_selector_cap(const struct virtnet_ff *ff,
+						u8 selector_type)
+{
+	struct virtio_net_ff_selector *sel;
+	u8 *buf;
+	int i;
+
+	buf = (u8 *)&ff->ff_mask->selectors;
+	sel = (struct virtio_net_ff_selector *)buf;
+
+	for (i = 0; i < ff->ff_mask->count; i++) {
+		if (sel->type == selector_type)
+			return sel;
+
+		buf += sizeof(struct virtio_net_ff_selector) + sel->length;
+		sel = (struct virtio_net_ff_selector *)buf;
+	}
+
+	return NULL;
+}
+
+static bool validate_eth_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct ethhdr *cap, *mask;
+	struct ethhdr zeros = {0};
+
+	cap = (struct ethhdr *)&sel_cap->mask;
+	mask = (struct ethhdr *)&sel->mask;
+
+	if (memcmp(&zeros.h_dest, mask->h_dest, sizeof(zeros.h_dest)) &&
+	    !check_mask_vs_cap(mask->h_dest, cap->h_dest,
+			       sizeof(mask->h_dest), partial_mask))
+		return false;
+
+	if (memcmp(&zeros.h_source, mask->h_source, sizeof(zeros.h_source)) &&
+	    !check_mask_vs_cap(mask->h_source, cap->h_source,
+			       sizeof(mask->h_source), partial_mask))
+		return false;
+
+	if (mask->h_proto &&
+	    !check_mask_vs_cap(&mask->h_proto, &cap->h_proto,
+			       sizeof(__be16), partial_mask))
+		return false;
+
+	return true;
+}
+
+static bool validate_mask(const struct virtnet_ff *ff,
+			  const struct virtio_net_ff_selector *sel)
+{
+	struct virtio_net_ff_selector *sel_cap = get_selector_cap(ff, sel->type);
+
+	if (!sel_cap)
+		return false;
+
+	switch (sel->type) {
+	case VIRTIO_NET_FF_MASK_TYPE_ETH:
+		return validate_eth_mask(ff, sel, sel_cap);
+	}
+
+	return false;
+}
+
+static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
+{
+	int err;
+
+	err = xa_alloc(&ff->classifiers, &c->id, c,
+		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
+		       GFP_KERNEL);
+	if (err)
+		return err;
+
+	err = virtio_device_object_create(ff->vdev,
+					  VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
+					  c->id,
+					  &c->classifier,
+					  c->size);
+	if (err)
+		goto err_xarray;
+
+	return 0;
+
+err_xarray:
+	xa_erase(&ff->classifiers, c->id);
+
+	return err;
+}
+
+static void destroy_classifier(struct virtnet_ff *ff,
+			       u32 classifier_id)
+{
+	struct virtnet_classifier *c;
+
+	c = xa_load(&ff->classifiers, classifier_id);
+	if (c) {
+		virtio_device_object_destroy(ff->vdev,
+					     VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
+					     c->id);
+
+		xa_erase(&ff->classifiers, c->id);
+		kfree(c);
+	}
+}
+
+static void destroy_ethtool_rule(struct virtnet_ff *ff,
+				 struct virtnet_ethtool_rule *eth_rule)
+{
+	ff->ethtool.num_rules--;
+
+	virtio_device_object_destroy(ff->vdev,
+				     VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
+				     eth_rule->flow_spec.location);
+
+	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
+	destroy_classifier(ff, eth_rule->classifier_id);
+	kfree(eth_rule);
+}
+
+static int insert_rule(struct virtnet_ff *ff,
+		       struct virtnet_ethtool_rule *eth_rule,
+		       u32 classifier_id,
+		       const u8 *key,
+		       size_t key_size)
+{
+	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
+	struct virtio_net_resource_obj_ff_rule *ff_rule;
+	int err;
+
+	ff_rule = kzalloc(sizeof(*ff_rule) + key_size, GFP_KERNEL);
+	if (!ff_rule) {
+		err = -ENOMEM;
+		goto err_eth_rule;
+	}
+	/*
+	 * Intentionally leave the priority as 0. All rules have the same
+	 * priority.
+	 */
+	ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
+	ff_rule->classifier_id = cpu_to_le32(classifier_id);
+	ff_rule->key_length = (u8)key_size;
+	ff_rule->action = fs->ring_cookie == RX_CLS_FLOW_DISC ?
+					     VIRTIO_NET_FF_ACTION_DROP :
+					     VIRTIO_NET_FF_ACTION_RX_VQ;
+	ff_rule->vq_index = fs->ring_cookie != RX_CLS_FLOW_DISC ?
+					       cpu_to_le16(fs->ring_cookie) : 0;
+	memcpy(&ff_rule->keys, key, key_size);
+
+	err = virtio_device_object_create(ff->vdev,
+					  VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
+					  fs->location,
+					  ff_rule,
+					  sizeof(*ff_rule) + key_size);
+	if (err)
+		goto err_ff_rule;
+
+	eth_rule->classifier_id = classifier_id;
+	ff->ethtool.num_rules++;
+	kfree(ff_rule);
+
+	return 0;
+
+err_ff_rule:
+	kfree(ff_rule);
+err_eth_rule:
+	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
+	kfree(eth_rule);
+
+	return err;
+}
+
+static u32 flow_type_mask(u32 flow_type)
+{
+	return flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
+}
+
+static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
+{
+	switch (fs->flow_type) {
+	case ETHER_FLOW:
+		return true;
+	}
+
+	return false;
+}
+
+static int validate_flow_input(struct virtnet_ff *ff,
+			       const struct ethtool_rx_flow_spec *fs,
+			       u16 curr_queue_pairs)
+{
+	/* Force users to use RX_CLS_LOC_ANY - don't allow specific locations */
+	if (fs->location != RX_CLS_LOC_ANY)
+		return -EOPNOTSUPP;
+
+	if (fs->ring_cookie != RX_CLS_FLOW_DISC &&
+	    fs->ring_cookie >= curr_queue_pairs)
+		return -EINVAL;
+
+	if (fs->flow_type != flow_type_mask(fs->flow_type))
+		return -EOPNOTSUPP;
+
+	if (!supported_flow_type(fs))
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
+static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
+				 size_t *key_size, size_t *classifier_size,
+				 int *num_hdrs)
+{
+	*num_hdrs = 1;
+	*key_size = sizeof(struct ethhdr);
+	/*
+	 * The classifier size is the size of the classifier header, a selector
+	 * header for each type of header in the match criteria, and each header
+	 * providing the mask for matching against.
+	 */
+	*classifier_size = *key_size +
+			   sizeof(struct virtio_net_resource_obj_ff_classifier) +
+			   sizeof(struct virtio_net_ff_selector) * (*num_hdrs);
+}
+
+static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
+				   u8 *key,
+				   const struct ethtool_rx_flow_spec *fs)
+{
+	struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
+	struct ethhdr *eth_k = (struct ethhdr *)key;
+
+	selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
+	selector->length = sizeof(struct ethhdr);
+
+	memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
+	memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
+}
+
+static int
+validate_classifier_selectors(struct virtnet_ff *ff,
+			      struct virtio_net_resource_obj_ff_classifier *classifier,
+			      int num_hdrs)
+{
+	struct virtio_net_ff_selector *selector = classifier->selectors;
+
+	for (int i = 0; i < num_hdrs; i++) {
+		if (!validate_mask(ff, selector))
+			return -EINVAL;
+
+		selector = (struct virtio_net_ff_selector *)(((u8 *)selector) +
+			    sizeof(*selector) + selector->length);
+	}
+
+	return 0;
+}
+
+static int build_and_insert(struct virtnet_ff *ff,
+			    struct virtnet_ethtool_rule *eth_rule)
+{
+	struct virtio_net_resource_obj_ff_classifier *classifier;
+	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
+	struct virtio_net_ff_selector *selector;
+	struct virtnet_classifier *c;
+	size_t classifier_size;
+	size_t key_size;
+	int num_hdrs;
+	u8 *key;
+	int err;
+
+	calculate_flow_sizes(fs, &key_size, &classifier_size, &num_hdrs);
+
+	key = kzalloc(key_size, GFP_KERNEL);
+	if (!key)
+		return -ENOMEM;
+
+	/*
+	 * virtio_net_resource_obj_ff_classifier is already included in the
+	 * classifier_size.
+	 */
+	c = kzalloc(classifier_size +
+		    sizeof(struct virtnet_classifier) -
+		    sizeof(struct virtio_net_resource_obj_ff_classifier),
+		    GFP_KERNEL);
+	if (!c) {
+		kfree(key);
+		return -ENOMEM;
+	}
+
+	c->size = classifier_size;
+	classifier = &c->classifier;
+	classifier->count = num_hdrs;
+	selector = &classifier->selectors[0];
+
+	setup_eth_hdr_key_mask(selector, key, fs);
+
+	err = validate_classifier_selectors(ff, classifier, num_hdrs);
+	if (err)
+		goto err_key;
+
+	err = setup_classifier(ff, c);
+	if (err)
+		goto err_classifier;
+
+	err = insert_rule(ff, eth_rule, c->id, key, key_size);
+	if (err) {
+		destroy_classifier(ff, c->id);
+		goto err_key;
+	}
+
+	return 0;
+
+err_classifier:
+	kfree(c);
+err_key:
+	kfree(key);
+
+	return err;
+}
+
+int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
+				struct ethtool_rx_flow_spec *fs,
+				u16 curr_queue_pairs)
+{
+	struct virtnet_ethtool_rule *eth_rule;
+	int err;
+
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	err = validate_flow_input(ff, fs, curr_queue_pairs);
+	if (err)
+		return err;
+
+	eth_rule = kzalloc(sizeof(*eth_rule), GFP_KERNEL);
+	if (!eth_rule)
+		return -ENOMEM;
+
+	err = xa_alloc(&ff->ethtool.rules, &fs->location, eth_rule,
+		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->rules_limit) - 1),
+		       GFP_KERNEL);
+	if (err)
+		goto err_rule;
+
+	eth_rule->flow_spec = *fs;
+
+	err = build_and_insert(ff, eth_rule);
+	if (err)
+		goto err_xa;
+
+	return err;
+
+err_xa:
+	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
+
+err_rule:
+	fs->location = RX_CLS_LOC_ANY;
+	kfree(eth_rule);
+
+	return err;
+}
+
+int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
+{
+	struct virtnet_ethtool_rule *eth_rule;
+	int err = 0;
+
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	eth_rule = xa_load(&ff->ethtool.rules, location);
+	if (!eth_rule) {
+		err = -ENOENT;
+		goto out;
+	}
+
+	destroy_ethtool_rule(ff, eth_rule);
+out:
+	return err;
+}
+
 static size_t get_mask_size(u16 type)
 {
 	switch (type) {
@@ -142,6 +554,8 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 	if (err)
 		goto err_ff_action;
 
+	xa_init_flags(&ff->classifiers, XA_FLAGS_ALLOC);
+	xa_init_flags(&ff->ethtool.rules, XA_FLAGS_ALLOC);
 	ff->vdev = vdev;
 	ff->ff_supported = true;
 
@@ -157,9 +571,18 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 
 void virtnet_ff_cleanup(struct virtnet_ff *ff)
 {
+	struct virtnet_ethtool_rule *eth_rule;
+	unsigned long i;
+
 	if (!ff->ff_supported)
 		return;
 
+	xa_for_each(&ff->ethtool.rules, i, eth_rule)
+		destroy_ethtool_rule(ff, eth_rule);
+
+	xa_destroy(&ff->ethtool.rules);
+	xa_destroy(&ff->classifiers);
+
 	virtio_device_object_destroy(ff->vdev,
 				     VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
 				     VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
diff --git a/drivers/net/virtio_net/virtio_net_ff.h b/drivers/net/virtio_net/virtio_net_ff.h
index 4aac0bd08b63..94b575fbd9ed 100644
--- a/drivers/net/virtio_net/virtio_net_ff.h
+++ b/drivers/net/virtio_net/virtio_net_ff.h
@@ -3,20 +3,34 @@
  * Header file for virtio_net flow filters
  */
 #include <linux/virtio_admin.h>
+#include <uapi/linux/ethtool.h>
 
 #ifndef _VIRTIO_NET_FF_H
 #define _VIRTIO_NET_FF_H
 
+struct virtnet_ethtool_ff {
+	struct xarray rules;
+	int    num_rules;
+};
+
 struct virtnet_ff {
 	struct virtio_device *vdev;
 	bool ff_supported;
 	struct virtio_net_ff_cap_data *ff_caps;
 	struct virtio_net_ff_cap_mask_data *ff_mask;
 	struct virtio_net_ff_actions *ff_actions;
+	struct xarray classifiers;
+	int num_classifiers;
+	struct virtnet_ethtool_ff ethtool;
 };
 
 void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev);
 
 void virtnet_ff_cleanup(struct virtnet_ff *ff);
 
+int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
+				struct ethtool_rx_flow_spec *fs,
+				u16 curr_queue_pairs);
+int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location);
+
 #endif /* _VIRTIO_NET_FF_H */
diff --git a/drivers/net/virtio_net/virtio_net_main.c b/drivers/net/virtio_net/virtio_net_main.c
index ebf3e5db0d64..808988cdf265 100644
--- a/drivers/net/virtio_net/virtio_net_main.c
+++ b/drivers/net/virtio_net/virtio_net_main.c
@@ -5619,6 +5619,21 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
 	return vi->curr_queue_pairs;
 }
 
+static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+
+	switch (info->cmd) {
+	case ETHTOOL_SRXCLSRLINS:
+		return virtnet_ethtool_flow_insert(&vi->ff, &info->fs,
+						   vi->curr_queue_pairs);
+	case ETHTOOL_SRXCLSRLDEL:
+		return virtnet_ethtool_flow_remove(&vi->ff, info->fs.location);
+	}
+
+	return -EOPNOTSUPP;
+}
+
 static const struct ethtool_ops virtnet_ethtool_ops = {
 	.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
 		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
@@ -5645,6 +5660,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
 	.get_rxfh_fields = virtnet_get_hashflow,
 	.set_rxfh_fields = virtnet_set_hashflow,
 	.get_rx_ring_count = virtnet_get_rx_ring_count,
+	.set_rxnfc = virtnet_set_rxnfc,
 };
 
 static void virtnet_get_queue_stats_rx(struct net_device *dev, int i,
diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
index 662693e1fefd..f258964322f4 100644
--- a/include/uapi/linux/virtio_net_ff.h
+++ b/include/uapi/linux/virtio_net_ff.h
@@ -13,6 +13,8 @@
 #define VIRTIO_NET_FF_ACTION_CAP 0x802
 
 #define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
+#define VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER 0x0201
+#define VIRTIO_NET_RESOURCE_OBJ_FF_RULE 0x0202
 
 struct virtio_net_ff_cap_data {
 	__le32 groups_limit;
@@ -59,4 +61,22 @@ struct virtio_net_resource_obj_ff_group {
 	__le16 group_priority;
 };
 
+struct virtio_net_resource_obj_ff_classifier {
+	__u8 count;
+	__u8 reserved[7];
+	struct virtio_net_ff_selector selectors[];
+};
+
+struct virtio_net_resource_obj_ff_rule {
+	__le32 group_id;
+	__le32 classifier_id;
+	__u8 rule_priority;
+	__u8 key_length; /* length of key in bytes */
+	__u8 action;
+	__u8 reserved;
+	__le16 vq_index;
+	__u8 reserved1[2];
+	__u8 keys[];
+};
+
 #endif
-- 
2.45.0



* [PATCH net-next v3 07/11] virtio_net: Use existing classifier if possible
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (5 preceding siblings ...)
  2025-09-23 14:19 ` [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-25 20:53   ` Michael S. Tsirkin
  2025-09-23 14:19 ` [PATCH net-next v3 08/11] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

Classifiers can be used by more than one rule. If an identical
classifier already exists, take a reference on it instead of creating a
new one.
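
The lookup-or-create pattern can be sketched in userspace C. This is only
an illustration under stated assumptions, not the driver code: the fixed
table, MAX_CLASSIFIERS, MAX_BODY, classifier_get() and classifier_put()
are hypothetical stand-ins for the xarray, refcount_t and device object
calls used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CLASSIFIERS 8	/* hypothetical table limit */
#define MAX_BODY 64		/* hypothetical maximum classifier size */

struct classifier {
	int in_use;
	unsigned int refcount;
	size_t size;
	unsigned char body[MAX_BODY];
};

struct classifier_table {
	struct classifier slots[MAX_CLASSIFIERS];
};

/*
 * Return the slot id of an identical classifier (taking a reference),
 * or of a newly created one. Returns -1 if the table is full or the
 * body is too large.
 */
static int classifier_get(struct classifier_table *t,
			  const void *body, size_t size)
{
	int free_slot = -1;

	if (size > MAX_BODY)
		return -1;

	for (int i = 0; i < MAX_CLASSIFIERS; i++) {
		struct classifier *c = &t->slots[i];

		if (!c->in_use) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		/* Same size and same bytes: share the existing entry. */
		if (c->size == size && !memcmp(c->body, body, size)) {
			c->refcount++;
			return i;
		}
	}

	if (free_slot < 0)
		return -1;

	t->slots[free_slot].in_use = 1;
	t->slots[free_slot].refcount = 1;
	t->slots[free_slot].size = size;
	memcpy(t->slots[free_slot].body, body, size);

	return free_slot;
}

/* Drop one reference. Returns 1 if the classifier was destroyed. */
static int classifier_put(struct classifier_table *t, int id)
{
	struct classifier *c;

	assert(id >= 0 && id < MAX_CLASSIFIERS);
	c = &t->slots[id];

	if (!c->in_use || --c->refcount)
		return 0;

	memset(c, 0, sizeof(*c));
	return 1;
}
```

Two rules with the same mask share one slot, and the slot is released
only when the last rule using it is removed.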

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
 drivers/net/virtio_net/virtio_net_ff.c | 39 ++++++++++++++++++--------
 1 file changed, 27 insertions(+), 12 deletions(-)

diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
index e3c34bfd1d55..30c5ded57ab5 100644
--- a/drivers/net/virtio_net/virtio_net_ff.c
+++ b/drivers/net/virtio_net/virtio_net_ff.c
@@ -17,6 +17,7 @@ struct virtnet_ethtool_rule {
 /* New fields must be added before the classifier struct */
 struct virtnet_classifier {
 	size_t size;
+	refcount_t refcount;
 	u32 id;
 	struct virtio_net_resource_obj_ff_classifier classifier;
 };
@@ -105,11 +106,24 @@ static bool validate_mask(const struct virtnet_ff *ff,
 	return false;
 }
 
-static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
+static int setup_classifier(struct virtnet_ff *ff,
+			    struct virtnet_classifier **c)
 {
+	struct virtnet_classifier *tmp;
+	unsigned long i;
 	int err;
 
-	err = xa_alloc(&ff->classifiers, &c->id, c,
+	xa_for_each(&ff->classifiers, i, tmp) {
+		if ((*c)->size == tmp->size &&
+		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
+			refcount_inc(&tmp->refcount);
+			kfree(*c);
+			*c = tmp;
+			goto out;
+		}
+	}
+
+	err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
 		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
 		       GFP_KERNEL);
 	if (err)
@@ -117,27 +131,28 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
 
 	err = virtio_device_object_create(ff->vdev,
 					  VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
-					  c->id,
-					  &c->classifier,
-					  c->size);
+					  (*c)->id,
+					  &(*c)->classifier,
+					  (*c)->size);
 	if (err)
 		goto err_xarray;
 
+	refcount_set(&(*c)->refcount, 1);
+out:
 	return 0;
 
 err_xarray:
-	xa_erase(&ff->classifiers, c->id);
+	xa_erase(&ff->classifiers, (*c)->id);
 
 	return err;
 }
 
-static void destroy_classifier(struct virtnet_ff *ff,
-			       u32 classifier_id)
+static void try_destroy_classifier(struct virtnet_ff *ff, u32 classifier_id)
 {
 	struct virtnet_classifier *c;
 
 	c = xa_load(&ff->classifiers, classifier_id);
-	if (c) {
+	if (c && refcount_dec_and_test(&c->refcount)) {
 		virtio_device_object_destroy(ff->vdev,
 					     VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
 					     c->id);
@@ -157,7 +172,7 @@ static void destroy_ethtool_rule(struct virtnet_ff *ff,
 				     eth_rule->flow_spec.location);
 
 	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
-	destroy_classifier(ff, eth_rule->classifier_id);
+	try_destroy_classifier(ff, eth_rule->classifier_id);
 	kfree(eth_rule);
 }
 
@@ -340,13 +355,13 @@ static int build_and_insert(struct virtnet_ff *ff,
 	if (err)
 		goto err_key;
 
-	err = setup_classifier(ff, c);
+	err = setup_classifier(ff, &c);
 	if (err)
 		goto err_classifier;
 
 	err = insert_rule(ff, eth_rule, c->id, key, key_size);
 	if (err) {
-		destroy_classifier(ff, c->id);
+		try_destroy_classifier(ff, c->id);
 		goto err_key;
 	}
 
-- 
2.45.0



* [PATCH net-next v3 08/11] virtio_net: Implement IPv4 ethtool flow rules
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (6 preceding siblings ...)
  2025-09-23 14:19 ` [PATCH net-next v3 07/11] virtio_net: Use existing classifier if possible Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-25 20:53   ` Michael S. Tsirkin
  2025-09-23 14:19 ` [PATCH net-next v3 09/11] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

Add support for ethtool IP_USER_FLOW (ip4) type rules.

Example:
$ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action -1
Added rule with ID 1

The example rule will drop packets with the source IP specified.
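
How the ethtool action value selects between drop and steering can be
sketched as follows. This is a minimal userspace illustration, assuming
the same checks as validate_flow_input() and insert_rule(): FLOW_DISC
stands in for RX_CLS_FLOW_DISC, and the enum values and
ring_cookie_to_target() are hypothetical names, not the driver's or the
specification's.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for RX_CLS_FLOW_DISC, which ethtool sends for "action -1". */
#define FLOW_DISC UINT64_MAX

/* Hypothetical action values, for illustration only. */
enum ff_action {
	FF_ACTION_DROP,
	FF_ACTION_RX_VQ,
};

struct ff_target {
	enum ff_action action;
	uint16_t vq_index;
};

/*
 * Translate an ethtool ring cookie into a drop or steer target.
 * Returns 0 on success, -1 if the cookie names a queue that does not
 * exist.
 */
static int ring_cookie_to_target(uint64_t ring_cookie, uint16_t num_queues,
				 struct ff_target *out)
{
	if (ring_cookie == FLOW_DISC) {
		out->action = FF_ACTION_DROP;
		out->vq_index = 0;
		return 0;
	}

	if (ring_cookie >= num_queues)
		return -1;

	out->action = FF_ACTION_RX_VQ;
	out->vq_index = (uint16_t)ring_cookie;
	return 0;
}
```

With 4 RX rings, "action -1" maps to a drop target, "action 3" steers
to queue 3, and "action 4" is rejected.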

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
 drivers/net/virtio_net/virtio_net_ff.c | 127 +++++++++++++++++++++++--
 1 file changed, 119 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
index 30c5ded57ab5..0374676d1342 100644
--- a/drivers/net/virtio_net/virtio_net_ff.c
+++ b/drivers/net/virtio_net/virtio_net_ff.c
@@ -90,6 +90,34 @@ static bool validate_eth_mask(const struct virtnet_ff *ff,
 	return true;
 }
 
+static bool validate_ip4_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct iphdr *cap, *mask;
+
+	cap = (struct iphdr *)&sel_cap->mask;
+	mask = (struct iphdr *)&sel->mask;
+
+	if (mask->saddr &&
+	    !check_mask_vs_cap(&mask->saddr, &cap->saddr,
+	    sizeof(__be32), partial_mask))
+		return false;
+
+	if (mask->daddr &&
+	    !check_mask_vs_cap(&mask->daddr, &cap->daddr,
+	    sizeof(__be32), partial_mask))
+		return false;
+
+	if (mask->protocol &&
+	    !check_mask_vs_cap(&mask->protocol, &cap->protocol,
+	    sizeof(u8), partial_mask))
+		return false;
+
+	return true;
+}
+
 static bool validate_mask(const struct virtnet_ff *ff,
 			  const struct virtio_net_ff_selector *sel)
 {
@@ -101,11 +129,36 @@ static bool validate_mask(const struct virtnet_ff *ff,
 	switch (sel->type) {
 	case VIRTIO_NET_FF_MASK_TYPE_ETH:
 		return validate_eth_mask(ff, sel, sel_cap);
+
+	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
+		return validate_ip4_mask(ff, sel, sel_cap);
 	}
 
 	return false;
 }
 
+static void parse_ip4(struct iphdr *mask, struct iphdr *key,
+		      const struct ethtool_rx_flow_spec *fs)
+{
+	const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
+	const struct ethtool_usrip4_spec *l3_val  = &fs->h_u.usr_ip4_spec;
+
+	mask->saddr = l3_mask->ip4src;
+	mask->daddr = l3_mask->ip4dst;
+	key->saddr = l3_val->ip4src;
+	key->daddr = l3_val->ip4dst;
+
+	if (l3_mask->proto) {
+		mask->protocol = l3_mask->proto;
+		key->protocol = l3_val->proto;
+	}
+}
+
+static bool has_ipv4(u32 flow_type)
+{
+	return flow_type == IP_USER_FLOW;
+}
+
 static int setup_classifier(struct virtnet_ff *ff,
 			    struct virtnet_classifier **c)
 {
@@ -237,6 +290,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
 {
 	switch (fs->flow_type) {
 	case ETHER_FLOW:
+	case IP_USER_FLOW:
 		return true;
 	}
 
@@ -260,16 +314,27 @@ static int validate_flow_input(struct virtnet_ff *ff,
 
 	if (!supported_flow_type(fs))
 		return -EOPNOTSUPP;
-
 	return 0;
 }
 
 static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
-				 size_t *key_size, size_t *classifier_size,
-				 int *num_hdrs)
+				size_t *key_size, size_t *classifier_size,
+				int *num_hdrs)
 {
+	size_t size = sizeof(struct ethhdr);
+
 	*num_hdrs = 1;
 	*key_size = sizeof(struct ethhdr);
+
+	if (fs->flow_type == ETHER_FLOW)
+		goto done;
+
+	(*num_hdrs)++;
+	if (has_ipv4(fs->flow_type))
+		size += sizeof(struct iphdr);
+
+done:
+	*key_size = size;
 	/*
 	 * The classifier size is the size of the classifier header, a selector
 	 * header for each type of header in the match criteria, and each header
@@ -281,8 +346,9 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
 }
 
 static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
-				   u8 *key,
-				   const struct ethtool_rx_flow_spec *fs)
+				  u8 *key,
+				  const struct ethtool_rx_flow_spec *fs,
+				  int num_hdrs)
 {
 	struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
 	struct ethhdr *eth_k = (struct ethhdr *)key;
@@ -290,8 +356,33 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
 	selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
 	selector->length = sizeof(struct ethhdr);
 
-	memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
-	memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
+	if (num_hdrs > 1) {
+		eth_m->h_proto = cpu_to_be16(0xffff);
+		eth_k->h_proto = cpu_to_be16(ETH_P_IP);
+	} else {
+		memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
+		memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
+	}
+}
+
+static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
+			     u8 *key,
+			     const struct ethtool_rx_flow_spec *fs)
+{
+	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
+	struct iphdr *v4_k = (struct iphdr *)key;
+
+	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
+	selector->length = sizeof(struct iphdr);
+
+	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
+	    fs->h_u.usr_ip4_spec.tos ||
+	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
+		return -EOPNOTSUPP;
+
+	parse_ip4(v4_m, v4_k, fs);
+
+	return 0;
 }
 
 static int
@@ -312,6 +403,17 @@ validate_classifier_selectors(struct virtnet_ff *ff,
 	return 0;
 }
 
+static
+struct virtio_net_ff_selector *next_selector(struct virtio_net_ff_selector *sel)
+{
+	void *nextsel;
+
+	nextsel = (u8 *)sel + sizeof(struct virtio_net_ff_selector) +
+		  sel->length;
+
+	return nextsel;
+}
+
 static int build_and_insert(struct virtnet_ff *ff,
 			    struct virtnet_ethtool_rule *eth_rule)
 {
@@ -349,8 +451,17 @@ static int build_and_insert(struct virtnet_ff *ff,
 	classifier->count = num_hdrs;
 	selector = &classifier->selectors[0];
 
-	setup_eth_hdr_key_mask(selector, key, fs);
+	setup_eth_hdr_key_mask(selector, key, fs, num_hdrs);
+	if (num_hdrs == 1)
+		goto validate;
+
+	selector = next_selector(selector);
+
+	err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
+	if (err)
+		goto err_classifier;
 
+validate:
 	err = validate_classifier_selectors(ff, classifier, num_hdrs);
 	if (err)
 		goto err_key;
-- 
2.45.0



* [PATCH net-next v3 09/11] virtio_net: Add support for IPv6 ethtool steering
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (7 preceding siblings ...)
  2025-09-23 14:19 ` [PATCH net-next v3 08/11] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-25 20:47   ` Michael S. Tsirkin
  2025-09-23 14:19 ` [PATCH net-next v3 10/11] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

Implement support for IPV6_USER_FLOW type rules.

Example:
$ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
Added rule with ID 0

The example rule will forward packets with the specified source and
destination IP addresses to RX ring 3.
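
An IPv6 rule carries two selectors back to back in the classifier, an
Ethernet selector followed by an IPv6 selector, each made of a selector
header and a mask of 'length' bytes. The walk from one selector to the
next can be sketched as below. The 8-byte struct selector_hdr is a
hypothetical stand-in for struct virtio_net_ff_selector, and the layout
shown is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the selector header; the mask follows it. */
struct selector_hdr {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;		/* length of the mask in bytes */
	uint8_t reserved1[3];
};

/*
 * Given the byte offset of one selector inside a classifier buffer,
 * return the offset of the selector that follows it: skip the header,
 * then skip the mask.
 */
static size_t next_selector_offset(const uint8_t *buf, size_t off)
{
	const struct selector_hdr *sel;

	sel = (const struct selector_hdr *)(buf + off);

	return off + sizeof(*sel) + sel->length;
}
```

For a 14-byte Ethernet mask followed by a 40-byte IPv6 mask, the second
selector starts at offset 22 and the selectors end at offset 70 under
this assumed layout.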

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
 drivers/net/virtio_net/virtio_net_ff.c | 89 +++++++++++++++++++++++---
 1 file changed, 81 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
index 0374676d1342..ce59fb36dae9 100644
--- a/drivers/net/virtio_net/virtio_net_ff.c
+++ b/drivers/net/virtio_net/virtio_net_ff.c
@@ -118,6 +118,34 @@ static bool validate_ip4_mask(const struct virtnet_ff *ff,
 	return true;
 }
 
+static bool validate_ip6_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct ipv6hdr *cap, *mask;
+
+	cap = (struct ipv6hdr *)&sel_cap->mask;
+	mask = (struct ipv6hdr *)&sel->mask;
+
+	if (!ipv6_addr_any(&mask->saddr) &&
+	    !check_mask_vs_cap(&mask->saddr, &cap->saddr,
+			       sizeof(cap->saddr), partial_mask))
+		return false;
+
+	if (!ipv6_addr_any(&mask->daddr) &&
+	    !check_mask_vs_cap(&mask->daddr, &cap->daddr,
+			       sizeof(cap->daddr), partial_mask))
+		return false;
+
+	if (mask->nexthdr &&
+	    !check_mask_vs_cap(&mask->nexthdr, &cap->nexthdr,
+	    sizeof(cap->nexthdr), partial_mask))
+		return false;
+
+	return true;
+}
+
 static bool validate_mask(const struct virtnet_ff *ff,
 			  const struct virtio_net_ff_selector *sel)
 {
@@ -132,6 +160,9 @@ static bool validate_mask(const struct virtnet_ff *ff,
 
 	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
 		return validate_ip4_mask(ff, sel, sel_cap);
+
+	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
+		return validate_ip6_mask(ff, sel, sel_cap);
 	}
 
 	return false;
@@ -154,11 +185,38 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
 	}
 }
 
+static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
+		      const struct ethtool_rx_flow_spec *fs)
+{
+	const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
+	const struct ethtool_usrip6_spec *l3_val  = &fs->h_u.usr_ip6_spec;
+
+	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
+		memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
+		memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
+	}
+
+	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
+		memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
+		memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
+	}
+
+	if (l3_mask->l4_proto) {
+		mask->nexthdr = l3_mask->l4_proto;
+		key->nexthdr = l3_val->l4_proto;
+	}
+}
+
 static bool has_ipv4(u32 flow_type)
 {
 	return flow_type == IP_USER_FLOW;
 }
 
+static bool has_ipv6(u32 flow_type)
+{
+	return flow_type == IPV6_USER_FLOW;
+}
+
 static int setup_classifier(struct virtnet_ff *ff,
 			    struct virtnet_classifier **c)
 {
@@ -291,6 +349,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
 	switch (fs->flow_type) {
 	case ETHER_FLOW:
 	case IP_USER_FLOW:
+	case IPV6_USER_FLOW:
 		return true;
 	}
 
@@ -332,7 +391,8 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
 	(*num_hdrs)++;
 	if (has_ipv4(fs->flow_type))
 		size += sizeof(struct iphdr);
-
+	else if (has_ipv6(fs->flow_type))
+		size += sizeof(struct ipv6hdr);
 done:
 	*key_size = size;
 	/*
@@ -369,18 +429,31 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
 			     u8 *key,
 			     const struct ethtool_rx_flow_spec *fs)
 {
+	struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
 	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
+	struct ipv6hdr *v6_k = (struct ipv6hdr *)key;
 	struct iphdr *v4_k = (struct iphdr *)key;
 
-	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
-	selector->length = sizeof(struct iphdr);
+	if (has_ipv6(fs->flow_type)) {
+		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
+		selector->length = sizeof(struct ipv6hdr);
 
-	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
-	    fs->h_u.usr_ip4_spec.tos ||
-	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
-		return -EOPNOTSUPP;
+		if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
+		    fs->h_u.usr_ip6_spec.tclass)
+			return -EOPNOTSUPP;
 
-	parse_ip4(v4_m, v4_k, fs);
+		parse_ip6(v6_m, v6_k, fs);
+	} else {
+		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
+		selector->length = sizeof(struct iphdr);
+
+		if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
+		    fs->h_u.usr_ip4_spec.tos ||
+		    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
+			return -EOPNOTSUPP;
+
+		parse_ip4(v4_m, v4_k, fs);
+	}
 
 	return 0;
 }
-- 
2.45.0



* [PATCH net-next v3 10/11] virtio_net: Add support for TCP and UDP ethtool rules
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (8 preceding siblings ...)
  2025-09-23 14:19 ` [PATCH net-next v3 09/11] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-23 14:19 ` [PATCH net-next v3 11/11] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
  2025-09-25 21:19 ` [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Michael S. Tsirkin
  11 siblings, 0 replies; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

Implement TCP and UDP V4/V6 ethtool flow types.

Examples:
$ ethtool -U ens9 flow-type udp4 dst-ip 192.168.5.2 dst-port\
4321 action 20
Added rule with ID 4

This example directs IPv4 UDP traffic with the specified address and
port to queue 20.

$ ethtool -U ens9 flow-type tcp6 src-ip 2001:db8::1 src-port 1234 dst-ip\
2001:db8::2 dst-port 4321 action 12
Added rule with ID 5

This example directs IPv6 TCP traffic with the specified addresses and
ports to queue 12.
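
The key for these rules is the concatenation of the Ethernet, IP and
TCP/UDP headers, and the classifier adds one selector header per layer
on top of that. The size arithmetic can be sketched as below. The
header lengths are the fixed sizes of the corresponding kernel structs;
the enum names, flow_key_size() and classifier_size() are hypothetical,
and the selector header length is passed in rather than assumed.

```c
#include <assert.h>
#include <stddef.h>

enum l3_kind { L3_NONE, L3_IPV4, L3_IPV6 };
enum l4_kind { L4_NONE, L4_TCP, L4_UDP };

#define ETH_HDR_LEN		14	/* sizeof(struct ethhdr) */
#define IPV4_HDR_LEN		20	/* sizeof(struct iphdr) */
#define IPV6_HDR_LEN		40	/* sizeof(struct ipv6hdr) */
#define TCP_HDR_LEN		20	/* sizeof(struct tcphdr) */
#define UDP_HDR_LEN		8	/* sizeof(struct udphdr) */
#define CLASSIFIER_HDR_LEN	8	/* count byte plus 7 reserved bytes */

/* Key size in bytes; *num_hdrs receives the number of selectors. */
static size_t flow_key_size(enum l3_kind l3, enum l4_kind l4, int *num_hdrs)
{
	size_t size = ETH_HDR_LEN;

	/* An L4 match without an L3 header is not a valid combination. */
	assert(l3 != L3_NONE || l4 == L4_NONE);

	*num_hdrs = 1;
	if (l3 == L3_NONE)
		return size;

	(*num_hdrs)++;
	size += (l3 == L3_IPV4) ? IPV4_HDR_LEN : IPV6_HDR_LEN;

	if (l4 != L4_NONE) {
		(*num_hdrs)++;
		size += (l4 == L4_TCP) ? TCP_HDR_LEN : UDP_HDR_LEN;
	}

	return size;
}

/* Classifier header, one selector header per layer, plus the masks. */
static size_t classifier_size(size_t key_size, int num_hdrs,
			      size_t selector_hdr_len)
{
	return key_size + CLASSIFIER_HDR_LEN + selector_hdr_len * num_hdrs;
}
```

Under these sizes the udp4 example needs a 42-byte key with three
selectors, and the tcp6 example needs a 74-byte key.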

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
 drivers/net/virtio_net/virtio_net_ff.c | 207 +++++++++++++++++++++++--
 1 file changed, 198 insertions(+), 9 deletions(-)

diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
index ce59fb36dae9..d4a34958cc42 100644
--- a/drivers/net/virtio_net/virtio_net_ff.c
+++ b/drivers/net/virtio_net/virtio_net_ff.c
@@ -146,6 +146,52 @@ static bool validate_ip6_mask(const struct virtnet_ff *ff,
 	return true;
 }
 
+static bool validate_tcp_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct tcphdr *cap, *mask;
+
+	cap = (struct tcphdr *)&sel_cap->mask;
+	mask = (struct tcphdr *)&sel->mask;
+
+	if (mask->source &&
+	    !check_mask_vs_cap(&mask->source, &cap->source,
+	    sizeof(cap->source), partial_mask))
+		return false;
+
+	if (mask->dest &&
+	    !check_mask_vs_cap(&mask->dest, &cap->dest,
+	    sizeof(cap->dest), partial_mask))
+		return false;
+
+	return true;
+}
+
+static bool validate_udp_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct udphdr *cap, *mask;
+
+	cap = (struct udphdr *)&sel_cap->mask;
+	mask = (struct udphdr *)&sel->mask;
+
+	if (mask->source &&
+	    !check_mask_vs_cap(&mask->source, &cap->source,
+	    sizeof(cap->source), partial_mask))
+		return false;
+
+	if (mask->dest &&
+	    !check_mask_vs_cap(&mask->dest, &cap->dest,
+	    sizeof(cap->dest), partial_mask))
+		return false;
+
+	return true;
+}
+
 static bool validate_mask(const struct virtnet_ff *ff,
 			  const struct virtio_net_ff_selector *sel)
 {
@@ -163,11 +209,45 @@ static bool validate_mask(const struct virtnet_ff *ff,
 
 	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
 		return validate_ip6_mask(ff, sel, sel_cap);
+
+	case VIRTIO_NET_FF_MASK_TYPE_TCP:
+		return validate_tcp_mask(ff, sel, sel_cap);
+
+	case VIRTIO_NET_FF_MASK_TYPE_UDP:
+		return validate_udp_mask(ff, sel, sel_cap);
 	}
 
 	return false;
 }
 
+static void set_tcp(struct tcphdr *mask, struct tcphdr *key,
+		    __be16 psrc_m, __be16 psrc_k,
+		    __be16 pdst_m, __be16 pdst_k)
+{
+	if (psrc_m) {
+		mask->source = psrc_m;
+		key->source = psrc_k;
+	}
+	if (pdst_m) {
+		mask->dest = pdst_m;
+		key->dest = pdst_k;
+	}
+}
+
+static void set_udp(struct udphdr *mask, struct udphdr *key,
+		    __be16 psrc_m, __be16 psrc_k,
+		    __be16 pdst_m, __be16 pdst_k)
+{
+	if (psrc_m) {
+		mask->source = psrc_m;
+		key->source = psrc_k;
+	}
+	if (pdst_m) {
+		mask->dest = pdst_m;
+		key->dest = pdst_k;
+	}
+}
+
 static void parse_ip4(struct iphdr *mask, struct iphdr *key,
 		      const struct ethtool_rx_flow_spec *fs)
 {
@@ -209,12 +289,26 @@ static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
 
 static bool has_ipv4(u32 flow_type)
 {
-	return flow_type == IP_USER_FLOW;
+	return flow_type == TCP_V4_FLOW ||
+	       flow_type == UDP_V4_FLOW ||
+	       flow_type == IP_USER_FLOW;
 }
 
 static bool has_ipv6(u32 flow_type)
 {
-	return flow_type == IPV6_USER_FLOW;
+	return flow_type == TCP_V6_FLOW ||
+	       flow_type == UDP_V6_FLOW ||
+	       flow_type == IPV6_USER_FLOW;
+}
+
+static bool has_tcp(u32 flow_type)
+{
+	return flow_type == TCP_V4_FLOW || flow_type == TCP_V6_FLOW;
+}
+
+static bool has_udp(u32 flow_type)
+{
+	return flow_type == UDP_V4_FLOW || flow_type == UDP_V6_FLOW;
 }
 
 static int setup_classifier(struct virtnet_ff *ff,
@@ -350,6 +444,10 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
 	case ETHER_FLOW:
 	case IP_USER_FLOW:
 	case IPV6_USER_FLOW:
+	case TCP_V4_FLOW:
+	case TCP_V6_FLOW:
+	case UDP_V4_FLOW:
+	case UDP_V6_FLOW:
 		return true;
 	}
 
@@ -393,6 +491,12 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
 		size += sizeof(struct iphdr);
 	else if (has_ipv6(fs->flow_type))
 		size += sizeof(struct ipv6hdr);
+
+	if (has_tcp(fs->flow_type) || has_udp(fs->flow_type)) {
+		(*num_hdrs)++;
+		size += has_tcp(fs->flow_type) ? sizeof(struct tcphdr) :
+						 sizeof(struct udphdr);
+	}
 done:
 	*key_size = size;
 	/*
@@ -427,7 +531,8 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
 
 static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
 			     u8 *key,
-			     const struct ethtool_rx_flow_spec *fs)
+			     const struct ethtool_rx_flow_spec *fs,
+			     int num_hdrs)
 {
 	struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
 	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
@@ -438,21 +543,93 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
 		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
 		selector->length = sizeof(struct ipv6hdr);
 
-		if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
-		    fs->h_u.usr_ip6_spec.tclass)
+		if (num_hdrs == 2 && (fs->h_u.usr_ip6_spec.l4_4_bytes ||
+				      fs->h_u.usr_ip6_spec.tclass))
 			return -EOPNOTSUPP;
 
 		parse_ip6(v6_m, v6_k, fs);
+
+		if (num_hdrs > 2) {
+			v6_m->nexthdr = 0xff;
+			if (has_tcp(fs->flow_type))
+				v6_k->nexthdr = IPPROTO_TCP;
+			else
+				v6_k->nexthdr = IPPROTO_UDP;
+		}
 	} else {
 		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
 		selector->length = sizeof(struct iphdr);
 
-		if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
-		    fs->h_u.usr_ip4_spec.tos ||
-		    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
+		if (num_hdrs == 2 &&
+		    (fs->h_u.usr_ip4_spec.l4_4_bytes ||
+		     fs->h_u.usr_ip4_spec.tos ||
+		     fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4))
 			return -EOPNOTSUPP;
 
 		parse_ip4(v4_m, v4_k, fs);
+
+		if (num_hdrs > 2) {
+			v4_m->protocol = 0xff;
+			if (has_tcp(fs->flow_type))
+				v4_k->protocol = IPPROTO_TCP;
+			else
+				v4_k->protocol = IPPROTO_UDP;
+		}
+	}
+
+	return 0;
+}
+
+static int setup_transport_key_mask(struct virtio_net_ff_selector *selector,
+				    u8 *key,
+				    struct ethtool_rx_flow_spec *fs)
+{
+	struct tcphdr *tcp_m = (struct tcphdr *)&selector->mask;
+	struct udphdr *udp_m = (struct udphdr *)&selector->mask;
+	const struct ethtool_tcpip6_spec *v6_l4_mask;
+	const struct ethtool_tcpip4_spec *v4_l4_mask;
+	const struct ethtool_tcpip6_spec *v6_l4_key;
+	const struct ethtool_tcpip4_spec *v4_l4_key;
+	struct tcphdr *tcp_k = (struct tcphdr *)key;
+	struct udphdr *udp_k = (struct udphdr *)key;
+
+	if (has_tcp(fs->flow_type)) {
+		selector->type = VIRTIO_NET_FF_MASK_TYPE_TCP;
+		selector->length = sizeof(struct tcphdr);
+
+		if (has_ipv6(fs->flow_type)) {
+			v6_l4_mask = &fs->m_u.tcp_ip6_spec;
+			v6_l4_key = &fs->h_u.tcp_ip6_spec;
+
+			set_tcp(tcp_m, tcp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
+				v6_l4_mask->pdst, v6_l4_key->pdst);
+		} else {
+			v4_l4_mask = &fs->m_u.tcp_ip4_spec;
+			v4_l4_key = &fs->h_u.tcp_ip4_spec;
+
+			set_tcp(tcp_m, tcp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
+				v4_l4_mask->pdst, v4_l4_key->pdst);
+		}
+
+	} else if (has_udp(fs->flow_type)) {
+		selector->type = VIRTIO_NET_FF_MASK_TYPE_UDP;
+		selector->length = sizeof(struct udphdr);
+
+		if (has_ipv6(fs->flow_type)) {
+			v6_l4_mask = &fs->m_u.udp_ip6_spec;
+			v6_l4_key = &fs->h_u.udp_ip6_spec;
+
+			set_udp(udp_m, udp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
+				v6_l4_mask->pdst, v6_l4_key->pdst);
+		} else {
+			v4_l4_mask = &fs->m_u.udp_ip4_spec;
+			v4_l4_key = &fs->h_u.udp_ip4_spec;
+
+			set_udp(udp_m, udp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
+				v4_l4_mask->pdst, v4_l4_key->pdst);
+		}
+	} else {
+		return -EOPNOTSUPP;
 	}
 
 	return 0;
@@ -495,6 +672,7 @@ static int build_and_insert(struct virtnet_ff *ff,
 	struct virtio_net_ff_selector *selector;
 	struct virtnet_classifier *c;
 	size_t classifier_size;
+	size_t key_offset;
 	size_t key_size;
 	int num_hdrs;
 	u8 *key;
@@ -528,9 +706,20 @@ static int build_and_insert(struct virtnet_ff *ff,
 	if (num_hdrs == 1)
 		goto validate;
 
+	key_offset = selector->length;
+	selector = next_selector(selector);
+
+	err = setup_ip_key_mask(selector, key + key_offset, fs, num_hdrs);
+	if (err)
+		goto err_classifier;
+
+	if (num_hdrs == 2)
+		goto validate;
+
+	key_offset += selector->length;
 	selector = next_selector(selector);
 
-	err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
+	err = setup_transport_key_mask(selector, key + key_offset, fs);
 	if (err)
 		goto err_classifier;
 
-- 
2.45.0
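
The key_offset bookkeeping that build_and_insert() gains in this patch can be illustrated outside the kernel. The following is a minimal user-space sketch, assuming the key is simply the concatenation of the matched headers in order and that each selector contributes a fixed-length header. The enum, hdr_len() and key_layout() are hypothetical names introduced for illustration only. The sizes are the fixed header lengths (14-byte Ethernet, 20-byte IPv4 without options, 40-byte IPv6, 20-byte TCP without options, 8-byte UDP).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the header types a rule may match on. */
enum hdr_type { HDR_ETH, HDR_IPV4, HDR_IPV6, HDR_TCP, HDR_UDP };

/* Fixed header lengths in bytes (no IP or TCP options). */
static size_t hdr_len(enum hdr_type t)
{
	switch (t) {
	case HDR_ETH:  return 14;
	case HDR_IPV4: return 20;
	case HDR_IPV6: return 40;
	case HDR_TCP:  return 20;
	case HDR_UDP:  return 8;
	}
	return 0;
}

/*
 * Store in offsets[i] the byte offset of header i inside the packed key
 * and return the total key size. Each offset is the sum of the lengths
 * of the headers before it, mirroring "key_offset += selector->length".
 */
static size_t key_layout(const enum hdr_type *hdrs, size_t num_hdrs,
			 size_t *offsets)
{
	size_t off = 0;
	size_t i;

	assert(hdrs && offsets);
	for (i = 0; i < num_hdrs; i++) {
		offsets[i] = off;
		off += hdr_len(hdrs[i]);
	}
	return off;
}
```

For a udp6 rule (Ethernet, IPv6, UDP) this places the IPv6 header at offset 14 and the UDP header at offset 54, for a 62-byte key. Accumulating the offset rather than hard-coding sizeof(struct ethhdr) is what lets the third header land correctly for both IPv4 and IPv6.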



* [PATCH net-next v3 11/11] virtio_net: Add get ethtool flow rules ops
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (9 preceding siblings ...)
  2025-09-23 14:19 ` [PATCH net-next v3 10/11] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
@ 2025-09-23 14:19 ` Daniel Jurgens
  2025-09-25 20:44   ` Michael S. Tsirkin
  2025-09-25 21:19 ` [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Michael S. Tsirkin
  11 siblings, 1 reply; 58+ messages in thread
From: Daniel Jurgens @ 2025-09-23 14:19 UTC (permalink / raw)
  To: netdev, mst, jasowang, alex.williamson, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens

- Get total number of rules. There's no user interface for this. It is
  used to allocate an appropriately sized buffer for getting all the
  rules.

- Get specific rule:
$ ethtool -u ens9 rule 0
	Filter: 0
		Rule Type: UDP over IPv4
		Src IP addr: 0.0.0.0 mask: 255.255.255.255
		Dest IP addr: 192.168.5.2 mask: 0.0.0.0
		TOS: 0x0 mask: 0xff
		Src port: 0 mask: 0xffff
		Dest port: 4321 mask: 0x0
		Action: Direct to queue 16

- Get all rules:
$ ethtool -u ens9
31 RX rings available
Total 2 rules

Filter: 0
        Rule Type: UDP over IPv4
        Src IP addr: 0.0.0.0 mask: 255.255.255.255
        Dest IP addr: 192.168.5.2 mask: 0.0.0.0
...

Filter: 1
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 08:11:22:33:44:54 mask: 00:00:00:00:00:00

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
 drivers/net/virtio_net/virtio_net_ff.c   | 48 ++++++++++++++++++++++++
 drivers/net/virtio_net/virtio_net_ff.h   |  6 +++
 drivers/net/virtio_net/virtio_net_main.c | 23 ++++++++++++
 3 files changed, 77 insertions(+)

diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
index d4a34958cc42..5488300a4fc3 100644
--- a/drivers/net/virtio_net/virtio_net_ff.c
+++ b/drivers/net/virtio_net/virtio_net_ff.c
@@ -809,6 +809,54 @@ int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
 	return err;
 }
 
+int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
+				   struct ethtool_rxnfc *info)
+{
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	info->rule_cnt = ff->ethtool.num_rules;
+	info->data = le32_to_cpu(ff->ff_caps->rules_limit) | RX_CLS_LOC_SPECIAL;
+
+	return 0;
+}
+
+int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
+			     struct ethtool_rxnfc *info)
+{
+	struct virtnet_ethtool_rule *eth_rule;
+
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	eth_rule = xa_load(&ff->ethtool.rules, info->fs.location);
+	if (!eth_rule)
+		return -ENOENT;
+
+	info->fs = eth_rule->flow_spec;
+
+	return 0;
+}
+
+int
+virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
+			      struct ethtool_rxnfc *info, u32 *rule_locs)
+{
+	struct virtnet_ethtool_rule *eth_rule;
+	unsigned long i = 0;
+	int idx = 0;
+
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	xa_for_each(&ff->ethtool.rules, i, eth_rule)
+		rule_locs[idx++] = i;
+
+	info->data = le32_to_cpu(ff->ff_caps->rules_limit);
+
+	return 0;
+}
+
 static size_t get_mask_size(u16 type)
 {
 	switch (type) {
diff --git a/drivers/net/virtio_net/virtio_net_ff.h b/drivers/net/virtio_net/virtio_net_ff.h
index 94b575fbd9ed..4bb41e64cc59 100644
--- a/drivers/net/virtio_net/virtio_net_ff.h
+++ b/drivers/net/virtio_net/virtio_net_ff.h
@@ -28,6 +28,12 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev);
 
 void virtnet_ff_cleanup(struct virtnet_ff *ff);
 
+int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
+				   struct ethtool_rxnfc *info);
+int virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
+				  struct ethtool_rxnfc *info, u32 *rule_locs);
+int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
+			     struct ethtool_rxnfc *info);
 int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
 				struct ethtool_rx_flow_spec *fs,
 				u16 curr_queue_pairs);
diff --git a/drivers/net/virtio_net/virtio_net_main.c b/drivers/net/virtio_net/virtio_net_main.c
index 808988cdf265..e8336925c912 100644
--- a/drivers/net/virtio_net/virtio_net_main.c
+++ b/drivers/net/virtio_net/virtio_net_main.c
@@ -5619,6 +5619,28 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
 	return vi->curr_queue_pairs;
 }
 
+static int virtnet_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info, u32 *rule_locs)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	int rc = 0;
+
+	switch (info->cmd) {
+	case ETHTOOL_GRXCLSRLCNT:
+		rc = virtnet_ethtool_get_flow_count(&vi->ff, info);
+		break;
+	case ETHTOOL_GRXCLSRULE:
+		rc = virtnet_ethtool_get_flow(&vi->ff, info);
+		break;
+	case ETHTOOL_GRXCLSRLALL:
+		rc = virtnet_ethtool_get_all_flows(&vi->ff, info, rule_locs);
+		break;
+	default:
+		rc = -EOPNOTSUPP;
+	}
+
+	return rc;
+}
+
 static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
@@ -5660,6 +5682,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
 	.get_rxfh_fields = virtnet_get_hashflow,
 	.set_rxfh_fields = virtnet_set_hashflow,
 	.get_rx_ring_count = virtnet_get_rx_ring_count,
+	.get_rxnfc = virtnet_get_rxnfc,
 	.set_rxnfc = virtnet_set_rxnfc,
 };
 
-- 
2.45.0
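
The commit message describes a two-step query: the rule count is fetched first so that a buffer of the right size can be allocated, then all rule locations are fetched into it. A minimal user-space sketch of that pattern follows. struct rule_table and the rule_*() helpers are hypothetical stand-ins for the driver's xarray and are not taken from the patch. The capacity check in rule_get_all() is an assumption about how such an interface is commonly made safe when rules change between the two steps, not a description of what this patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_RULES 8 /* hypothetical table capacity */

struct rule_table {
	int used[MAX_RULES]; /* non-zero when a rule occupies that location */
};

/* Step 1: report how many rules exist. */
static uint32_t rule_count(const struct rule_table *t)
{
	uint32_t n = 0;

	for (uint32_t i = 0; i < MAX_RULES; i++)
		if (t->used[i])
			n++;
	return n;
}

/*
 * Step 2: copy rule locations into locs, which holds at most cap entries.
 * Returns the number of entries written, or -EMSGSIZE if locs is too small.
 */
static int rule_get_all(const struct rule_table *t, uint32_t *locs,
			uint32_t cap)
{
	uint32_t idx = 0;

	assert(t && locs);
	for (uint32_t i = 0; i < MAX_RULES; i++) {
		if (!t->used[i])
			continue;
		if (idx == cap)
			return -EMSGSIZE;
		locs[idx++] = i;
	}
	return (int)idx;
}

/* Count, allocate, fetch. On success the caller frees *out. */
static int rule_dump(const struct rule_table *t, uint32_t **out)
{
	uint32_t n = rule_count(t);
	uint32_t *locs = calloc(n ? n : 1, sizeof(*locs));
	int ret;

	if (!locs)
		return -ENOMEM;
	ret = rule_get_all(t, locs, n);
	if (ret < 0) {
		free(locs);
		return ret;
	}
	*out = locs;
	return ret;
}
```

Returning the number of entries written lets the caller distinguish a short table from a full one, and the -EMSGSIZE path keeps a too-small buffer from being overrun.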



* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-23 14:19 ` [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations Daniel Jurgens
@ 2025-09-24  1:16   ` Jason Wang
  2025-09-24  6:22     ` Michael S. Tsirkin
  2025-09-24  6:16   ` Michael S. Tsirkin
  1 sibling, 1 reply; 58+ messages in thread
From: Jason Wang @ 2025-09-24  1:16 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, mst, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet, Yishai Hadas

On Tue, Sep 23, 2025 at 10:20 PM Daniel Jurgens <danielj@nvidia.com> wrote:
>
> Currently querying and setting capabilities is restricted to a single
> capability and contained within the virtio PCI driver. However, each
> device type has generic and device specific capabilities, that may be
> queried and set. In subsequent patches virtio_net will query and set
> flow filter capabilities.
>
> Move the admin related definitions to a new header file. It needs to be
> abstracted away from the PCI specifics to be used by upper layer
> drivers.
>
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
> ---

[...]

>
>  size_t virtio_max_dma_size(const struct virtio_device *vdev);
>
> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> new file mode 100644
> index 000000000000..bbf543d20be4
> --- /dev/null
> +++ b/include/linux/virtio_admin.h
> @@ -0,0 +1,68 @@
> +/* SPDX-License-Identifier: GPL-2.0-only
> + *
> + * Header file for virtio admin operations
> + */
> +#include <uapi/linux/virtio_pci.h>
> +
> +#ifndef _LINUX_VIRTIO_ADMIN_H
> +#define _LINUX_VIRTIO_ADMIN_H
> +
> +struct virtio_device;
> +
> +/**
> + * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
> + * @cap_list: Pointer to capability list structure containing supported_caps array
> + * @cap: Capability ID to check
> + *
> + * The cap_list contains a supported_caps array of little-endian 64-bit integers
> + * where each bit represents a capability. Bit 0 of the first element represents
> + * capability ID 0, bit 1 represents capability ID 1, and so on.
> + *
> + * Return: 1 if capability is supported, 0 otherwise
> + */
> +#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
> +       (!!(1 & (le64_to_cpu(cap_list->supported_caps[cap / 64]) >> cap % 64)))
> +
> +/**
> + * struct virtio_admin_ops - Operations for virtio admin functionality
> + *
> + * This structure contains function pointers for performing administrative
> + * operations on virtio devices. All data and caps pointers must be allocated
> + * on the heap by the caller.
> + */
> +struct virtio_admin_ops {
> +       /**
> +        * @cap_id_list_query: Query the list of supported capability IDs
> +        * @vdev: The virtio device to query
> +        * @data: Pointer to result structure (must be heap allocated)
> +        * Return: 0 on success, negative error code on failure
> +        */
> +       int (*cap_id_list_query)(struct virtio_device *vdev,
> +                                struct virtio_admin_cmd_query_cap_id_result *data);
> +       /**
> +        * @cap_get: Get capability data for a specific capability ID
> +        * @vdev: The virtio device
> +        * @id: Capability ID to retrieve
> +        * @caps: Pointer to capability data structure (must be heap allocated)
> +        * @cap_size: Size of the capability data structure
> +        * Return: 0 on success, negative error code on failure
> +        */
> +       int (*cap_get)(struct virtio_device *vdev,
> +                      u16 id,
> +                      void *caps,
> +                      size_t cap_size);
> +       /**
> +        * @cap_set: Set capability data for a specific capability ID
> +        * @vdev: The virtio device
> +        * @id: Capability ID to set
> +        * @caps: Pointer to capability data structure (must be heap allocated)
> +        * @cap_size: Size of the capability data structure
> +        * Return: 0 on success, negative error code on failure
> +        */
> +       int (*cap_set)(struct virtio_device *vdev,
> +                      u16 id,
> +                      const void *caps,
> +                      size_t cap_size);
> +};

Looking at this, it's nothing admin virtqueue specific, I wonder why
it is not part of virtio_config_ops.

Thanks
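
For readers following the thread, the bitmap layout documented for VIRTIO_CAP_IN_LIST in the quoted header can be sketched in portable user-space C. This is only an illustration under stated assumptions: load_le64() and cap_in_list() are hypothetical names, the capability list is modelled as a raw byte buffer of little-endian 64-bit words, and the bounds check on the word index is an addition that the quoted macro does not have.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 64-bit value from raw bytes on any host. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/*
 * caps points at nwords little-endian 64-bit words. Bit (id % 64) of
 * word (id / 64) is set when capability id is supported.
 * Returns 1 if supported, 0 otherwise (including out-of-range ids).
 */
static int cap_in_list(const uint8_t *caps, size_t nwords, unsigned int id)
{
	assert(caps);
	if (id / 64 >= nwords)
		return 0;
	return (int)((load_le64(caps + 8 * (id / 64)) >> (id % 64)) & 1);
}
```

Writing the check as a function also sidesteps a property of the macro as quoted: its arguments are not parenthesized, so passing an expression such as `a + b` as the capability ID would not expand as intended.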



* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-23 14:19 ` [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations Daniel Jurgens
  2025-09-24  1:16   ` Jason Wang
@ 2025-09-24  6:16   ` Michael S. Tsirkin
  1 sibling, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-24  6:16 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet, Yishai Hadas

On Tue, Sep 23, 2025 at 09:19:10AM -0500, Daniel Jurgens wrote:
> Currently querying and setting capabilities is restricted to a single
> capability and contained within the virtio PCI driver. However, each
> device type has generic and device specific capabilities, that may be
> queried and set. In subsequent patches virtio_net will query and set
> flow filter capabilities.
> 
> Move the admin related definitions to a new header file. It needs to be
> abstracted away from the PCI specifics to be used by upper layer
> drivers.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Yishai Hadas <yishaih@nvidia.com>


...


> +/**
> + * struct virtio_admin_ops - Operations for virtio admin functionality
> + *
> + * This structure contains function pointers for performing administrative
> + * operations on virtio devices. All data and caps pointers must be allocated
> + * on the heap by the caller.
> + */
> +struct virtio_admin_ops {
> +	/**
> +	 * @cap_id_list_query: Query the list of supported capability IDs
> +	 * @vdev: The virtio device to query
> +	 * @data: Pointer to result structure (must be heap allocated)
> +	 * Return: 0 on success, negative error code on failure
> +	 */
> +	int (*cap_id_list_query)(struct virtio_device *vdev,
> +				 struct virtio_admin_cmd_query_cap_id_result *data);
> +	/**
> +	 * @cap_get: Get capability data for a specific capability ID
> +	 * @vdev: The virtio device
> +	 * @id: Capability ID to retrieve
> +	 * @caps: Pointer to capability data structure (must be heap allocated)
> +	 * @cap_size: Size of the capability data structure
> +	 * Return: 0 on success, negative error code on failure
> +	 */
> +	int (*cap_get)(struct virtio_device *vdev,
> +		       u16 id,
> +		       void *caps,
> +		       size_t cap_size);
> +	/**
> +	 * @cap_set: Set capability data for a specific capability ID
> +	 * @vdev: The virtio device
> +	 * @id: Capability ID to set
> +	 * @caps: Pointer to capability data structure (must be heap allocated)
> +	 * @cap_size: Size of the capability data structure
> +	 * Return: 0 on success, negative error code on failure
> +	 */
> +	int (*cap_set)(struct virtio_device *vdev,
> +		       u16 id,
> +		       const void *caps,
> +		       size_t cap_size);
> +};
> +


I do not get why we need this indirection. There is a single
implementation in the spec for now, and your patchset does not introduce
a new one.


-- 
MST



* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-24  1:16   ` Jason Wang
@ 2025-09-24  6:22     ` Michael S. Tsirkin
  2025-09-24 19:02       ` Dan Jurgens
  2025-09-26  4:55       ` Jason Wang
  0 siblings, 2 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-24  6:22 UTC (permalink / raw)
  To: Jason Wang
  Cc: Daniel Jurgens, netdev, alex.williamson, pabeni, virtualization,
	parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas

On Wed, Sep 24, 2025 at 09:16:32AM +0800, Jason Wang wrote:
> On Tue, Sep 23, 2025 at 10:20 PM Daniel Jurgens <danielj@nvidia.com> wrote:
> >
> > Currently querying and setting capabilities is restricted to a single
> > capability and contained within the virtio PCI driver. However, each
> > device type has generic and device specific capabilities, that may be
> > queried and set. In subsequent patches virtio_net will query and set
> > flow filter capabilities.
> >
> > Move the admin related definitions to a new header file. It needs to be
> > abstracted away from the PCI specifics to be used by upper layer
> > drivers.
> >
> > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> > Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
> > ---
> 
> [...]
> 
> >
> >  size_t virtio_max_dma_size(const struct virtio_device *vdev);
> >
> > diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> > new file mode 100644
> > index 000000000000..bbf543d20be4
> > --- /dev/null
> > +++ b/include/linux/virtio_admin.h
> > @@ -0,0 +1,68 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only
> > + *
> > + * Header file for virtio admin operations
> > + */
> > +#include <uapi/linux/virtio_pci.h>
> > +
> > +#ifndef _LINUX_VIRTIO_ADMIN_H
> > +#define _LINUX_VIRTIO_ADMIN_H
> > +
> > +struct virtio_device;
> > +
> > +/**
> > + * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
> > + * @cap_list: Pointer to capability list structure containing supported_caps array
> > + * @cap: Capability ID to check
> > + *
> > + * The cap_list contains a supported_caps array of little-endian 64-bit integers
> > + * where each bit represents a capability. Bit 0 of the first element represents
> > + * capability ID 0, bit 1 represents capability ID 1, and so on.
> > + *
> > + * Return: 1 if capability is supported, 0 otherwise
> > + */
> > +#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
> > +       (!!(1 & (le64_to_cpu(cap_list->supported_caps[cap / 64]) >> cap % 64)))
> > +
> > +/**
> > + * struct virtio_admin_ops - Operations for virtio admin functionality
> > + *
> > + * This structure contains function pointers for performing administrative
> > + * operations on virtio devices. All data and caps pointers must be allocated
> > + * on the heap by the caller.
> > + */
> > +struct virtio_admin_ops {
> > +       /**
> > +        * @cap_id_list_query: Query the list of supported capability IDs
> > +        * @vdev: The virtio device to query
> > +        * @data: Pointer to result structure (must be heap allocated)
> > +        * Return: 0 on success, negative error code on failure
> > +        */
> > +       int (*cap_id_list_query)(struct virtio_device *vdev,
> > +                                struct virtio_admin_cmd_query_cap_id_result *data);
> > +       /**
> > +        * @cap_get: Get capability data for a specific capability ID
> > +        * @vdev: The virtio device
> > +        * @id: Capability ID to retrieve
> > +        * @caps: Pointer to capability data structure (must be heap allocated)
> > +        * @cap_size: Size of the capability data structure
> > +        * Return: 0 on success, negative error code on failure
> > +        */
> > +       int (*cap_get)(struct virtio_device *vdev,
> > +                      u16 id,
> > +                      void *caps,
> > +                      size_t cap_size);
> > +       /**
> > +        * @cap_set: Set capability data for a specific capability ID
> > +        * @vdev: The virtio device
> > +        * @id: Capability ID to set
> > +        * @caps: Pointer to capability data structure (must be heap allocated)
> > +        * @cap_size: Size of the capability data structure
> > +        * Return: 0 on success, negative error code on failure
> > +        */
> > +       int (*cap_set)(struct virtio_device *vdev,
> > +                      u16 id,
> > +                      const void *caps,
> > +                      size_t cap_size);
> > +};
> 
> Looking at this, it's nothing admin virtqueue specific, I wonder why
> it is not part of virtio_config_ops.
> 
> Thanks

cap things are admin commands. But what I do not get is why they
need to be callbacks.

The only thing about admin commands that is pci specific is finding
the admin vq.

I'd expect an API for that in config then, and the rest of code can
be completely transport independent.


-- 
MST



* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-24  6:22     ` Michael S. Tsirkin
@ 2025-09-24 19:02       ` Dan Jurgens
  2025-09-25  6:16         ` Michael S. Tsirkin
  2025-09-26  4:55       ` Jason Wang
  1 sibling, 1 reply; 58+ messages in thread
From: Dan Jurgens @ 2025-09-24 19:02 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: netdev, alex.williamson, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Yishai Hadas

On 9/24/25 1:22 AM, Michael S. Tsirkin wrote:
> On Wed, Sep 24, 2025 at 09:16:32AM +0800, Jason Wang wrote:
>> On Tue, Sep 23, 2025 at 10:20 PM Daniel Jurgens <danielj@nvidia.com> wrote:
>>>
>>> Currently querying and setting capabilities is restricted to a single
>>> capability and contained within the virtio PCI driver. However, each
>>> device type has generic and device specific capabilities, that may be
>>> queried and set. In subsequent patches virtio_net will query and set
>>> flow filter capabilities.
>>>
>>> Move the admin related definitions to a new header file. It needs to be
>>> abstracted away from the PCI specifics to be used by upper layer
>>> drivers.
>>>
>>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
>>> Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
>>> ---
>>
>> [...]
>>
>>>
>>>  size_t virtio_max_dma_size(const struct virtio_device *vdev);
>>>
>>> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
>>> new file mode 100644
>>> index 000000000000..bbf543d20be4
>>> --- /dev/null
>>> +++ b/include/linux/virtio_admin.h
>>> @@ -0,0 +1,68 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only
>>> + *
>>> + * Header file for virtio admin operations
>>> + */
>>> +#include <uapi/linux/virtio_pci.h>
>>> +
>>> +#ifndef _LINUX_VIRTIO_ADMIN_H
>>> +#define _LINUX_VIRTIO_ADMIN_H
>>> +
>>> +struct virtio_device;
>>> +
>>> +/**
>>> + * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
>>> + * @cap_list: Pointer to capability list structure containing supported_caps array
>>> + * @cap: Capability ID to check
>>> + *
>>> + * The cap_list contains a supported_caps array of little-endian 64-bit integers
>>> + * where each bit represents a capability. Bit 0 of the first element represents
>>> + * capability ID 0, bit 1 represents capability ID 1, and so on.
>>> + *
>>> + * Return: 1 if capability is supported, 0 otherwise
>>> + */
>>> +#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
>>> +       (!!(1 & (le64_to_cpu(cap_list->supported_caps[cap / 64]) >> cap % 64)))
>>> +
>>> +/**
>>> + * struct virtio_admin_ops - Operations for virtio admin functionality
>>> + *
>>> + * This structure contains function pointers for performing administrative
>>> + * operations on virtio devices. All data and caps pointers must be allocated
>>> + * on the heap by the caller.
>>> + */
>>> +struct virtio_admin_ops {
>>> +       /**
>>> +        * @cap_id_list_query: Query the list of supported capability IDs
>>> +        * @vdev: The virtio device to query
>>> +        * @data: Pointer to result structure (must be heap allocated)
>>> +        * Return: 0 on success, negative error code on failure
>>> +        */
>>> +       int (*cap_id_list_query)(struct virtio_device *vdev,
>>> +                                struct virtio_admin_cmd_query_cap_id_result *data);
>>> +       /**
>>> +        * @cap_get: Get capability data for a specific capability ID
>>> +        * @vdev: The virtio device
>>> +        * @id: Capability ID to retrieve
>>> +        * @caps: Pointer to capability data structure (must be heap allocated)
>>> +        * @cap_size: Size of the capability data structure
>>> +        * Return: 0 on success, negative error code on failure
>>> +        */
>>> +       int (*cap_get)(struct virtio_device *vdev,
>>> +                      u16 id,
>>> +                      void *caps,
>>> +                      size_t cap_size);
>>> +       /**
>>> +        * @cap_set: Set capability data for a specific capability ID
>>> +        * @vdev: The virtio device
>>> +        * @id: Capability ID to set
>>> +        * @caps: Pointer to capability data structure (must be heap allocated)
>>> +        * @cap_size: Size of the capability data structure
>>> +        * Return: 0 on success, negative error code on failure
>>> +        */
>>> +       int (*cap_set)(struct virtio_device *vdev,
>>> +                      u16 id,
>>> +                      const void *caps,
>>> +                      size_t cap_size);
>>> +};
>>
>> Looking at this, it's nothing admin virtqueue specific, I wonder why
>> it is not part of virtio_config_ops.
>>
>> Thanks
> 
> cap things are admin commands. But what I do not get is why they
> need to be callbacks.
> 
> The only thing about admin commands that is pci specific is finding
> the admin vq.
> 
> I'd expect an API for that in config then, and the rest of code can
> be completely transport independent.
> 
> 

The idea was that each transport would implement the callbacks, and we
have indirection at the virtio_device level. Similar to the config_ops.
So the drivers stay transport agnostic. I know these are PCI specific
now, but thought it should be implemented generically.

These could go in config ops. But I thought it was better to isolate
them in a new _ops structure.

An earlier implementation had the net driver accessing the admin_ops
directly. But Parav thought this was better.


* Re: [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory
  2025-09-23 14:19 ` [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory Daniel Jurgens
@ 2025-09-25  3:56   ` Xuan Zhuo
  2025-09-25  6:13     ` Michael S. Tsirkin
  2025-09-25 21:17   ` Michael S. Tsirkin
  1 sibling, 1 reply; 58+ messages in thread
From: Xuan Zhuo @ 2025-09-25  3:56 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Daniel Jurgens, netdev, mst, jasowang, alex.williamson,
	pabeni

On Tue, 23 Sep 2025 09:19:12 -0500, Daniel Jurgens <danielj@nvidia.com> wrote:
> The flow filter implementation requires minimal changes to the
> existing virtio_net implementation. It's cleaner to separate it into
> another file. In order to do so, move virtio_net.c into the new
> virtio_net directory, and create a makefile for it. Note the name is
> changed to virtio_net_main.c, so the module can retain the name
> virtio_net.
>
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>

To help this work move forward smoothly, I don't recommend splitting the
directory structure within this patchset. Directory reorganization can be a
separate effort; I've previously experimented with this myself. I'd really
like to see this work progress smoothly.

Thanks.


> ---
>  MAINTAINERS                                               | 2 +-
>  drivers/net/Makefile                                      | 2 +-
>  drivers/net/virtio_net/Makefile                           | 8 ++++++++
>  .../net/{virtio_net.c => virtio_net/virtio_net_main.c}    | 0
>  4 files changed, 10 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/net/virtio_net/Makefile
>  rename drivers/net/{virtio_net.c => virtio_net/virtio_net_main.c} (100%)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a8a770714101..09d26c4225a9 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -26685,7 +26685,7 @@ F:	Documentation/devicetree/bindings/virtio/
>  F:	Documentation/driver-api/virtio/
>  F:	drivers/block/virtio_blk.c
>  F:	drivers/crypto/virtio/
> -F:	drivers/net/virtio_net.c
> +F:	drivers/net/virtio_net/
>  F:	drivers/vdpa/
>  F:	drivers/virtio/
>  F:	include/linux/vdpa.h
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index 73bc63ecd65f..cf28992658a6 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -33,7 +33,7 @@ obj-$(CONFIG_NET_TEAM) += team/
>  obj-$(CONFIG_TUN) += tun.o
>  obj-$(CONFIG_TAP) += tap.o
>  obj-$(CONFIG_VETH) += veth.o
> -obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> +obj-$(CONFIG_VIRTIO_NET) += virtio_net/
>  obj-$(CONFIG_VXLAN) += vxlan/
>  obj-$(CONFIG_GENEVE) += geneve.o
>  obj-$(CONFIG_BAREUDP) += bareudp.o
> diff --git a/drivers/net/virtio_net/Makefile b/drivers/net/virtio_net/Makefile
> new file mode 100644
> index 000000000000..c0a4725ddd69
> --- /dev/null
> +++ b/drivers/net/virtio_net/Makefile
> @@ -0,0 +1,8 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Makefile for the VirtIO Net driver
> +#
> +
> +obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> +
> +virtio_net-objs := virtio_net_main.o
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net/virtio_net_main.c
> similarity index 100%
> rename from drivers/net/virtio_net.c
> rename to drivers/net/virtio_net/virtio_net_main.c
> --
> 2.45.0
>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory
  2025-09-25  3:56   ` Xuan Zhuo
@ 2025-09-25  6:13     ` Michael S. Tsirkin
  2025-09-25 15:48       ` Dan Jurgens
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25  6:13 UTC (permalink / raw)
  To: Xuan Zhuo
  Cc: Daniel Jurgens, virtualization, parav, shshitrit, yohadt,
	eperezma, shameerali.kolothum.thodi, jgg, kevin.tian, kuba,
	andrew+netdev, edumazet, netdev, jasowang, alex.williamson,
	pabeni

On Thu, Sep 25, 2025 at 11:56:09AM +0800, Xuan Zhuo wrote:
> On Tue, 23 Sep 2025 09:19:12 -0500, Daniel Jurgens <danielj@nvidia.com> wrote:
> > The flow filter implementation requires minimal changes to the
> > existing virtio_net implementation. It's cleaner to separate it into
> > another file. In order to do so, move virtio_net.c into the new
> > virtio_net directory, and create a makefile for it. Note the name is
> > changed to virtio_net_main.c, so the module can retain the name
> > virtio_net.
> >
> > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> 
> To help this work move forward smoothly, I don't recommend splitting the
> directory structure within this patchset. Directory reorganization can be a
> separate effort; I've previously experimented with this myself. I'd really
> like to see this work progress smoothly.
> 
> Thanks.

Indeed.

> 
> > ---
> >  MAINTAINERS                                               | 2 +-
> >  drivers/net/Makefile                                      | 2 +-
> >  drivers/net/virtio_net/Makefile                           | 8 ++++++++
> >  .../net/{virtio_net.c => virtio_net/virtio_net_main.c}    | 0
> >  4 files changed, 10 insertions(+), 2 deletions(-)
> >  create mode 100644 drivers/net/virtio_net/Makefile
> >  rename drivers/net/{virtio_net.c => virtio_net/virtio_net_main.c} (100%)
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index a8a770714101..09d26c4225a9 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -26685,7 +26685,7 @@ F:	Documentation/devicetree/bindings/virtio/
> >  F:	Documentation/driver-api/virtio/
> >  F:	drivers/block/virtio_blk.c
> >  F:	drivers/crypto/virtio/
> > -F:	drivers/net/virtio_net.c
> > +F:	drivers/net/virtio_net/
> >  F:	drivers/vdpa/
> >  F:	drivers/virtio/
> >  F:	include/linux/vdpa.h
> > diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> > index 73bc63ecd65f..cf28992658a6 100644
> > --- a/drivers/net/Makefile
> > +++ b/drivers/net/Makefile
> > @@ -33,7 +33,7 @@ obj-$(CONFIG_NET_TEAM) += team/
> >  obj-$(CONFIG_TUN) += tun.o
> >  obj-$(CONFIG_TAP) += tap.o
> >  obj-$(CONFIG_VETH) += veth.o
> > -obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> > +obj-$(CONFIG_VIRTIO_NET) += virtio_net/
> >  obj-$(CONFIG_VXLAN) += vxlan/
> >  obj-$(CONFIG_GENEVE) += geneve.o
> >  obj-$(CONFIG_BAREUDP) += bareudp.o
> > diff --git a/drivers/net/virtio_net/Makefile b/drivers/net/virtio_net/Makefile
> > new file mode 100644
> > index 000000000000..c0a4725ddd69
> > --- /dev/null
> > +++ b/drivers/net/virtio_net/Makefile
> > @@ -0,0 +1,8 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +#
> > +# Makefile for the VirtIO Net driver
> > +#
> > +
> > +obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> > +
> > +virtio_net-objs := virtio_net_main.o
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net/virtio_net_main.c
> > similarity index 100%
> > rename from drivers/net/virtio_net.c
> > rename to drivers/net/virtio_net/virtio_net_main.c
> > --
> > 2.45.0
> >


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-24 19:02       ` Dan Jurgens
@ 2025-09-25  6:16         ` Michael S. Tsirkin
  2025-09-25  9:51           ` Parav Pandit
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25  6:16 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: Jason Wang, netdev, alex.williamson, pabeni, virtualization,
	parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas

On Wed, Sep 24, 2025 at 02:02:34PM -0500, Dan Jurgens wrote:
> On 9/24/25 1:22 AM, Michael S. Tsirkin wrote:
> > On Wed, Sep 24, 2025 at 09:16:32AM +0800, Jason Wang wrote:
> >> On Tue, Sep 23, 2025 at 10:20 PM Daniel Jurgens <danielj@nvidia.com> wrote:
> >>>
> >>> Currently querying and setting capabilities is restricted to a single
> >>> capability and contained within the virtio PCI driver. However, each
> >>> device type has generic and device specific capabilities, that may be
> >>> queried and set. In subsequent patches virtio_net will query and set
> >>> flow filter capabilities.
> >>>
> >>> Move the admin related definitions to a new header file. It needs to be
> >>> abstracted away from the PCI specifics to be used by upper layer
> >>> drivers.
> >>>
> >>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> >>> Reviewed-by: Parav Pandit <parav@nvidia.com>
> >>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> >>> Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
> >>> ---
> >>
> >> [...]
> >>
> >>>
> >>>  size_t virtio_max_dma_size(const struct virtio_device *vdev);
> >>>
> >>> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> >>> new file mode 100644
> >>> index 000000000000..bbf543d20be4
> >>> --- /dev/null
> >>> +++ b/include/linux/virtio_admin.h
> >>> @@ -0,0 +1,68 @@
> >>> +/* SPDX-License-Identifier: GPL-2.0-only
> >>> + *
> >>> + * Header file for virtio admin operations
> >>> + */
> >>> +#include <uapi/linux/virtio_pci.h>
> >>> +
> >>> +#ifndef _LINUX_VIRTIO_ADMIN_H
> >>> +#define _LINUX_VIRTIO_ADMIN_H
> >>> +
> >>> +struct virtio_device;
> >>> +
> >>> +/**
> >>> + * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
> >>> + * @cap_list: Pointer to capability list structure containing supported_caps array
> >>> + * @cap: Capability ID to check
> >>> + *
> >>> + * The cap_list contains a supported_caps array of little-endian 64-bit integers
> >>> + * where each bit represents a capability. Bit 0 of the first element represents
> >>> + * capability ID 0, bit 1 represents capability ID 1, and so on.
> >>> + *
> >>> + * Return: 1 if capability is supported, 0 otherwise
> >>> + */
> >>> +#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
> >>> +       (!!(1 & (le64_to_cpu(cap_list->supported_caps[cap / 64]) >> cap % 64)))
> >>> +
> >>> +/**
> >>> + * struct virtio_admin_ops - Operations for virtio admin functionality
> >>> + *
> >>> + * This structure contains function pointers for performing administrative
> >>> + * operations on virtio devices. All data and caps pointers must be allocated
> >>> + * on the heap by the caller.
> >>> + */
> >>> +struct virtio_admin_ops {
> >>> +       /**
> >>> +        * @cap_id_list_query: Query the list of supported capability IDs
> >>> +        * @vdev: The virtio device to query
> >>> +        * @data: Pointer to result structure (must be heap allocated)
> >>> +        * Return: 0 on success, negative error code on failure
> >>> +        */
> >>> +       int (*cap_id_list_query)(struct virtio_device *vdev,
> >>> +                                struct virtio_admin_cmd_query_cap_id_result *data);
> >>> +       /**
> >>> +        * @cap_get: Get capability data for a specific capability ID
> >>> +        * @vdev: The virtio device
> >>> +        * @id: Capability ID to retrieve
> >>> +        * @caps: Pointer to capability data structure (must be heap allocated)
> >>> +        * @cap_size: Size of the capability data structure
> >>> +        * Return: 0 on success, negative error code on failure
> >>> +        */
> >>> +       int (*cap_get)(struct virtio_device *vdev,
> >>> +                      u16 id,
> >>> +                      void *caps,
> >>> +                      size_t cap_size);
> >>> +       /**
> >>> +        * @cap_set: Set capability data for a specific capability ID
> >>> +        * @vdev: The virtio device
> >>> +        * @id: Capability ID to set
> >>> +        * @caps: Pointer to capability data structure (must be heap allocated)
> >>> +        * @cap_size: Size of the capability data structure
> >>> +        * Return: 0 on success, negative error code on failure
> >>> +        */
> >>> +       int (*cap_set)(struct virtio_device *vdev,
> >>> +                      u16 id,
> >>> +                      const void *caps,
> >>> +                      size_t cap_size);
> >>> +};
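
For readers following the bitmap layout described in the kernel-doc above,
here is a minimal userspace sketch of the same check written as a function
rather than a macro. It is not the code from the patch: the struct
cap_list_sketch type, the two-word array size, and the helper names are
assumptions made for illustration. Each 64-bit word is kept as raw
little-endian bytes so the example does not depend on host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the capability list: two 64-bit words,
 * each stored as eight little-endian bytes. */
struct cap_list_sketch {
	uint8_t supported_caps[2][8];
};

/* Decode one little-endian 64-bit word regardless of host byte order. */
static uint64_t load_le64(const uint8_t b[8])
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | b[i];
	return v;
}

/* Return 1 if capability ID @cap is set, 0 otherwise.
 * IDs beyond the end of the array are reported as unsupported. */
static int cap_in_list(const struct cap_list_sketch *list, unsigned int cap)
{
	size_t words = sizeof(list->supported_caps) /
		       sizeof(list->supported_caps[0]);

	if (cap / 64 >= words)
		return 0;
	return (int)((load_le64(list->supported_caps[cap / 64]) >> (cap % 64)) & 1);
}
```

A function form also sidesteps operator-precedence surprises when the
arguments are expressions, since nothing is textually substituted.
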
> >>
> >> Looking at this, it's nothing admin virtqueue specific, I wonder why
> >> it is not part of virtio_config_ops.
> >>
> >> Thanks
> > 
> > cap things are admin commands. But what I do not get is why they
> > need to be callbacks.
> > 
> > The only thing about admin commands that is pci specific is finding
> > the admin vq.
> > 
> > I'd expect an API for that in config then, and the rest of code can
> > be completely transport independent.
> > 
> > 
> 
> The idea was that each transport would implement the callbacks, and we
> have indirection at the virtio_device level. Similar to the config_ops.
> So the drivers stay transport agnostic. I know these are PCI specific
> now, but thought it should be implemented generically.
> 
> These could go in config ops. But I thought it was better to isolate
> them in a new _ops structure.
> 
> An earlier implementation had the net driver accessing the admin_ops
> directly. But Parav thought this was better.

Right, but most stuff is not transport specific. If you are going to
put in the work, what is transport specific is admin VQ access.
Commands themselves are transport agnostic, we just did not need
them in non-pci previously.


-- 
MST


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-25  6:16         ` Michael S. Tsirkin
@ 2025-09-25  9:51           ` Parav Pandit
  2025-09-25 10:35             ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Parav Pandit @ 2025-09-25  9:51 UTC (permalink / raw)
  To: Michael S. Tsirkin, Dan Jurgens
  Cc: Jason Wang, netdev, alex.williamson, pabeni, virtualization,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet, Yishai Hadas


On 25-09-2025 11:46 am, Michael S. Tsirkin wrote:
> On Wed, Sep 24, 2025 at 02:02:34PM -0500, Dan Jurgens wrote:
>> On 9/24/25 1:22 AM, Michael S. Tsirkin wrote:
>>> On Wed, Sep 24, 2025 at 09:16:32AM +0800, Jason Wang wrote:
>>>> On Tue, Sep 23, 2025 at 10:20 PM Daniel Jurgens <danielj@nvidia.com> wrote:
>>>>> Currently querying and setting capabilities is restricted to a single
>>>>> capability and contained within the virtio PCI driver. However, each
>>>>> device type has generic and device specific capabilities, that may be
>>>>> queried and set. In subsequent patches virtio_net will query and set
>>>>> flow filter capabilities.
>>>>>
>>>>> Move the admin related definitions to a new header file. It needs to be
>>>>> abstracted away from the PCI specifics to be used by upper layer
>>>>> drivers.
>>>>>
>>>>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>>>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>>>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
>>>>> Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
>>>>> ---
>>>> [...]
>>>>
>>>>>   size_t virtio_max_dma_size(const struct virtio_device *vdev);
>>>>>
>>>>> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
>>>>> new file mode 100644
>>>>> index 000000000000..bbf543d20be4
>>>>> --- /dev/null
>>>>> +++ b/include/linux/virtio_admin.h
>>>>> @@ -0,0 +1,68 @@
>>>>> +/* SPDX-License-Identifier: GPL-2.0-only
>>>>> + *
>>>>> + * Header file for virtio admin operations
>>>>> + */
>>>>> +#include <uapi/linux/virtio_pci.h>
>>>>> +
>>>>> +#ifndef _LINUX_VIRTIO_ADMIN_H
>>>>> +#define _LINUX_VIRTIO_ADMIN_H
>>>>> +
>>>>> +struct virtio_device;
>>>>> +
>>>>> +/**
>>>>> + * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
>>>>> + * @cap_list: Pointer to capability list structure containing supported_caps array
>>>>> + * @cap: Capability ID to check
>>>>> + *
>>>>> + * The cap_list contains a supported_caps array of little-endian 64-bit integers
>>>>> + * where each bit represents a capability. Bit 0 of the first element represents
>>>>> + * capability ID 0, bit 1 represents capability ID 1, and so on.
>>>>> + *
>>>>> + * Return: 1 if capability is supported, 0 otherwise
>>>>> + */
>>>>> +#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
>>>>> +       (!!(1 & (le64_to_cpu(cap_list->supported_caps[cap / 64]) >> cap % 64)))
>>>>> +
>>>>> +/**
>>>>> + * struct virtio_admin_ops - Operations for virtio admin functionality
>>>>> + *
>>>>> + * This structure contains function pointers for performing administrative
>>>>> + * operations on virtio devices. All data and caps pointers must be allocated
>>>>> + * on the heap by the caller.
>>>>> + */
>>>>> +struct virtio_admin_ops {
>>>>> +       /**
>>>>> +        * @cap_id_list_query: Query the list of supported capability IDs
>>>>> +        * @vdev: The virtio device to query
>>>>> +        * @data: Pointer to result structure (must be heap allocated)
>>>>> +        * Return: 0 on success, negative error code on failure
>>>>> +        */
>>>>> +       int (*cap_id_list_query)(struct virtio_device *vdev,
>>>>> +                                struct virtio_admin_cmd_query_cap_id_result *data);
>>>>> +       /**
>>>>> +        * @cap_get: Get capability data for a specific capability ID
>>>>> +        * @vdev: The virtio device
>>>>> +        * @id: Capability ID to retrieve
>>>>> +        * @caps: Pointer to capability data structure (must be heap allocated)
>>>>> +        * @cap_size: Size of the capability data structure
>>>>> +        * Return: 0 on success, negative error code on failure
>>>>> +        */
>>>>> +       int (*cap_get)(struct virtio_device *vdev,
>>>>> +                      u16 id,
>>>>> +                      void *caps,
>>>>> +                      size_t cap_size);
>>>>> +       /**
>>>>> +        * @cap_set: Set capability data for a specific capability ID
>>>>> +        * @vdev: The virtio device
>>>>> +        * @id: Capability ID to set
>>>>> +        * @caps: Pointer to capability data structure (must be heap allocated)
>>>>> +        * @cap_size: Size of the capability data structure
>>>>> +        * Return: 0 on success, negative error code on failure
>>>>> +        */
>>>>> +       int (*cap_set)(struct virtio_device *vdev,
>>>>> +                      u16 id,
>>>>> +                      const void *caps,
>>>>> +                      size_t cap_size);
>>>>> +};
>>>> Looking at this, it's nothing admin virtqueue specific, I wonder why
>>>> it is not part of virtio_config_ops.

It is very clear from the virtio_admin_ops definition that it is not 
specific to admin vq. It is an admin command interface.


>>>> Thanks
>>> cap things are admin commands. But what I do not get is why they
>>> need to be callbacks.
>>>
>>> The only thing about admin commands that is pci specific is finding
>>> the admin vq.
>>>
>>> I'd expect an API for that in config then, and the rest of code can
>>> be completely transport independent.
>>>
>>>
>> The idea was that each transport would implement the callbacks, and we
>> have indirection at the virtio_device level. Similar to the config_ops.
>> So the drivers stay transport agnostic. I know these are PCI specific
>> now, but thought it should be implemented generically.
>>
>> These could go in config ops. But I thought it was better to isolate
>> them in a new _ops structure.
>>
>> An earlier implementation had the net driver accessing the admin_ops
>> directly. But Parav thought this was better.
> Right, but most stuff is not transport specific. If you are going to
> put in the work, what is transport specific is admin VQ access.
> Commands themselves are transport agnostic, we just did not need
> them in non-pci previously.
>
Should config_ops be extended to have admin cmd interface there?

No strong preference for putting function pointers in new admin_ops or 
config_ops.

I just find it better to have admin_ops clearly defined, as it makes the
code crystal clear about what the ops are for.

Today, one needs to be cautious of the note below when reading config_ops.

"Note: Do not assume that a transport implements all of the operations".

Having a well-defined admin_ops appeared clearer. But extending (or
overloading) config_ops seems simpler; I have no strong preference.

Regarding Dan's comment about the net driver accessing admin_ops
directly: that seems a bad idea to me.

Function pointers are there so that multiple transports can each provide
their own implementation.

They are not meant to be open-coded in drivers. In fact, some day the
config_ops struct definition could be restricted to transport drivers only.

Most of the kernel does not open-code function pointer calls.
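
To illustrate the point about wrappers versus open-coded function pointer
calls, here is a minimal userspace sketch. It is not taken from the series:
struct wdev_sketch, struct wops_sketch, admin_cap_get_sketch() and the mock
transport are hypothetical names. The wrapper is the only place that
dereferences the ops table, so a missing operation is handled once rather
than in every driver.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct wdev_sketch;

/* Hypothetical ops table; only cap_get is modeled here. */
struct wops_sketch {
	int (*cap_get)(struct wdev_sketch *dev, uint16_t id,
		       void *caps, size_t cap_size);
};

struct wdev_sketch {
	const struct wops_sketch *admin_ops; /* may be NULL for some transports */
};

/* Core-level wrapper: drivers call this instead of touching admin_ops. */
static int admin_cap_get_sketch(struct wdev_sketch *dev, uint16_t id,
				void *caps, size_t cap_size)
{
	if (!dev->admin_ops || !dev->admin_ops->cap_get)
		return -EOPNOTSUPP;
	return dev->admin_ops->cap_get(dev, id, caps, cap_size);
}

/* Mock transport implementation: fills the buffer with the ID's low byte. */
static int mock_cap_get(struct wdev_sketch *dev, uint16_t id,
			void *caps, size_t cap_size)
{
	(void)dev;
	memset(caps, id & 0xff, cap_size);
	return 0;
}
```

A driver written against the wrapper keeps working on a transport that
leaves the operation unset; it simply sees -EOPNOTSUPP.
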


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-25  9:51           ` Parav Pandit
@ 2025-09-25 10:35             ` Michael S. Tsirkin
  2025-09-25 10:45               ` Parav Pandit
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 10:35 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Dan Jurgens, Jason Wang, netdev, alex.williamson, pabeni,
	virtualization, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas

On Thu, Sep 25, 2025 at 03:21:38PM +0530, Parav Pandit wrote:
> Function pointers are there for multiple transports to implement their own
> implementation.

My understanding is that you want to use flow control admin commands 
in virtio net, without making it depend on virtio pci.
This is why the callbacks are here. Is that right?

That is fair enough, but it looks like every new command then
needs a lot of boilerplate code: a callback, a wrapper, and
a transport implementation.

Why not just put all this code in virtio core? It looks like the
transport just needs to expose an API to find the admin vq.

-- 
MST

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-25 10:35             ` Michael S. Tsirkin
@ 2025-09-25 10:45               ` Parav Pandit
  2025-09-25 11:49                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Parav Pandit @ 2025-09-25 10:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dan Jurgens, Jason Wang, netdev, alex.williamson, pabeni,
	virtualization, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas

On 25-09-2025 04:05 pm, Michael S. Tsirkin wrote:
> On Thu, Sep 25, 2025 at 03:21:38PM +0530, Parav Pandit wrote:
>> Function pointers are there for multiple transports to implement their own
>> implementation.
> My understanding is that you want to use flow control admin commands
> in virtio net, without making it depend on virtio pci.
No flow control in vnet.
> This why the callbacks are here. Is that right?

No. callbacks are there so that transport agnostic layer can invoke it, 
which is drivers/virtio/virtio.c.

And transport specific code stays in transport layer, which is presently 
following config_ops design.

>
> That is fair enough, but it looks like every new command then
> needs a lot of boilerplate code with a callback a wrapper and
> a transport implementation.

Not really. I don't see any callbacks or wrappers in the currently proposed patches.

All it has is transport specific implementation of admin commands.

>
>
> Why not just put all this code in virtio core? It looks like the
> transport just needs to expose an API to find the admin vq.

Can you please be specific about which lines in the current code can be
moved to virtio core?

When the spec was drafted, _one_ was thinking of admin command transport 
over non admin vq also.

So current implementation of letting transport decide on how to 
transport a command seems right to me.

But sure, if you can pinpoint the lines of code that can be shifted to
the generic layer, that would be good.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-25 10:45               ` Parav Pandit
@ 2025-09-25 11:49                 ` Michael S. Tsirkin
  2025-09-25 12:09                   ` Parav Pandit
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 11:49 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Dan Jurgens, Jason Wang, netdev, alex.williamson, pabeni,
	virtualization, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas

On Thu, Sep 25, 2025 at 04:15:19PM +0530, Parav Pandit wrote:
> 
> On 25-09-2025 04:05 pm, Michael S. Tsirkin wrote:
> > On Thu, Sep 25, 2025 at 03:21:38PM +0530, Parav Pandit wrote:
> > > Function pointers are there for multiple transports to implement their own
> > > implementation.
> > My understanding is that you want to use flow control admin commands
> > in virtio net, without making it depend on virtio pci.
> No flow control in vnet.
> > This why the callbacks are here. Is that right?
> 
> No. callbacks are there so that transport agnostic layer can invoke it,
> which is drivers/virtio/virtio.c.
> 
> And transport specific code stays in transport layer, which is presently
> following config_ops design.
> 
> > 
> > That is fair enough, but it looks like every new command then
> > needs a lot of boilerplate code with a callback a wrapper and
> > a transport implementation.
> 
> Not really. I dont see any callbacks or wrapper in current proposed patches.
> 
> All it has is transport specific implementation of admin commands.
> 
> > 
> > 
> > Why not just put all this code in virtio core? It looks like the
> > transport just needs to expose an API to find the admin vq.
> 
> Can you please be specific of which line in the current code can be moved to
> virtio core?
> 
> When the spec was drafted, _one_ was thinking of admin command transport
> over non admin vq also.
> 
> So current implementation of letting transport decide on how to transport a
> command seems right to me.
> 
> But sure, if you can pin point the lines of code that can be shifted to
> generic layer, that would be good.

I imagine a get_admin_vq operation in config_ops. The rest of the
code seems to be transport independent and could be part of
the core. WDYT?
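
A rough userspace sketch of that split, for illustration only. The
get_admin_vq hook, struct admin_vq_sketch, the submit callback and the
opcode value are all hypothetical; none of them is an existing kernel
interface. The only transport-specific piece is locating the queue, and the
command itself is built in shared code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct qdev_sketch;

/* Hypothetical admin queue handle: anything able to carry one command. */
struct admin_vq_sketch {
	int (*submit)(struct admin_vq_sketch *vq, uint16_t opcode,
		      const void *in, size_t in_len,
		      void *out, size_t out_len);
};

/* Hypothetical config ops: the single transport-specific hook. */
struct qcfg_ops_sketch {
	struct admin_vq_sketch *(*get_admin_vq)(struct qdev_sketch *dev);
};

struct qdev_sketch {
	const struct qcfg_ops_sketch *config;
};

#define OPCODE_CAP_GET_SKETCH 0x8 /* placeholder, not the spec value */

/* Core code: transport independent once the queue has been found. */
static int core_cap_get_sketch(struct qdev_sketch *dev, uint16_t id,
			       void *caps, size_t cap_size)
{
	struct admin_vq_sketch *vq;

	if (!dev->config || !dev->config->get_admin_vq)
		return -EOPNOTSUPP;
	vq = dev->config->get_admin_vq(dev);
	if (!vq)
		return -ENODEV;
	return vq->submit(vq, OPCODE_CAP_GET_SKETCH, &id, sizeof(id),
			  caps, cap_size);
}

/* Mock transport: echoes the requested capability ID back as the result. */
static int mock_submit(struct admin_vq_sketch *vq, uint16_t opcode,
		       const void *in, size_t in_len,
		       void *out, size_t out_len)
{
	(void)vq;
	(void)opcode;
	if (in_len != sizeof(uint16_t) || out_len < sizeof(uint16_t))
		return -EINVAL;
	memcpy(out, in, sizeof(uint16_t));
	return 0;
}

static struct admin_vq_sketch mock_vq = { mock_submit };

static struct admin_vq_sketch *mock_get_admin_vq(struct qdev_sketch *dev)
{
	(void)dev;
	return &mock_vq;
}
```
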

-- 
MST


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-25 11:49                 ` Michael S. Tsirkin
@ 2025-09-25 12:09                   ` Parav Pandit
  2025-09-25 13:08                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Parav Pandit @ 2025-09-25 12:09 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Dan Jurgens, Jason Wang, netdev, alex.williamson, pabeni,
	virtualization, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas


On 25-09-2025 05:19 pm, Michael S. Tsirkin wrote:
> On Thu, Sep 25, 2025 at 04:15:19PM +0530, Parav Pandit wrote:
>> On 25-09-2025 04:05 pm, Michael S. Tsirkin wrote:
>>> On Thu, Sep 25, 2025 at 03:21:38PM +0530, Parav Pandit wrote:
>>>> Function pointers are there for multiple transports to implement their own
>>>> implementation.
>>> My understanding is that you want to use flow control admin commands
>>> in virtio net, without making it depend on virtio pci.
>> No flow control in vnet.
>>> This why the callbacks are here. Is that right?
>> No. callbacks are there so that transport agnostic layer can invoke it,
>> which is drivers/virtio/virtio.c.
>>
>> And transport specific code stays in transport layer, which is presently
>> following config_ops design.
>>
>>> That is fair enough, but it looks like every new command then
>>> needs a lot of boilerplate code with a callback a wrapper and
>>> a transport implementation.
>> Not really. I dont see any callbacks or wrapper in current proposed patches.
>>
>> All it has is transport specific implementation of admin commands.
>>
>>>
>>> Why not just put all this code in virtio core? It looks like the
>>> transport just needs to expose an API to find the admin vq.
>> Can you please be specific of which line in the current code can be moved to
>> virtio core?
>>
>> When the spec was drafted, _one_ was thinking of admin command transport
>> over non admin vq also.
>>
>> So current implementation of letting transport decide on how to transport a
>> command seems right to me.
>>
>> But sure, if you can pin point the lines of code that can be shifted to
>> generic layer, that would be good.
> I imagine a get_admin_vq operation in config_ops. The rest of the
> code seems to be transport independent and could be part of
> the core. WDYT?
>
IMHV, the code before vp_modern_admin_cmd_exec() can be part of 
drivers/virtio/virtio_admin_cmds.c and admin_cmd_exec() can be part of 
the config ops.
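
As a rough illustration of this alternative, the userspace sketch below
keeps command construction in shared code and leaves only the "how does the
command travel" step to the transport. All names here (struct
admin_cmd_sketch, struct exec_ops_sketch, the opcode value and both mock
transports) are hypothetical and are not taken from the series or from the
kernel.

```c
#include <stddef.h>
#include <stdint.h>

struct tdev_sketch;

/* Hypothetical transport-neutral command descriptor built by core code. */
struct admin_cmd_sketch {
	uint16_t opcode;
	const void *data;
	size_t data_len;
};

/* Hypothetical config op: each transport decides how the command travels. */
struct exec_ops_sketch {
	int (*admin_cmd_exec)(struct tdev_sketch *dev,
			      const struct admin_cmd_sketch *cmd);
};

struct tdev_sketch {
	const struct exec_ops_sketch *ops;
	int sent_via_vq;    /* counters exist only to make the mocks observable */
	int sent_via_other;
};

#define OPCODE_CAP_SET_SKETCH 0x9 /* placeholder, not the spec value */

/* Core code shared by every transport: build the command, hand it off. */
static int core_cap_set_sketch(struct tdev_sketch *dev,
			       const void *caps, size_t cap_size)
{
	struct admin_cmd_sketch cmd = { OPCODE_CAP_SET_SKETCH, caps, cap_size };

	return dev->ops->admin_cmd_exec(dev, &cmd);
}

/* Mock transport that would place the command on an admin virtqueue. */
static int exec_over_vq(struct tdev_sketch *dev,
			const struct admin_cmd_sketch *cmd)
{
	(void)cmd;
	dev->sent_via_vq++;
	return 0;
}

/* Mock transport that would use some mechanism other than a virtqueue. */
static int exec_over_other(struct tdev_sketch *dev,
			   const struct admin_cmd_sketch *cmd)
{
	(void)cmd;
	dev->sent_via_other++;
	return 0;
}
```
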

However, such a refactor can be deferred until it actually becomes
boilerplate code, that is, when there is more than one transport and/or
more than one way to send admin cmds.

Even if it's done, it will probably require vfio-virtio-pci to interact
with the generic virtio layer. I'm not sure that complication adds enough
value to be part of this series.


Dan,

WDYT?


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-25 12:09                   ` Parav Pandit
@ 2025-09-25 13:08                     ` Michael S. Tsirkin
  2025-09-25 16:53                       ` Dan Jurgens
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 13:08 UTC (permalink / raw)
  To: Parav Pandit
  Cc: Dan Jurgens, Jason Wang, netdev, alex.williamson, pabeni,
	virtualization, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas

On Thu, Sep 25, 2025 at 05:39:54PM +0530, Parav Pandit wrote:
> 
> On 25-09-2025 05:19 pm, Michael S. Tsirkin wrote:
> > On Thu, Sep 25, 2025 at 04:15:19PM +0530, Parav Pandit wrote:
> > > On 25-09-2025 04:05 pm, Michael S. Tsirkin wrote:
> > > > On Thu, Sep 25, 2025 at 03:21:38PM +0530, Parav Pandit wrote:
> > > > > Function pointers are there for multiple transports to implement their own
> > > > > implementation.
> > > > My understanding is that you want to use flow control admin commands
> > > > in virtio net, without making it depend on virtio pci.
> > > No flow control in vnet.
> > > > This why the callbacks are here. Is that right?
> > > No. callbacks are there so that transport agnostic layer can invoke it,
> > > which is drivers/virtio/virtio.c.
> > > 
> > > And transport specific code stays in transport layer, which is presently
> > > following config_ops design.
> > > 
> > > > That is fair enough, but it looks like every new command then
> > > > needs a lot of boilerplate code with a callback a wrapper and
> > > > a transport implementation.
> > > Not really. I dont see any callbacks or wrapper in current proposed patches.
> > > 
> > > All it has is transport specific implementation of admin commands.
> > > 
> > > > 
> > > > Why not just put all this code in virtio core? It looks like the
> > > > transport just needs to expose an API to find the admin vq.
> > > Can you please be specific of which line in the current code can be moved to
> > > virtio core?
> > > 
> > > When the spec was drafted, _one_ was thinking of admin command transport
> > > over non admin vq also.
> > > 
> > > So current implementation of letting transport decide on how to transport a
> > > command seems right to me.
> > > 
> > > But sure, if you can pin point the lines of code that can be shifted to
> > > generic layer, that would be good.
> > I imagine a get_admin_vq operation in config_ops. The rest of the
> > code seems to be transport independent and could be part of
> > the core. WDYT?
> > 
> IMHV, the code before vp_modern_admin_cmd_exec() can be part of
> drivers/virtio/virtio_admin_cmds.c and admin_cmd_exec() can be part of the
> config ops.
> 
> However such refactor can be differed when it actually becomes boiler plate
> code where there is more than one transport and/or more than one way to send
> admin cmds.

Well, the administration virtqueue section is currently not part of a
transport section in the spec. But if you think that will change, and so
find it cleaner for transports to expose a generic interface for sending
an admin command instead of a VQ, that's fine too. That is still a
far cry from adding all the object management in the transport.


Well we have all the new code you are writing, and hacking around
the fact it's in the wrong module with a level of indirection
seems wrong.
If you need help moving this code let me know, it's not hard.

> Even if its done, it probably will require vfio-virtio-pci to interact with
> generic virtio layer. Not sure added value of that complication to be part
> of this series.
> 
> 
> Dan,
> 
> WDYT?


virtio pci pulls in the core already, and VFIO only uses the SRIOV
group, so it can keep using the existing pci device based interfaces,
if you prefer.

-- 
MST


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory
  2025-09-25  6:13     ` Michael S. Tsirkin
@ 2025-09-25 15:48       ` Dan Jurgens
  2025-09-25 20:10         ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Dan Jurgens @ 2025-09-25 15:48 UTC (permalink / raw)
  To: Michael S. Tsirkin, Xuan Zhuo
  Cc: virtualization, parav, shshitrit, yohadt, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, netdev, jasowang, alex.williamson, pabeni

On 9/25/25 1:13 AM, Michael S. Tsirkin wrote:
> On Thu, Sep 25, 2025 at 11:56:09AM +0800, Xuan Zhuo wrote:
>> On Tue, 23 Sep 2025 09:19:12 -0500, Daniel Jurgens <danielj@nvidia.com> wrote:
>>> The flow filter implementation requires minimal changes to the
>>> existing virtio_net implementation. It's cleaner to separate it into
>>> another file. In order to do so, move virtio_net.c into the new
>>> virtio_net directory, and create a makefile for it. Note the name is
>>> changed to virtio_net_main.c, so the module can retain the name
>>> virtio_net.
>>>
>>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
>>
>> To help this work move forward smoothly, I don't recommend splitting the
>> directory structure within this patchset. Directory reorganization can be a
>> separate effort -- I've previously experimented with this myself. I'd really
>> like to see this work progress smoothly.
>>
>> Thanks.
> 
> Indeed.

It's not a hill I'm willing to die on, but breaking this up into files
makes sense. virtio_main.c is already huge, and this would make it 15%
bigger.

> 
>>
>>> ---
>>>  MAINTAINERS                                               | 2 +-
>>>  drivers/net/Makefile                                      | 2 +-
>>>  drivers/net/virtio_net/Makefile                           | 8 ++++++++
>>>  .../net/{virtio_net.c => virtio_net/virtio_net_main.c}    | 0
>>>  4 files changed, 10 insertions(+), 2 deletions(-)
>>>  create mode 100644 drivers/net/virtio_net/Makefile
>>>  rename drivers/net/{virtio_net.c => virtio_net/virtio_net_main.c} (100%)
>>>
>>> diff --git a/MAINTAINERS b/MAINTAINERS
>>> index a8a770714101..09d26c4225a9 100644
>>> --- a/MAINTAINERS
>>> +++ b/MAINTAINERS
>>> @@ -26685,7 +26685,7 @@ F:	Documentation/devicetree/bindings/virtio/
>>>  F:	Documentation/driver-api/virtio/
>>>  F:	drivers/block/virtio_blk.c
>>>  F:	drivers/crypto/virtio/
>>> -F:	drivers/net/virtio_net.c
>>> +F:	drivers/net/virtio_net/
>>>  F:	drivers/vdpa/
>>>  F:	drivers/virtio/
>>>  F:	include/linux/vdpa.h
>>> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
>>> index 73bc63ecd65f..cf28992658a6 100644
>>> --- a/drivers/net/Makefile
>>> +++ b/drivers/net/Makefile
>>> @@ -33,7 +33,7 @@ obj-$(CONFIG_NET_TEAM) += team/
>>>  obj-$(CONFIG_TUN) += tun.o
>>>  obj-$(CONFIG_TAP) += tap.o
>>>  obj-$(CONFIG_VETH) += veth.o
>>> -obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
>>> +obj-$(CONFIG_VIRTIO_NET) += virtio_net/
>>>  obj-$(CONFIG_VXLAN) += vxlan/
>>>  obj-$(CONFIG_GENEVE) += geneve.o
>>>  obj-$(CONFIG_BAREUDP) += bareudp.o
>>> diff --git a/drivers/net/virtio_net/Makefile b/drivers/net/virtio_net/Makefile
>>> new file mode 100644
>>> index 000000000000..c0a4725ddd69
>>> --- /dev/null
>>> +++ b/drivers/net/virtio_net/Makefile
>>> @@ -0,0 +1,8 @@
>>> +# SPDX-License-Identifier: GPL-2.0-only
>>> +#
>>> +# Makefile for the VirtIO Net driver
>>> +#
>>> +
>>> +obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
>>> +
>>> +virtio_net-objs := virtio_net_main.o
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net/virtio_net_main.c
>>> similarity index 100%
>>> rename from drivers/net/virtio_net.c
>>> rename to drivers/net/virtio_net/virtio_net_main.c
>>> --
>>> 2.45.0
>>>
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-25 13:08                     ` Michael S. Tsirkin
@ 2025-09-25 16:53                       ` Dan Jurgens
  2025-09-25 16:55                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Dan Jurgens @ 2025-09-25 16:53 UTC (permalink / raw)
  To: Michael S. Tsirkin, Parav Pandit
  Cc: Jason Wang, netdev, alex.williamson, pabeni, virtualization,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet, Yishai Hadas

On 9/25/25 8:08 AM, Michael S. Tsirkin wrote:
> On Thu, Sep 25, 2025 at 05:39:54PM +0530, Parav Pandit wrote:
>>
>> On 25-09-2025 05:19 pm, Michael S. Tsirkin wrote:
>>> On Thu, Sep 25, 2025 at 04:15:19PM +0530, Parav Pandit wrote:
>>>> On 25-09-2025 04:05 pm, Michael S. Tsirkin wrote:
>>>>> On Thu, Sep 25, 2025 at 03:21:38PM +0530, Parav Pandit wrote:
>>>>>> Function pointers are there for multiple transports to implement their own
>>>>>> implementation.
>>>>> My understanding is that you want to use flow control admin commands
>>>>> in virtio net, without making it depend on virtio pci.
>>>> No flow control in vnet.
>>>>> This why the callbacks are here. Is that right?
>>>> No. callbacks are there so that transport agnostic layer can invoke it,
>>>> which is drivers/virtio/virtio.c.
>>>>
>>>> And transport specific code stays in transport layer, which is presently
>>>> following config_ops design.
>>>>
>>>>> That is fair enough, but it looks like every new command then
>>>>> needs a lot of boilerplate code with a callback a wrapper and
>>>>> a transport implementation.
>>>> Not really. I dont see any callbacks or wrapper in current proposed patches.
>>>>
>>>> All it has is transport specific implementation of admin commands.
>>>>
>>>>>
>>>>> Why not just put all this code in virtio core? It looks like the
>>>>> transport just needs to expose an API to find the admin vq.
>>>> Can you please be specific of which line in the current code can be moved to
>>>> virtio core?
>>>>
>>>> When the spec was drafted, _one_ was thinking of admin command transport
>>>> over non admin vq also.
>>>>
>>>> So current implementation of letting transport decide on how to transport a
>>>> command seems right to me.
>>>>
>>>> But sure, if you can pin point the lines of code that can be shifted to
>>>> generic layer, that would be good.
>>> I imagine a get_admin_vq operation in config_ops. The rest of the
>>> code seems to be transport independent and could be part of
>>> the core. WDYT?
>>>
>> IMHV, the code before vp_modern_admin_cmd_exec() can be part of
>> drivers/virtio/virtio_admin_cmds.c and admin_cmd_exec() can be part of the
>> config ops.
>>
>> However such refactor can be deferred until it actually becomes boilerplate
>> code where there is more than one transport and/or more than one way to send
>> admin cmds.
> 
> Well administration virtqueue section is currently not a part of a
> transport section in the spec.  But if you think it will change and so
> find it cleaner for transports to expose, instead of a VQ, a generic
> interfaces to send an admin command, that's fine too. That is still a
> far cry from adding all the object management in the transport. 
> 
> 
> Well we have all the new code you are writing, and hacking around
> the fact it's in the wrong module with a level of indirection
> seems wrong.
> If you need help moving this code let me know, it's not hard.
> 
>> Even if its done, it probably will require vfio-virtio-pci to interact with
>> generic virtio layer. Not sure added value of that complication to be part
>> of this series.
>>
>>
>> Dan,
>>
>> WDYT?
> 
> 
> virtio pci pulls in the core already, and VFIO only uses the SRIOV
> group, so it can keep using the existing pci device based interfaces,
> if you prefer.
> 

I can make changes here. I'd appreciate if you review the rest of the
series while I do so. Patches 3+ are isolated from this, so it won't be
a waste of your time.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-25 16:53                       ` Dan Jurgens
@ 2025-09-25 16:55                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 16:55 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: Parav Pandit, Jason Wang, netdev, alex.williamson, pabeni,
	virtualization, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas

On Thu, Sep 25, 2025 at 11:53:50AM -0500, Dan Jurgens wrote:
> On 9/25/25 8:08 AM, Michael S. Tsirkin wrote:
> > On Thu, Sep 25, 2025 at 05:39:54PM +0530, Parav Pandit wrote:
> >>
> >> On 25-09-2025 05:19 pm, Michael S. Tsirkin wrote:
> >>> On Thu, Sep 25, 2025 at 04:15:19PM +0530, Parav Pandit wrote:
> >>>> On 25-09-2025 04:05 pm, Michael S. Tsirkin wrote:
> >>>>> On Thu, Sep 25, 2025 at 03:21:38PM +0530, Parav Pandit wrote:
> >>>>>> Function pointers are there for multiple transports to implement their own
> >>>>>> implementation.
> >>>>> My understanding is that you want to use flow control admin commands
> >>>>> in virtio net, without making it depend on virtio pci.
> >>>> No flow control in vnet.
> >>>>> This why the callbacks are here. Is that right?
> >>>> No. callbacks are there so that transport agnostic layer can invoke it,
> >>>> which is drivers/virtio/virtio.c.
> >>>>
> >>>> And transport specific code stays in transport layer, which is presently
> >>>> following config_ops design.
> >>>>
> >>>>> That is fair enough, but it looks like every new command then
> >>>>> needs a lot of boilerplate code with a callback a wrapper and
> >>>>> a transport implementation.
> >>>> Not really. I dont see any callbacks or wrapper in current proposed patches.
> >>>>
> >>>> All it has is transport specific implementation of admin commands.
> >>>>
> >>>>>
> >>>>> Why not just put all this code in virtio core? It looks like the
> >>>>> transport just needs to expose an API to find the admin vq.
> >>>> Can you please be specific of which line in the current code can be moved to
> >>>> virtio core?
> >>>>
> >>>> When the spec was drafted, _one_ was thinking of admin command transport
> >>>> over non admin vq also.
> >>>>
> >>>> So current implementation of letting transport decide on how to transport a
> >>>> command seems right to me.
> >>>>
> >>>> But sure, if you can pin point the lines of code that can be shifted to
> >>>> generic layer, that would be good.
> >>> I imagine a get_admin_vq operation in config_ops. The rest of the
> >>> code seems to be transport independent and could be part of
> >>> the core. WDYT?
> >>>
> >> IMHV, the code before vp_modern_admin_cmd_exec() can be part of
> >> drivers/virtio/virtio_admin_cmds.c and admin_cmd_exec() can be part of the
> >> config ops.
> >>
> >> However such refactor can be deferred until it actually becomes boilerplate
> >> code where there is more than one transport and/or more than one way to send
> >> admin cmds.
> > 
> > Well administration virtqueue section is currently not a part of a
> > transport section in the spec.  But if you think it will change and so
> > find it cleaner for transports to expose, instead of a VQ, a generic
> > interfaces to send an admin command, that's fine too. That is still a
> > far cry from adding all the object management in the transport. 
> > 
> > 
> > Well we have all the new code you are writing, and hacking around
> > the fact it's in the wrong module with a level of indirection
> > seems wrong.
> > If you need help moving this code let me know, it's not hard.
> > 
> >> Even if its done, it probably will require vfio-virtio-pci to interact with
> >> generic virtio layer. Not sure added value of that complication to be part
> >> of this series.
> >>
> >>
> >> Dan,
> >>
> >> WDYT?
> > 
> > 
> > virtio pci pulls in the core already, and VFIO only uses the SRIOV
> > group, so it can keep using the existing pci device based interfaces,
> > if you prefer.
> > 
> 
> I can make changes here. I'd appreciate if you review the rest of the
> series while I do so. Patches 3+ are isolated from this, so it won't be
> a waste of your time.

OK - will review 3+, thanks!

-- 
MST


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory
  2025-09-25 15:48       ` Dan Jurgens
@ 2025-09-25 20:10         ` Michael S. Tsirkin
  0 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 20:10 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: Xuan Zhuo, virtualization, parav, shshitrit, yohadt, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, netdev, jasowang, alex.williamson, pabeni

On Thu, Sep 25, 2025 at 10:48:24AM -0500, Dan Jurgens wrote:
> On 9/25/25 1:13 AM, Michael S. Tsirkin wrote:
> > On Thu, Sep 25, 2025 at 11:56:09AM +0800, Xuan Zhuo wrote:
> >> On Tue, 23 Sep 2025 09:19:12 -0500, Daniel Jurgens <danielj@nvidia.com> wrote:
> >>> The flow filter implementation requires minimal changes to the
> >>> existing virtio_net implementation. It's cleaner to separate it into
> >>> another file. In order to do so, move virtio_net.c into the new
> >>> virtio_net directory, and create a makefile for it. Note the name is
> >>> changed to virtio_net_main.c, so the module can retain the name
> >>> virtio_net.
> >>>
> >>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> >>> Reviewed-by: Parav Pandit <parav@nvidia.com>
> >>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> >>
> >> To help this work move forward smoothly, I don't recommend splitting the
> >> directory structure within this patchset. Directory reorganization can be a
> >> separate effort -- I've previously experimented with this myself. I'd really
> >> like to see this work progress smoothly.
> >>
> >> Thanks.
> > 
> > Indeed.
> 
> It's not a hill I'm willing to die on, but breaking this up into files
> makes sense. virtio_main.c is already huge, and this would make it 15%
> bigger.


Oh I agree - I think it's the largest single file driver now -
but let's do this on top pls.

And when we do it I'd like to see us split it up to some
logical chunks with as small interaction between them
as possible, and I want to see how they look.

And I think "ethtool support" would be a reasonable chunk,
for example, and this might be better as a part of it.


> > 
> >>
> >>> ---
> >>>  MAINTAINERS                                               | 2 +-
> >>>  drivers/net/Makefile                                      | 2 +-
> >>>  drivers/net/virtio_net/Makefile                           | 8 ++++++++
> >>>  .../net/{virtio_net.c => virtio_net/virtio_net_main.c}    | 0
> >>>  4 files changed, 10 insertions(+), 2 deletions(-)
> >>>  create mode 100644 drivers/net/virtio_net/Makefile
> >>>  rename drivers/net/{virtio_net.c => virtio_net/virtio_net_main.c} (100%)
> >>>
> >>> diff --git a/MAINTAINERS b/MAINTAINERS
> >>> index a8a770714101..09d26c4225a9 100644
> >>> --- a/MAINTAINERS
> >>> +++ b/MAINTAINERS
> >>> @@ -26685,7 +26685,7 @@ F:	Documentation/devicetree/bindings/virtio/
> >>>  F:	Documentation/driver-api/virtio/
> >>>  F:	drivers/block/virtio_blk.c
> >>>  F:	drivers/crypto/virtio/
> >>> -F:	drivers/net/virtio_net.c
> >>> +F:	drivers/net/virtio_net/
> >>>  F:	drivers/vdpa/
> >>>  F:	drivers/virtio/
> >>>  F:	include/linux/vdpa.h
> >>> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> >>> index 73bc63ecd65f..cf28992658a6 100644
> >>> --- a/drivers/net/Makefile
> >>> +++ b/drivers/net/Makefile
> >>> @@ -33,7 +33,7 @@ obj-$(CONFIG_NET_TEAM) += team/
> >>>  obj-$(CONFIG_TUN) += tun.o
> >>>  obj-$(CONFIG_TAP) += tap.o
> >>>  obj-$(CONFIG_VETH) += veth.o
> >>> -obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> >>> +obj-$(CONFIG_VIRTIO_NET) += virtio_net/
> >>>  obj-$(CONFIG_VXLAN) += vxlan/
> >>>  obj-$(CONFIG_GENEVE) += geneve.o
> >>>  obj-$(CONFIG_BAREUDP) += bareudp.o
> >>> diff --git a/drivers/net/virtio_net/Makefile b/drivers/net/virtio_net/Makefile
> >>> new file mode 100644
> >>> index 000000000000..c0a4725ddd69
> >>> --- /dev/null
> >>> +++ b/drivers/net/virtio_net/Makefile
> >>> @@ -0,0 +1,8 @@
> >>> +# SPDX-License-Identifier: GPL-2.0-only
> >>> +#
> >>> +# Makefile for the VirtIO Net driver
> >>> +#
> >>> +
> >>> +obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> >>> +
> >>> +virtio_net-objs := virtio_net_main.o
> >>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net/virtio_net_main.c
> >>> similarity index 100%
> >>> rename from drivers/net/virtio_net.c
> >>> rename to drivers/net/virtio_net/virtio_net_main.c
> >>> --
> >>> 2.45.0
> >>>
> > 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 11/11] virtio_net: Add get ethtool flow rules ops
  2025-09-23 14:19 ` [PATCH net-next v3 11/11] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
@ 2025-09-25 20:44   ` Michael S. Tsirkin
  2025-09-28  4:39     ` Dan Jurgens
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 20:44 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:20AM -0500, Daniel Jurgens wrote:
> - Get total number of rules. There's no user interface for this. It is
>   used to allocate an appropriately sized buffer for getting all the
>   rules.
> 
> - Get specific rule
> $ ethtool -u ens9 rule 0
> 	Filter: 0
> 		Rule Type: UDP over IPv4
> 		Src IP addr: 0.0.0.0 mask: 255.255.255.255
> 		Dest IP addr: 192.168.5.2 mask: 0.0.0.0
> 		TOS: 0x0 mask: 0xff
> 		Src port: 0 mask: 0xffff
> 		Dest port: 4321 mask: 0x0
> 		Action: Direct to queue 16
> 
> - Get all rules:
> $ ethtool -u ens9
> 31 RX rings available
> Total 2 rules
> 
> Filter: 0
>         Rule Type: UDP over IPv4
>         Src IP addr: 0.0.0.0 mask: 255.255.255.255
>         Dest IP addr: 192.168.5.2 mask: 0.0.0.0
> ...
> 
> Filter: 1
>         Flow Type: Raw Ethernet
>         Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
>         Dest MAC addr: 08:11:22:33:44:54 mask: 00:00:00:00:00:00
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> ---
>  drivers/net/virtio_net/virtio_net_ff.c   | 48 ++++++++++++++++++++++++
>  drivers/net/virtio_net/virtio_net_ff.h   |  6 +++
>  drivers/net/virtio_net/virtio_net_main.c | 23 ++++++++++++
>  3 files changed, 77 insertions(+)
> 
> diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
> index d4a34958cc42..5488300a4fc3 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.c
> +++ b/drivers/net/virtio_net/virtio_net_ff.c
> @@ -809,6 +809,54 @@ int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
>  	return err;
>  }
>  
> +int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
> +				   struct ethtool_rxnfc *info)
> +{
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	info->rule_cnt = ff->ethtool.num_rules;
> +	info->data = le32_to_cpu(ff->ff_caps->rules_limit) | RX_CLS_LOC_SPECIAL;

hmm. what if rules_limit has the high bit set?
or matches any of
#define RX_CLS_LOC_ANY          0xffffffff
#define RX_CLS_LOC_FIRST        0xfffffffe
#define RX_CLS_LOC_LAST         0xfffffffd
by chance?
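
To make the question concrete, here is a minimal userspace sketch of
the bit arithmetic. It assumes only the RX_CLS_LOC_* values from
include/uapi/linux/ethtool.h; the helper names reported_data() and
rules_limit_is_safe() are hypothetical and not part of the patch.
Whether the driver should clamp or reject such a limit is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in include/uapi/linux/ethtool.h. */
#define RX_CLS_LOC_SPECIAL	0x80000000u
#define RX_CLS_LOC_ANY		0xffffffffu
#define RX_CLS_LOC_FIRST	0xfffffffeu
#define RX_CLS_LOC_LAST		0xfffffffdu

/* Hypothetical: the value the posted code would store in info->data. */
static uint32_t reported_data(uint32_t rules_limit)
{
	return rules_limit | RX_CLS_LOC_SPECIAL;
}

/*
 * Hypothetical check: the limit is unambiguous only if it does not
 * already carry the flag bit and the OR-ed result stays below the
 * three reserved location values.
 */
static bool rules_limit_is_safe(uint32_t rules_limit)
{
	if (rules_limit & RX_CLS_LOC_SPECIAL)
		return false;

	return reported_data(rules_limit) < RX_CLS_LOC_LAST;
}
```

Note that a limit such as 0x7ffffffd has the high bit clear yet still
becomes RX_CLS_LOC_LAST once the flag is OR-ed in, so testing the high
bit alone would not cover every case raised above.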


> +
> +	return 0;
> +}
> +
> +int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
> +			     struct ethtool_rxnfc *info)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	eth_rule = xa_load(&ff->ethtool.rules, info->fs.location);
> +	if (!eth_rule)
> +		return -ENOENT;
> +
> +	info->fs = eth_rule->flow_spec;
> +
> +	return 0;
> +}
> +
> +int
> +virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
> +			      struct ethtool_rxnfc *info, u32 *rule_locs)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +	unsigned long i = 0;
> +	int idx = 0;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	xa_for_each(&ff->ethtool.rules, i, eth_rule)
> +		rule_locs[idx++] = i;
> +
> +	info->data = le32_to_cpu(ff->ff_caps->rules_limit);

same question

> +
> +	return 0;
> +}
> +
>  static size_t get_mask_size(u16 type)
>  {
>  	switch (type) {
> diff --git a/drivers/net/virtio_net/virtio_net_ff.h b/drivers/net/virtio_net/virtio_net_ff.h
> index 94b575fbd9ed..4bb41e64cc59 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.h
> +++ b/drivers/net/virtio_net/virtio_net_ff.h
> @@ -28,6 +28,12 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev);
>  
>  void virtnet_ff_cleanup(struct virtnet_ff *ff);
>  
> +int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
> +				   struct ethtool_rxnfc *info);
> +int virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
> +				  struct ethtool_rxnfc *info, u32 *rule_locs);
> +int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
> +			     struct ethtool_rxnfc *info);
>  int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
>  				struct ethtool_rx_flow_spec *fs,
>  				u16 curr_queue_pairs);
> diff --git a/drivers/net/virtio_net/virtio_net_main.c b/drivers/net/virtio_net/virtio_net_main.c
> index 808988cdf265..e8336925c912 100644
> --- a/drivers/net/virtio_net/virtio_net_main.c
> +++ b/drivers/net/virtio_net/virtio_net_main.c
> @@ -5619,6 +5619,28 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
>  	return vi->curr_queue_pairs;
>  }
>  
> +static int virtnet_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info, u32 *rule_locs)
> +{
> +	struct virtnet_info *vi = netdev_priv(dev);
> +	int rc = 0;
> +
> +	switch (info->cmd) {
> +	case ETHTOOL_GRXCLSRLCNT:
> +		rc = virtnet_ethtool_get_flow_count(&vi->ff, info);
> +		break;
> +	case ETHTOOL_GRXCLSRULE:
> +		rc = virtnet_ethtool_get_flow(&vi->ff, info);
> +		break;
> +	case ETHTOOL_GRXCLSRLALL:
> +		rc = virtnet_ethtool_get_all_flows(&vi->ff, info, rule_locs);
> +		break;
> +	default:
> +		rc = -EOPNOTSUPP;
> +	}
> +
> +	return rc;
> +}
> +
>  static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
>  {
>  	struct virtnet_info *vi = netdev_priv(dev);
> @@ -5660,6 +5682,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
>  	.get_rxfh_fields = virtnet_get_hashflow,
>  	.set_rxfh_fields = virtnet_set_hashflow,
>  	.get_rx_ring_count = virtnet_get_rx_ring_count,
> +	.get_rxnfc = virtnet_get_rxnfc,
>  	.set_rxnfc = virtnet_set_rxnfc,
>  };
>  
> -- 
> 2.45.0


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 09/11] virtio_net: Add support for IPv6 ethtool steering
  2025-09-23 14:19 ` [PATCH net-next v3 09/11] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
@ 2025-09-25 20:47   ` Michael S. Tsirkin
  0 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 20:47 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:18AM -0500, Daniel Jurgens wrote:
> Implement support for IPV6_USER_FLOW type rules.
> 
> Example:
> $ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
> Added rule with ID 0
> 
> The example rule will forward packets with the specified soure and


source


> destination IP addresses to RX ring 3.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> ---
>  drivers/net/virtio_net/virtio_net_ff.c | 89 +++++++++++++++++++++++---
>  1 file changed, 81 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
> index 0374676d1342..ce59fb36dae9 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.c
> +++ b/drivers/net/virtio_net/virtio_net_ff.c
> @@ -118,6 +118,34 @@ static bool validate_ip4_mask(const struct virtnet_ff *ff,
>  	return true;
>  }
>  
> +static bool validate_ip6_mask(const struct virtnet_ff *ff,
> +			      const struct virtio_net_ff_selector *sel,
> +			      const struct virtio_net_ff_selector *sel_cap)
> +{
> +	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> +	struct ipv6hdr *cap, *mask;
> +
> +	cap = (struct ipv6hdr *)&sel_cap->mask;
> +	mask = (struct ipv6hdr *)&sel->mask;
> +
> +	if (!ipv6_addr_any(&mask->saddr) &&
> +	    !check_mask_vs_cap(&mask->saddr, &cap->saddr,
> +			       sizeof(cap->saddr), partial_mask))
> +		return false;
> +
> +	if (!ipv6_addr_any(&mask->daddr) &&
> +	    !check_mask_vs_cap(&mask->daddr, &cap->daddr,
> +			       sizeof(cap->daddr), partial_mask))
> +		return false;
> +
> +	if (mask->nexthdr &&
> +	    !check_mask_vs_cap(&mask->nexthdr, &cap->nexthdr,
> +	    sizeof(cap->nexthdr), partial_mask))
> +		return false;
> +
> +	return true;
> +}
> +
>  static bool validate_mask(const struct virtnet_ff *ff,
>  			  const struct virtio_net_ff_selector *sel)
>  {
> @@ -132,6 +160,9 @@ static bool validate_mask(const struct virtnet_ff *ff,
>  
>  	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
>  		return validate_ip4_mask(ff, sel, sel_cap);
> +
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
> +		return validate_ip6_mask(ff, sel, sel_cap);
>  	}
>  
>  	return false;
> @@ -154,11 +185,38 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>  	}
>  }
>  
> +static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
> +		      const struct ethtool_rx_flow_spec *fs)
> +{
> +	const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
> +	const struct ethtool_usrip6_spec *l3_val  = &fs->h_u.usr_ip6_spec;
> +
> +	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
> +		memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
> +		memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
> +	}
> +
> +	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
> +		memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
> +		memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
> +	}
> +
> +	if (l3_mask->l4_proto) {
> +		mask->nexthdr = l3_mask->l4_proto;
> +		key->nexthdr = l3_val->l4_proto;
> +	}
> +}
> +
>  static bool has_ipv4(u32 flow_type)
>  {
>  	return flow_type == IP_USER_FLOW;
>  }
>  
> +static bool has_ipv6(u32 flow_type)
> +{
> +	return flow_type == IPV6_USER_FLOW;
> +}
> +
>  static int setup_classifier(struct virtnet_ff *ff,
>  			    struct virtnet_classifier **c)
>  {
> @@ -291,6 +349,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
>  	switch (fs->flow_type) {
>  	case ETHER_FLOW:
>  	case IP_USER_FLOW:
> +	case IPV6_USER_FLOW:
>  		return true;
>  	}
>  
> @@ -332,7 +391,8 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
>  	(*num_hdrs)++;
>  	if (has_ipv4(fs->flow_type))
>  		size += sizeof(struct iphdr);
> -
> +	else if (has_ipv6(fs->flow_type))
> +		size += sizeof(struct ipv6hdr);
>  done:
>  	*key_size = size;
>  	/*
> @@ -369,18 +429,31 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
>  			     u8 *key,
>  			     const struct ethtool_rx_flow_spec *fs)
>  {
> +	struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
>  	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
> +	struct ipv6hdr *v6_k = (struct ipv6hdr *)key;
>  	struct iphdr *v4_k = (struct iphdr *)key;
>  
> -	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> -	selector->length = sizeof(struct iphdr);
> +	if (has_ipv6(fs->flow_type)) {
> +		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
> +		selector->length = sizeof(struct ipv6hdr);
>  
> -	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> -	    fs->h_u.usr_ip4_spec.tos ||
> -	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> -		return -EOPNOTSUPP;
> +		if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
> +		    fs->h_u.usr_ip6_spec.tclass)
> +			return -EOPNOTSUPP;
>  
> -	parse_ip4(v4_m, v4_k, fs);
> +		parse_ip6(v6_m, v6_k, fs);
> +	} else {
> +		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> +		selector->length = sizeof(struct iphdr);
> +
> +		if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> +		    fs->h_u.usr_ip4_spec.tos ||
> +		    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> +			return -EOPNOTSUPP;
> +
> +		parse_ip4(v4_m, v4_k, fs);
> +	}
>  
>  	return 0;
>  }
> -- 
> 2.45.0


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 08/11] virtio_net: Implement IPv4 ethtool flow rules
  2025-09-23 14:19 ` [PATCH net-next v3 08/11] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
@ 2025-09-25 20:53   ` Michael S. Tsirkin
  2025-09-25 21:13     ` Dan Jurgens
  2025-10-01 14:15     ` Dan Jurgens
  0 siblings, 2 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 20:53 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:17AM -0500, Daniel Jurgens wrote:
> Add support for IP_USER type rules from ethtool.
> 
> Example:
> $ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action -1
> Added rule with ID 1
> 
> The example rule will drop packets with the source IP specified.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> ---
>  drivers/net/virtio_net/virtio_net_ff.c | 127 +++++++++++++++++++++++--
>  1 file changed, 119 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
> index 30c5ded57ab5..0374676d1342 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.c
> +++ b/drivers/net/virtio_net/virtio_net_ff.c
> @@ -90,6 +90,34 @@ static bool validate_eth_mask(const struct virtnet_ff *ff,
>  	return true;
>  }
>  
> +static bool validate_ip4_mask(const struct virtnet_ff *ff,
> +			      const struct virtio_net_ff_selector *sel,
> +			      const struct virtio_net_ff_selector *sel_cap)

I'd prefer that all functions have virtnet prefix,
avoid polluting the global namespace.


> +{
> +	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> +	struct iphdr *cap, *mask;
> +
> +	cap = (struct iphdr *)&sel_cap->mask;
> +	mask = (struct iphdr *)&sel->mask;
> +
> +	if (mask->saddr &&
> +	    !check_mask_vs_cap(&mask->saddr, &cap->saddr,
> +	    sizeof(__be32), partial_mask))


pls align continuation to the right of (.

> +		return false;
> +
> +	if (mask->daddr &&
> +	    !check_mask_vs_cap(&mask->daddr, &cap->daddr,
> +	    sizeof(__be32), partial_mask))


and here

> +		return false;
> +
> +	if (mask->protocol &&
> +	    !check_mask_vs_cap(&mask->protocol, &cap->protocol,
> +	    sizeof(u8), partial_mask))


and here


> +		return false;
> +
> +	return true;
> +}
> +
>  static bool validate_mask(const struct virtnet_ff *ff,
>  			  const struct virtio_net_ff_selector *sel)
>  {
> @@ -101,11 +129,36 @@ static bool validate_mask(const struct virtnet_ff *ff,
>  	switch (sel->type) {
>  	case VIRTIO_NET_FF_MASK_TYPE_ETH:
>  		return validate_eth_mask(ff, sel, sel_cap);
> +
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
> +		return validate_ip4_mask(ff, sel, sel_cap);
>  	}
>  
>  	return false;
>  }
>  
> +static void parse_ip4(struct iphdr *mask, struct iphdr *key,
> +		      const struct ethtool_rx_flow_spec *fs)
> +{
> +	const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
> +	const struct ethtool_usrip4_spec *l3_val  = &fs->h_u.usr_ip4_spec;
> +
> +	mask->saddr = l3_mask->ip4src;
> +	mask->daddr = l3_mask->ip4dst;
> +	key->saddr = l3_val->ip4src;
> +	key->daddr = l3_val->ip4dst;
> +
> +	if (mask->protocol) {
> +		mask->protocol = l3_mask->proto;

Is this right? You just checked mask->protocol and are
now overriding it?
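
For illustration, a userspace sketch of what the check presumably
intends: test the user-supplied mask (l3_mask->proto) rather than the
output field, which would still be zero if the selector buffer was
zero-initialized (an assumption here). struct hdr4 and struct usrip4
are simplified, hypothetical stand-ins for struct iphdr and
struct ethtool_usrip4_spec, and parse_ip4_sketch() is not the patch's
function.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct iphdr: only the fields used here. */
struct hdr4 {
	uint32_t saddr;
	uint32_t daddr;
	uint8_t protocol;
};

/* Simplified stand-in for struct ethtool_usrip4_spec. */
struct usrip4 {
	uint32_t ip4src;
	uint32_t ip4dst;
	uint8_t proto;
};

static void parse_ip4_sketch(struct hdr4 *mask, struct hdr4 *key,
			     const struct usrip4 *l3_mask,
			     const struct usrip4 *l3_val)
{
	mask->saddr = l3_mask->ip4src;
	mask->daddr = l3_mask->ip4dst;
	key->saddr = l3_val->ip4src;
	key->daddr = l3_val->ip4dst;

	/* Branch on the input mask, not on the field being written. */
	if (l3_mask->proto) {
		mask->protocol = l3_mask->proto;
		key->protocol = l3_val->proto;
	}
}
```

With the condition as posted (mask->protocol), a zeroed output mask
would mean the protocol match is never copied, regardless of what the
user requested.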


> +		key->protocol = l3_val->proto;
> +	}
> +}
> +
> +static bool has_ipv4(u32 flow_type)
> +{
> +	return flow_type == IP_USER_FLOW;
> +}
> +
>  static int setup_classifier(struct virtnet_ff *ff,
>  			    struct virtnet_classifier **c)
>  {
> @@ -237,6 +290,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
>  {
>  	switch (fs->flow_type) {
>  	case ETHER_FLOW:
> +	case IP_USER_FLOW:
>  		return true;
>  	}
>  
> @@ -260,16 +314,27 @@ static int validate_flow_input(struct virtnet_ff *ff,
>  
>  	if (!supported_flow_type(fs))
>  		return -EOPNOTSUPP;
> -
>  	return 0;
>  }
>  
>  static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
> -				 size_t *key_size, size_t *classifier_size,
> -				 int *num_hdrs)
> +				size_t *key_size, size_t *classifier_size,
> +				int *num_hdrs)
>  {
> +	size_t size = sizeof(struct ethhdr);
> +
>  	*num_hdrs = 1;
>  	*key_size = sizeof(struct ethhdr);
> +
> +	if (fs->flow_type == ETHER_FLOW)
> +		goto done;
> +
> +	(*num_hdrs)++;

I generally prefer ++(*num_hdrs) in such cases. Why return the old value if
we discard it anyway?

> +	if (has_ipv4(fs->flow_type))
> +		size += sizeof(struct iphdr);
> +
> +done:
> +	*key_size = size;
>  	/*
>  	 * The classifier size is the size of the classifier header, a selector
>  	 * header for each type of header in the match criteria, and each header
> @@ -281,8 +346,9 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
>  }
>  
>  static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
> -				   u8 *key,
> -				   const struct ethtool_rx_flow_spec *fs)
> +				  u8 *key,
> +				  const struct ethtool_rx_flow_spec *fs,
> +				  int num_hdrs)
>  {
>  	struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
>  	struct ethhdr *eth_k = (struct ethhdr *)key;
> @@ -290,8 +356,33 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
>  	selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
>  	selector->length = sizeof(struct ethhdr);
>  
> -	memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
> -	memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
> +	if (num_hdrs > 1) {
> +		eth_m->h_proto = cpu_to_be16(0xffff);
> +		eth_k->h_proto = cpu_to_be16(ETH_P_IP);
> +	} else {
> +		memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
> +		memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
> +	}
> +}
> +
> +static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
> +			     u8 *key,
> +			     const struct ethtool_rx_flow_spec *fs)
> +{
> +	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
> +	struct iphdr *v4_k = (struct iphdr *)key;
> +
> +	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> +	selector->length = sizeof(struct iphdr);
> +
> +	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> +	    fs->h_u.usr_ip4_spec.tos ||
> +	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> +		return -EOPNOTSUPP;
> +
> +	parse_ip4(v4_m, v4_k, fs);
> +
> +	return 0;
>  }
>  
>  static int
> @@ -312,6 +403,17 @@ validate_classifier_selectors(struct virtnet_ff *ff,
>  	return 0;
>  }
>  
> +static
> +struct virtio_net_ff_selector *next_selector(struct virtio_net_ff_selector *sel)
> +{
> +	void *nextsel;
> +
> +	nextsel = (u8 *)sel + sizeof(struct virtio_net_ff_selector) +
> +		  sel->length;

You do not need this variable, and a cast to void * looks cleaner imho.
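
A sketch of what that could look like, assuming a selector is a fixed header followed by length bytes of mask. struct ff_selector is a hypothetical stand-in for struct virtio_net_ff_selector so the example builds outside the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct virtio_net_ff_selector. */
struct ff_selector {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;		/* bytes of mask[] that follow the header */
	uint8_t reserved1[3];
	uint8_t mask[];
};

/*
 * Step over the header and its mask to reach the next selector.
 * Arithmetic on void * is a GNU C extension, which kernel code relies on.
 */
static struct ff_selector *next_selector(struct ff_selector *sel)
{
	return (void *)sel + sizeof(*sel) + sel->length;
}
```

The void * result converts to the return type implicitly, so neither the temporary nor a second cast is needed.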

> +
> +	return nextsel;
> +}
> +
>  static int build_and_insert(struct virtnet_ff *ff,
>  			    struct virtnet_ethtool_rule *eth_rule)
>  {
> @@ -349,8 +451,17 @@ static int build_and_insert(struct virtnet_ff *ff,
>  	classifier->count = num_hdrs;
>  	selector = &classifier->selectors[0];
>  
> -	setup_eth_hdr_key_mask(selector, key, fs);
> +	setup_eth_hdr_key_mask(selector, key, fs, num_hdrs);
> +	if (num_hdrs == 1)
> +		goto validate;
> +
> +	selector = next_selector(selector);
> +
> +	err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
> +	if (err)
> +		goto err_classifier;
>  
> +validate:
>  	err = validate_classifier_selectors(ff, classifier, num_hdrs);
>  	if (err)
>  		goto err_key;
> -- 
> 2.45.0



* Re: [PATCH net-next v3 07/11] virtio_net: Use existing classifier if possible
  2025-09-23 14:19 ` [PATCH net-next v3 07/11] virtio_net: Use existing classifier if possible Daniel Jurgens
@ 2025-09-25 20:53   ` Michael S. Tsirkin
  0 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 20:53 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:16AM -0500, Daniel Jurgens wrote:
> Classifiers can be used by more than one rule. If there is an exisitng

existing

> classifier, use it instead of creating a new one.
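
As a rough illustration of the reuse scheme described above, here is a userspace sketch. A fixed array stands in for the xarray and a plain counter stands in for refcount_t; classifier_new(), classifier_get() and classifier_put() are hypothetical names, and nothing here talks to a device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CLASSIFIERS 8

struct classifier {
	size_t size;		/* number of valid bytes in data[] */
	unsigned int refcount;	/* the kernel code uses refcount_t */
	unsigned char data[32];
};

static struct classifier *table[MAX_CLASSIFIERS];

static struct classifier *classifier_new(const void *data, size_t size)
{
	struct classifier *c = calloc(1, sizeof(*c));

	if (!c || size > sizeof(c->data)) {
		free(c);
		return NULL;
	}
	c->size = size;
	memcpy(c->data, data, size);
	return c;
}

/*
 * Return a shared entry matching c. If an identical classifier is already
 * stored, c is freed and the stored one gains a reference. Returns NULL,
 * leaving c with the caller, when the table is full.
 */
static struct classifier *classifier_get(struct classifier *c)
{
	int i, free_slot = -1;

	for (i = 0; i < MAX_CLASSIFIERS; i++) {
		struct classifier *tmp = table[i];

		if (!tmp) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (tmp->size == c->size &&
		    !memcmp(tmp->data, c->data, tmp->size)) {
			tmp->refcount++;
			free(c);
			return tmp;
		}
	}

	if (free_slot < 0)
		return NULL;

	c->refcount = 1;
	table[free_slot] = c;
	return c;
}

/* Drop one reference and release the entry when the last user is gone. */
static void classifier_put(struct classifier *c)
{
	int i;

	if (--c->refcount)
		return;

	for (i = 0; i < MAX_CLASSIFIERS; i++)
		if (table[i] == c)
			table[i] = NULL;
	free(c);
}
```

Two rules built from the same bytes end up holding the same pointer with a count of two, and the entry is only released once both have called classifier_put().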
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> ---
>  drivers/net/virtio_net/virtio_net_ff.c | 39 ++++++++++++++++++--------
>  1 file changed, 27 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
> index e3c34bfd1d55..30c5ded57ab5 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.c
> +++ b/drivers/net/virtio_net/virtio_net_ff.c
> @@ -17,6 +17,7 @@ struct virtnet_ethtool_rule {
>  /* New fields must be added before the classifier struct */
>  struct virtnet_classifier {
>  	size_t size;
> +	refcount_t refcount;
>  	u32 id;
>  	struct virtio_net_resource_obj_ff_classifier classifier;
>  };
> @@ -105,11 +106,24 @@ static bool validate_mask(const struct virtnet_ff *ff,
>  	return false;
>  }
>  
> -static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> +static int setup_classifier(struct virtnet_ff *ff,
> +			    struct virtnet_classifier **c)
>  {
> +	struct virtnet_classifier *tmp;
> +	unsigned long i;
>  	int err;
>  
> -	err = xa_alloc(&ff->classifiers, &c->id, c,
> +	xa_for_each(&ff->classifiers, i, tmp) {
> +		if ((*c)->size == tmp->size &&
> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
> +			refcount_inc(&tmp->refcount);
> +			kfree(*c);
> +			*c = tmp;
> +			goto out;
> +		}
> +	}
> +
> +	err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
>  		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
>  		       GFP_KERNEL);
>  	if (err)
> @@ -117,27 +131,28 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
>  
>  	err = virtio_device_object_create(ff->vdev,
>  					  VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> -					  c->id,
> -					  &c->classifier,
> -					  c->size);
> +					  (*c)->id,
> +					  &(*c)->classifier,
> +					  (*c)->size);
>  	if (err)
>  		goto err_xarray;
>  
> +	refcount_set(&(*c)->refcount, 1);
> +out:
>  	return 0;
>  
>  err_xarray:
> -	xa_erase(&ff->classifiers, c->id);
> +	xa_erase(&ff->classifiers, (*c)->id);
>  
>  	return err;
>  }
>  
> -static void destroy_classifier(struct virtnet_ff *ff,
> -			       u32 classifier_id)
> +static void try_destroy_classifier(struct virtnet_ff *ff, u32 classifier_id)
>  {
>  	struct virtnet_classifier *c;
>  
>  	c = xa_load(&ff->classifiers, classifier_id);
> -	if (c) {
> +	if (c && refcount_dec_and_test(&c->refcount)) {
>  		virtio_device_object_destroy(ff->vdev,
>  					     VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
>  					     c->id);
> @@ -157,7 +172,7 @@ static void destroy_ethtool_rule(struct virtnet_ff *ff,
>  				     eth_rule->flow_spec.location);
>  
>  	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> -	destroy_classifier(ff, eth_rule->classifier_id);
> +	try_destroy_classifier(ff, eth_rule->classifier_id);
>  	kfree(eth_rule);
>  }
>  
> @@ -340,13 +355,13 @@ static int build_and_insert(struct virtnet_ff *ff,
>  	if (err)
>  		goto err_key;
>  
> -	err = setup_classifier(ff, c);
> +	err = setup_classifier(ff, &c);
>  	if (err)
>  		goto err_classifier;
>  
>  	err = insert_rule(ff, eth_rule, c->id, key, key_size);
>  	if (err) {
> -		destroy_classifier(ff, c->id);
> +		try_destroy_classifier(ff, c->id);
>  		goto err_key;
>  	}
>  
> -- 
> 2.45.0



* Re: [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules
  2025-09-23 14:19 ` [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
@ 2025-09-25 20:58   ` Michael S. Tsirkin
  2025-09-27  4:45     ` Dan Jurgens
  2025-09-25 21:10   ` Michael S. Tsirkin
  2025-09-26 20:48   ` Jakub Kicinski
  2 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 20:58 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:15AM -0500, Daniel Jurgens wrote:
> Filtering a flow requires a classifier to match the packets, and a rule
> to filter on the matches.
> 
> A classifier consists of one or more selectors. There is one selector
> per header type. A selector must only use fields set in the selector
> capabality.


capability

> If partial matching is supported, the classifier mask for a
> particular field can be a subset of the mask for that field in the
> capability.
> 
> The rule consists of a priority, an action and a key. The key is a byte
> array containing headers corresponding to the selectors in the
> classifier.
> 
> This patch implements ethtool rules for ethernet headers.
> 
> Example:
> $ ethtool -U ens9 flow-type ether dst 08:11:22:33:44:54 action 30
> Added rule with ID 1
> 
> The rule in the example directs received packets with the specified
> destination MAC address to rq 30.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>


Overall, pls use void * rather than u8 *; then you will not need so
many casts, just assignments.

> ---
>  drivers/net/virtio_net/virtio_net_ff.c   | 423 +++++++++++++++++++++++
>  drivers/net/virtio_net/virtio_net_ff.h   |  14 +
>  drivers/net/virtio_net/virtio_net_main.c |  16 +
>  include/uapi/linux/virtio_net_ff.h       |  20 ++
>  4 files changed, 473 insertions(+)
> 
> diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
> index 0036c2db9f77..e3c34bfd1d55 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.c
> +++ b/drivers/net/virtio_net/virtio_net_ff.c
> @@ -9,6 +9,418 @@
>  #define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
>  #define VIRTNET_FF_MAX_GROUPS 1
>  
> +struct virtnet_ethtool_rule {
> +	struct ethtool_rx_flow_spec flow_spec;
> +	u32 classifier_id;
> +};
> +
> +/* New fields must be added before the classifier struct */

meaning classifier must be last? then pls say so.
maybe add BUILD_BUG_ON to test this, too.
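
A sketch of such a check in standalone C11, using static_assert where kernel code would use BUILD_BUG_ON() and offsetofend(). struct ff_classifier_hdr, struct classifier_wrapper and the OFFSETOFEND macro are hypothetical stand-ins, and the trailing flexible array is left out to keep the example portable. Note the limits of this approach: it only proves that nothing larger than tail padding follows the member, so a small field that fits into the padding would go unnoticed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct virtio_net_resource_obj_ff_classifier. */
struct ff_classifier_hdr {
	uint8_t count;
	uint8_t reserved[7];
};

/* Hypothetical stand-in for struct virtnet_classifier. */
struct classifier_wrapper {
	size_t size;
	uint32_t id;
	struct ff_classifier_hdr classifier;	/* must stay last */
};

/* Offset of the first byte past a member, like the kernel's offsetofend(). */
#define OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Only tail padding may follow the classifier member. */
static_assert(sizeof(struct classifier_wrapper) -
	      OFFSETOFEND(struct classifier_wrapper, classifier) <
	      _Alignof(struct classifier_wrapper),
	      "classifier must be the last member");
```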

> +struct virtnet_classifier {
> +	size_t size;
> +	u32 id;
> +	struct virtio_net_resource_obj_ff_classifier classifier;
> +};
> +
> +static bool check_mask_vs_cap(const void *m, const void *c,
> +			      u16 len, bool partial)
> +{
> +	const u8 *mask = m;
> +	const u8 *cap = c;
> +	int i;
> +
> +	for (i = 0; i < len; i++) {
> +		if (partial && ((mask[i] & cap[i]) != mask[i]))
> +			return false;
> +		if (!partial && mask[i] != cap[i])
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
> +static
> +struct virtio_net_ff_selector *get_selector_cap(const struct virtnet_ff *ff,
> +						u8 selector_type)
> +{
> +	struct virtio_net_ff_selector *sel;
> +	u8 *buf;

Make it void * and you will not need casts.

Also, the assignment is cleaner when done with the declaration.

> +	int i;
> +
> +	buf = (u8 *)&ff->ff_mask->selectors;
> +	sel = (struct virtio_net_ff_selector *)buf;
> +
> +	for (i = 0; i < ff->ff_mask->count; i++) {
> +		if (sel->type == selector_type)
> +			return sel;
> +
> +		buf += sizeof(struct virtio_net_ff_selector) + sel->length;
> +		sel = (struct virtio_net_ff_selector *)buf;
> +	}
> +
> +	return NULL;
> +}
> +
> +static bool validate_eth_mask(const struct virtnet_ff *ff,
> +			      const struct virtio_net_ff_selector *sel,
> +			      const struct virtio_net_ff_selector *sel_cap)
> +{
> +	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> +	struct ethhdr *cap, *mask;
> +	struct ethhdr zeros = {0};

just {} is the same.

> +
> +	cap = (struct ethhdr *)&sel_cap->mask;
> +	mask = (struct ethhdr *)&sel->mask;

do we know they are big enough?
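
One way to make this explicit could be a length check before either cast. The sketch below assumes the selector's length field counts the mask bytes that follow the header; whether a device can actually report a shorter selector is not something the patch states. struct ff_selector, struct eth_hdr and selector_holds_eth() are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct virtio_net_ff_selector. */
struct ff_selector {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;		/* bytes of mask[] that follow the header */
	uint8_t reserved1[3];
	uint8_t mask[];
};

/* Hypothetical stand-in for struct ethhdr (14 bytes, no padding). */
struct eth_hdr {
	uint8_t h_dest[6];
	uint8_t h_source[6];
	uint16_t h_proto;
};

/* True when the selector's mask is large enough to be read as an eth_hdr. */
static bool selector_holds_eth(const struct ff_selector *sel)
{
	return sel->length >= sizeof(struct eth_hdr);
}
```

A caller would run the check on both sel and sel_cap and return false before casting if either is too short.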


> +
> +	if (memcmp(&zeros.h_dest, mask->h_dest, sizeof(zeros.h_dest)) &&
> +	    !check_mask_vs_cap(mask->h_dest, cap->h_dest,
> +			       sizeof(mask->h_dest), partial_mask))
> +		return false;
> +
> +	if (memcmp(&zeros.h_source, mask->h_source, sizeof(zeros.h_source)) &&
> +	    !check_mask_vs_cap(mask->h_source, cap->h_source,
> +			       sizeof(mask->h_source), partial_mask))
> +		return false;
> +
> +	if (mask->h_proto &&
> +	    !check_mask_vs_cap(&mask->h_proto, &cap->h_proto,
> +			       sizeof(__be16), partial_mask))
> +		return false;
> +
> +	return true;
> +}
> +
> +static bool validate_mask(const struct virtnet_ff *ff,
> +			  const struct virtio_net_ff_selector *sel)
> +{
> +	struct virtio_net_ff_selector *sel_cap = get_selector_cap(ff, sel->type);
> +
> +	if (!sel_cap)
> +		return false;
> +
> +	switch (sel->type) {
> +	case VIRTIO_NET_FF_MASK_TYPE_ETH:
> +		return validate_eth_mask(ff, sel, sel_cap);
> +	}
> +
> +	return false;
> +}
> +
> +static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> +{
> +	int err;
> +
> +	err = xa_alloc(&ff->classifiers, &c->id, c,
> +		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
> +		       GFP_KERNEL);
> +	if (err)
> +		return err;
> +
> +	err = virtio_device_object_create(ff->vdev,
> +					  VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> +					  c->id,
> +					  &c->classifier,
> +					  c->size);
> +	if (err)
> +		goto err_xarray;
> +
> +	return 0;
> +
> +err_xarray:
> +	xa_erase(&ff->classifiers, c->id);
> +
> +	return err;
> +}
> +
> +static void destroy_classifier(struct virtnet_ff *ff,
> +			       u32 classifier_id)
> +{
> +	struct virtnet_classifier *c;
> +
> +	c = xa_load(&ff->classifiers, classifier_id);
> +	if (c) {
> +		virtio_device_object_destroy(ff->vdev,
> +					     VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> +					     c->id);
> +
> +		xa_erase(&ff->classifiers, c->id);
> +		kfree(c);
> +	}
> +}
> +
> +static void destroy_ethtool_rule(struct virtnet_ff *ff,
> +				 struct virtnet_ethtool_rule *eth_rule)
> +{
> +	ff->ethtool.num_rules--;
> +
> +	virtio_device_object_destroy(ff->vdev,
> +				     VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
> +				     eth_rule->flow_spec.location);
> +
> +	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> +	destroy_classifier(ff, eth_rule->classifier_id);
> +	kfree(eth_rule);
> +}
> +
> +static int insert_rule(struct virtnet_ff *ff,
> +		       struct virtnet_ethtool_rule *eth_rule,
> +		       u32 classifier_id,
> +		       const u8 *key,
> +		       size_t key_size)
> +{
> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
> +	struct virtio_net_resource_obj_ff_rule *ff_rule;
> +	int err;
> +
> +	ff_rule = kzalloc(sizeof(*ff_rule) + key_size, GFP_KERNEL);
> +	if (!ff_rule) {
> +		err = -ENOMEM;
> +		goto err_eth_rule;
> +	}
> +	/*
> +	 * Intentionally leave the priority as 0. All rules have the same
> +	 * priority.
> +	 */


Not the right style of comment for net.

> +	ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
> +	ff_rule->classifier_id = cpu_to_le32(classifier_id);
> +	ff_rule->key_length = (u8)key_size;
> +	ff_rule->action = fs->ring_cookie == RX_CLS_FLOW_DISC ?
> +					     VIRTIO_NET_FF_ACTION_DROP :
> +					     VIRTIO_NET_FF_ACTION_RX_VQ;
> +	ff_rule->vq_index = fs->ring_cookie != RX_CLS_FLOW_DISC ?
> +					       cpu_to_le16(fs->ring_cookie) : 0;
> +	memcpy(&ff_rule->keys, key, key_size);
> +
> +	err = virtio_device_object_create(ff->vdev,
> +					  VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
> +					  fs->location,
> +					  ff_rule,
> +					  sizeof(*ff_rule) + key_size);
> +	if (err)
> +		goto err_ff_rule;
> +
> +	eth_rule->classifier_id = classifier_id;
> +	ff->ethtool.num_rules++;
> +	kfree(ff_rule);
> +
> +	return 0;
> +
> +err_ff_rule:
> +	kfree(ff_rule);
> +err_eth_rule:
> +	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> +	kfree(eth_rule);
> +
> +	return err;
> +}
> +
> +static u32 flow_type_mask(u32 flow_type)
> +{
> +	return flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
> +}
> +
> +static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
> +{
> +	switch (fs->flow_type) {
> +	case ETHER_FLOW:
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +static int validate_flow_input(struct virtnet_ff *ff,
> +			       const struct ethtool_rx_flow_spec *fs,
> +			       u16 curr_queue_pairs)
> +{
> +	/* Force users to use RX_CLS_LOC_ANY - don't allow specific locations */
> +	if (fs->location != RX_CLS_LOC_ANY)
> +		return -EOPNOTSUPP;
> +
> +	if (fs->ring_cookie != RX_CLS_FLOW_DISC &&
> +	    fs->ring_cookie >= curr_queue_pairs)
> +		return -EINVAL;
> +
> +	if (fs->flow_type != flow_type_mask(fs->flow_type))
> +		return -EOPNOTSUPP;
> +
> +	if (!supported_flow_type(fs))
> +		return -EOPNOTSUPP;
> +
> +	return 0;
> +}
> +
> +static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
> +				 size_t *key_size, size_t *classifier_size,
> +				 int *num_hdrs)
> +{
> +	*num_hdrs = 1;
> +	*key_size = sizeof(struct ethhdr);
> +	/*
> +	 * The classifier size is the size of the classifier header, a selector
> +	 * header for each type of header in the match criteria, and each header
> +	 * providing the mask for matching against.
> +	 */
> +	*classifier_size = *key_size +
> +			   sizeof(struct virtio_net_resource_obj_ff_classifier) +
> +			   sizeof(struct virtio_net_ff_selector) * (*num_hdrs);
> +}
> +
> +static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
> +				   u8 *key,
> +				   const struct ethtool_rx_flow_spec *fs)
> +{
> +	struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
> +	struct ethhdr *eth_k = (struct ethhdr *)key;
> +
> +	selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
> +	selector->length = sizeof(struct ethhdr);
> +
> +	memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
> +	memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
> +}
> +
> +static int
> +validate_classifier_selectors(struct virtnet_ff *ff,
> +			      struct virtio_net_resource_obj_ff_classifier *classifier,
> +			      int num_hdrs)
> +{
> +	struct virtio_net_ff_selector *selector = classifier->selectors;
> +
> +	for (int i = 0; i < num_hdrs; i++) {
> +		if (!validate_mask(ff, selector))
> +			return -EINVAL;
> +
> +		selector = (struct virtio_net_ff_selector *)(((u8 *)selector) +
> +			    sizeof(*selector) + selector->length);
> +	}
> +
> +	return 0;
> +}
> +
> +static int build_and_insert(struct virtnet_ff *ff,
> +			    struct virtnet_ethtool_rule *eth_rule)
> +{
> +	struct virtio_net_resource_obj_ff_classifier *classifier;
> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
> +	struct virtio_net_ff_selector *selector;
> +	struct virtnet_classifier *c;
> +	size_t classifier_size;
> +	size_t key_size;
> +	int num_hdrs;
> +	u8 *key;
> +	int err;
> +
> +	calculate_flow_sizes(fs, &key_size, &classifier_size, &num_hdrs);
> +
> +	key = kzalloc(key_size, GFP_KERNEL);
> +	if (!key)
> +		return -ENOMEM;
> +
> +	/*
> +	 * virtio_net_ff_obj_ff_classifier is already included in the
> +	 * classifier_size.
> +	 */
> +	c = kzalloc(classifier_size +
> +		    sizeof(struct virtnet_classifier) -
> +		    sizeof(struct virtio_net_resource_obj_ff_classifier),
> +		    GFP_KERNEL);
> +	if (!c) {
> +		kfree(key);
> +		return -ENOMEM;
> +	}
> +
> +	c->size = classifier_size;
> +	classifier = &c->classifier;
> +	classifier->count = num_hdrs;
> +	selector = &classifier->selectors[0];
> +
> +	setup_eth_hdr_key_mask(selector, key, fs);
> +
> +	err = validate_classifier_selectors(ff, classifier, num_hdrs);
> +	if (err)
> +		goto err_key;
> +
> +	err = setup_classifier(ff, c);
> +	if (err)
> +		goto err_classifier;
> +
> +	err = insert_rule(ff, eth_rule, c->id, key, key_size);
> +	if (err) {
> +		destroy_classifier(ff, c->id);
> +		goto err_key;
> +	}
> +
> +	return 0;
> +
> +err_classifier:
> +	kfree(c);
> +err_key:
> +	kfree(key);
> +
> +	return err;
> +}
> +
> +int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
> +				struct ethtool_rx_flow_spec *fs,
> +				u16 curr_queue_pairs)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +	int err;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	err = validate_flow_input(ff, fs, curr_queue_pairs);
> +	if (err)
> +		return err;
> +
> +	eth_rule = kzalloc(sizeof(*eth_rule), GFP_KERNEL);
> +	if (!eth_rule)
> +		return -ENOMEM;
> +
> +	err = xa_alloc(&ff->ethtool.rules, &fs->location, eth_rule,
> +		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->rules_limit) - 1),
> +		       GFP_KERNEL);
> +	if (err)
> +		goto err_rule;
> +
> +	eth_rule->flow_spec = *fs;
> +
> +	err = build_and_insert(ff, eth_rule);
> +	if (err)
> +		goto err_xa;
> +
> +	return err;
> +
> +err_xa:
> +	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> +
> +err_rule:
> +	fs->location = RX_CLS_LOC_ANY;
> +	kfree(eth_rule);
> +
> +	return err;
> +}
> +
> +int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +	int err = 0;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	eth_rule = xa_load(&ff->ethtool.rules, location);
> +	if (!eth_rule) {
> +		err = -ENOENT;
> +		goto out;
> +	}
> +
> +	destroy_ethtool_rule(ff, eth_rule);
> +out:
> +	return err;
> +}
> +
>  static size_t get_mask_size(u16 type)
>  {
>  	switch (type) {
> @@ -142,6 +554,8 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  	if (err)
>  		goto err_ff_action;
>  
> +	xa_init_flags(&ff->classifiers, XA_FLAGS_ALLOC);
> +	xa_init_flags(&ff->ethtool.rules, XA_FLAGS_ALLOC);
>  	ff->vdev = vdev;
>  	ff->ff_supported = true;
>  
> @@ -157,9 +571,18 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  
>  void virtnet_ff_cleanup(struct virtnet_ff *ff)
>  {
> +	struct virtnet_ethtool_rule *eth_rule;
> +	unsigned long i;
> +
>  	if (!ff->ff_supported)
>  		return;
>  
> +	xa_for_each(&ff->ethtool.rules, i, eth_rule)
> +		destroy_ethtool_rule(ff, eth_rule);
> +
> +	xa_destroy(&ff->ethtool.rules);
> +	xa_destroy(&ff->classifiers);
> +
>  	virtio_device_object_destroy(ff->vdev,
>  				     VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
>  				     VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
> diff --git a/drivers/net/virtio_net/virtio_net_ff.h b/drivers/net/virtio_net/virtio_net_ff.h
> index 4aac0bd08b63..94b575fbd9ed 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.h
> +++ b/drivers/net/virtio_net/virtio_net_ff.h
> @@ -3,20 +3,34 @@
>   * Header file for virtio_net flow filters
>   */
>  #include <linux/virtio_admin.h>
> +#include <uapi/linux/ethtool.h>
>  
>  #ifndef _VIRTIO_NET_FF_H
>  #define _VIRTIO_NET_FF_H
>  
> +struct virtnet_ethtool_ff {
> +	struct xarray rules;
> +	int    num_rules;
> +};
> +
>  struct virtnet_ff {
>  	struct virtio_device *vdev;
>  	bool ff_supported;
>  	struct virtio_net_ff_cap_data *ff_caps;
>  	struct virtio_net_ff_cap_mask_data *ff_mask;
>  	struct virtio_net_ff_actions *ff_actions;
> +	struct xarray classifiers;
> +	int num_classifiers;
> +	struct virtnet_ethtool_ff ethtool;
>  };
>  
>  void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev);
>  
>  void virtnet_ff_cleanup(struct virtnet_ff *ff);
>  
> +int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
> +				struct ethtool_rx_flow_spec *fs,
> +				u16 curr_queue_pairs);
> +int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location);
> +
>  #endif /* _VIRTIO_NET_FF_H */
> diff --git a/drivers/net/virtio_net/virtio_net_main.c b/drivers/net/virtio_net/virtio_net_main.c
> index ebf3e5db0d64..808988cdf265 100644
> --- a/drivers/net/virtio_net/virtio_net_main.c
> +++ b/drivers/net/virtio_net/virtio_net_main.c
> @@ -5619,6 +5619,21 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
>  	return vi->curr_queue_pairs;
>  }
>  
> +static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
> +{
> +	struct virtnet_info *vi = netdev_priv(dev);
> +
> +	switch (info->cmd) {
> +	case ETHTOOL_SRXCLSRLINS:
> +		return virtnet_ethtool_flow_insert(&vi->ff, &info->fs,
> +						   vi->curr_queue_pairs);
> +	case ETHTOOL_SRXCLSRLDEL:
> +		return virtnet_ethtool_flow_remove(&vi->ff, info->fs.location);
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
>  static const struct ethtool_ops virtnet_ethtool_ops = {
>  	.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
>  		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
> @@ -5645,6 +5660,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
>  	.get_rxfh_fields = virtnet_get_hashflow,
>  	.set_rxfh_fields = virtnet_set_hashflow,
>  	.get_rx_ring_count = virtnet_get_rx_ring_count,
> +	.set_rxnfc = virtnet_set_rxnfc,
>  };
>  
>  static void virtnet_get_queue_stats_rx(struct net_device *dev, int i,
> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> index 662693e1fefd..f258964322f4 100644
> --- a/include/uapi/linux/virtio_net_ff.h
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -13,6 +13,8 @@
>  #define VIRTIO_NET_FF_ACTION_CAP 0x802
>  
>  #define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
> +#define VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER 0x0201
> +#define VIRTIO_NET_RESOURCE_OBJ_FF_RULE 0x0202
>  
>  struct virtio_net_ff_cap_data {
>  	__le32 groups_limit;
> @@ -59,4 +61,22 @@ struct virtio_net_resource_obj_ff_group {
>  	__le16 group_priority;
>  };
>  
> +struct virtio_net_resource_obj_ff_classifier {
> +	__u8 count;
> +	__u8 reserved[7];
> +	struct virtio_net_ff_selector selectors[];
> +};
> +
> +struct virtio_net_resource_obj_ff_rule {
> +	__le32 group_id;
> +	__le32 classifier_id;
> +	__u8 rule_priority;
> +	__u8 key_length; /* length of key in bytes */
> +	__u8 action;
> +	__u8 reserved;
> +	__le16 vq_index;
> +	__u8 reserved1[2];
> +	__u8 keys[];
> +};
> +
>  #endif
> -- 
> 2.45.0



* Re: [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps
  2025-09-23 14:19 ` [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps Daniel Jurgens
@ 2025-09-25 21:01   ` Michael S. Tsirkin
  2025-09-26  2:12     ` Dan Jurgens
  2025-09-25 21:16   ` Michael S. Tsirkin
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 21:01 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:13AM -0500, Daniel Jurgens wrote:
> When probing a virtnet device, attempt to read the flow filter
> capabilities. In order to use the feature the caps must also
> be set. For now setting what was read is sufficient.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> ---
>  drivers/net/virtio_net/Makefile          |   2 +-
>  drivers/net/virtio_net/virtio_net_ff.c   | 145 +++++++++++++++++++++++
>  drivers/net/virtio_net/virtio_net_ff.h   |  22 ++++
>  drivers/net/virtio_net/virtio_net_main.c |   7 ++
>  include/linux/virtio_admin.h             |   1 +
>  include/uapi/linux/virtio_net_ff.h       |  55 +++++++++
>  6 files changed, 231 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/virtio_net/virtio_net_ff.c
>  create mode 100644 drivers/net/virtio_net/virtio_net_ff.h
>  create mode 100644 include/uapi/linux/virtio_net_ff.h
> 
> diff --git a/drivers/net/virtio_net/Makefile b/drivers/net/virtio_net/Makefile
> index c0a4725ddd69..c41a587ffb5b 100644
> --- a/drivers/net/virtio_net/Makefile
> +++ b/drivers/net/virtio_net/Makefile
> @@ -5,4 +5,4 @@
>  
>  obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
>  
> -virtio_net-objs := virtio_net_main.o
> +virtio_net-objs := virtio_net_main.o virtio_net_ff.o
> diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
> new file mode 100644
> index 000000000000..61cb45331c97
> --- /dev/null
> +++ b/drivers/net/virtio_net/virtio_net_ff.c
> @@ -0,0 +1,145 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/virtio_admin.h>
> +#include <linux/virtio.h>
> +#include <net/ipv6.h>
> +#include <net/ip.h>
> +#include "virtio_net_ff.h"
> +
> +static size_t get_mask_size(u16 type)
> +{
> +	switch (type) {
> +	case VIRTIO_NET_FF_MASK_TYPE_ETH:
> +		return sizeof(struct ethhdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
> +		return sizeof(struct iphdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
> +		return sizeof(struct ipv6hdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_TCP:
> +		return sizeof(struct tcphdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_UDP:
> +		return sizeof(struct udphdr);
> +	}
> +
> +	return 0;
> +}
> +
> +void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
> +{
> +	struct virtio_admin_cmd_query_cap_id_result *cap_id_list __free(kfree) = NULL;
> +	size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
> +			      sizeof(struct virtio_net_ff_selector) *
> +			      VIRTIO_NET_FF_MASK_TYPE_MAX;
> +	struct virtio_net_ff_selector *sel;
> +	int err;
> +	int i;
> +
> +	cap_id_list = kzalloc(sizeof(*cap_id_list), GFP_KERNEL);
> +	if (!cap_id_list)
> +		return;
> +
> +	err = virtio_device_cap_id_list_query(vdev, cap_id_list);
> +	if (err)
> +		return;
> +
> +	if (!(VIRTIO_CAP_IN_LIST(cap_id_list,
> +				 VIRTIO_NET_FF_RESOURCE_CAP) &&
> +	      VIRTIO_CAP_IN_LIST(cap_id_list,
> +				 VIRTIO_NET_FF_SELECTOR_CAP) &&
> +	      VIRTIO_CAP_IN_LIST(cap_id_list,
> +				 VIRTIO_NET_FF_ACTION_CAP)))
> +		return;
> +
> +	ff->ff_caps = kzalloc(sizeof(*ff->ff_caps), GFP_KERNEL);
> +	if (!ff->ff_caps)
> +		return;
> +
> +	err = virtio_device_cap_get(vdev,
> +				    VIRTIO_NET_FF_RESOURCE_CAP,
> +				    ff->ff_caps,
> +				    sizeof(*ff->ff_caps));
> +
> +	if (err)
> +		goto err_ff;
> +
> +	/* VIRTIO_NET_FF_MASK_TYPE start at 1 */
> +	for (i = 1; i <= VIRTIO_NET_FF_MASK_TYPE_MAX; i++)
> +		ff_mask_size += get_mask_size(i);
> +
> +	ff->ff_mask = kzalloc(ff_mask_size, GFP_KERNEL);
> +	if (!ff->ff_mask)
> +		goto err_ff;
> +
> +	err = virtio_device_cap_get(vdev,
> +				    VIRTIO_NET_FF_SELECTOR_CAP,
> +				    ff->ff_mask,
> +				    ff_mask_size);
> +
> +	if (err)
> +		goto err_ff_mask;
> +
> +	ff->ff_actions = kzalloc(sizeof(*ff->ff_actions) +
> +					VIRTIO_NET_FF_ACTION_MAX,
> +					GFP_KERNEL);
> +	if (!ff->ff_actions)
> +		goto err_ff_mask;
> +
> +	err = virtio_device_cap_get(vdev,
> +				    VIRTIO_NET_FF_ACTION_CAP,
> +				    ff->ff_actions,
> +				    sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
> +
> +	if (err)
> +		goto err_ff_action;
> +
> +	err = virtio_device_cap_set(vdev,
> +				    VIRTIO_NET_FF_RESOURCE_CAP,
> +				    ff->ff_caps,
> +				    sizeof(*ff->ff_caps));
> +	if (err)
> +		goto err_ff_action;
> +
> +	ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
> +	sel = &ff->ff_mask->selectors[0];
> +
> +	for (int i = 0; i < ff->ff_mask->count; i++) {


I do not think kernel style allows this int inside loop.
I think you need to declare it at the beginning of the block.

> +		ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
> +		sel = (struct virtio_net_ff_selector *)((u8 *)sel + sizeof(*sel) + sel->length);
> +	}
> +
> +	err = virtio_device_cap_set(vdev,
> +				    VIRTIO_NET_FF_SELECTOR_CAP,
> +				    ff->ff_mask,
> +				    ff_mask_size);
> +	if (err)
> +		goto err_ff_action;
> +
> +	err = virtio_device_cap_set(vdev,
> +				    VIRTIO_NET_FF_ACTION_CAP,
> +				    ff->ff_actions,
> +				    sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
> +	if (err)
> +		goto err_ff_action;
> +
> +	ff->vdev = vdev;
> +	ff->ff_supported = true;
> +
> +	return;
> +
> +err_ff_action:
> +	kfree(ff->ff_actions);
> +err_ff_mask:
> +	kfree(ff->ff_mask);
> +err_ff:
> +	kfree(ff->ff_caps);
> +}
> +
> +void virtnet_ff_cleanup(struct virtnet_ff *ff)
> +{
> +	if (!ff->ff_supported)
> +		return;
> +
> +	kfree(ff->ff_actions);
> +	kfree(ff->ff_mask);
> +	kfree(ff->ff_caps);
> +}
> diff --git a/drivers/net/virtio_net/virtio_net_ff.h b/drivers/net/virtio_net/virtio_net_ff.h
> new file mode 100644
> index 000000000000..4aac0bd08b63
> --- /dev/null
> +++ b/drivers/net/virtio_net/virtio_net_ff.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0-only
> + *
> + * Header file for virtio_net flow filters
> + */
> +#include <linux/virtio_admin.h>
> +
> +#ifndef _VIRTIO_NET_FF_H
> +#define _VIRTIO_NET_FF_H
> +
> +struct virtnet_ff {
> +	struct virtio_device *vdev;
> +	bool ff_supported;
> +	struct virtio_net_ff_cap_data *ff_caps;
> +	struct virtio_net_ff_cap_mask_data *ff_mask;
> +	struct virtio_net_ff_actions *ff_actions;
> +};
> +
> +void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev);
> +
> +void virtnet_ff_cleanup(struct virtnet_ff *ff);
> +
> +#endif /* _VIRTIO_NET_FF_H */
> diff --git a/drivers/net/virtio_net/virtio_net_main.c b/drivers/net/virtio_net/virtio_net_main.c
> index 7da5a37917e9..ebf3e5db0d64 100644
> --- a/drivers/net/virtio_net/virtio_net_main.c
> +++ b/drivers/net/virtio_net/virtio_net_main.c
> @@ -26,6 +26,7 @@
>  #include <net/netdev_rx_queue.h>
>  #include <net/netdev_queues.h>
>  #include <net/xdp_sock_drv.h>
> +#include "virtio_net_ff.h"
>  
>  static int napi_weight = NAPI_POLL_WEIGHT;
>  module_param(napi_weight, int, 0444);
> @@ -493,6 +494,8 @@ struct virtnet_info {
>  	struct failover *failover;
>  
>  	u64 device_stats_cap;
> +
> +	struct virtnet_ff ff;
>  };
>  
>  struct padded_vnet_hdr {
> @@ -7116,6 +7119,8 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	}
>  	vi->guest_offloads_capable = vi->guest_offloads;
>  
> +	virtnet_ff_init(&vi->ff, vi->vdev);
> +
>  	rtnl_unlock();
>  
>  	err = virtnet_cpu_notif_add(vi);
> @@ -7131,6 +7136,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>  
>  free_unregister_netdev:
>  	unregister_netdev(dev);
> +	virtnet_ff_cleanup(&vi->ff);
>  free_failover:
>  	net_failover_destroy(vi->failover);
>  free_vqs:
> @@ -7180,6 +7186,7 @@ static void virtnet_remove(struct virtio_device *vdev)
>  	virtnet_free_irq_moder(vi);
>  
>  	unregister_netdev(vi->dev);
> +	virtnet_ff_cleanup(&vi->ff);
>  
>  	net_failover_destroy(vi->failover);
>  
> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> index cc6b82461c9f..f8f1369d1175 100644
> --- a/include/linux/virtio_admin.h
> +++ b/include/linux/virtio_admin.h
> @@ -3,6 +3,7 @@
>   * Header file for virtio admin operations
>   */
>  #include <uapi/linux/virtio_pci.h>
> +#include <uapi/linux/virtio_net_ff.h>
>  
>  #ifndef _LINUX_VIRTIO_ADMIN_H
>  #define _LINUX_VIRTIO_ADMIN_H
> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> new file mode 100644
> index 000000000000..a35533bf8377
> --- /dev/null
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -0,0 +1,55 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
> + *
> + * Header file for virtio_net flow filters
> + */
> +#ifndef _LINUX_VIRTIO_NET_FF_H
> +#define _LINUX_VIRTIO_NET_FF_H
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +
> +#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
> +#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
> +#define VIRTIO_NET_FF_ACTION_CAP 0x802
> +
> +struct virtio_net_ff_cap_data {
> +	__le32 groups_limit;
> +	__le32 classifiers_limit;
> +	__le32 rules_limit;
> +	__le32 rules_per_group_limit;
> +	__u8 last_rule_priority;
> +	__u8 selectors_per_classifier_limit;
> +};
> +
> +struct virtio_net_ff_selector {
> +	__u8 type;
> +	__u8 flags;
> +	__u8 reserved[2];
> +	__u8 length;
> +	__u8 reserved1[3];
> +	__u8 mask[];
> +};
> +
> +#define VIRTIO_NET_FF_MASK_TYPE_ETH  1
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
> +#define VIRTIO_NET_FF_MASK_TYPE_TCP  4
> +#define VIRTIO_NET_FF_MASK_TYPE_UDP  5
> +#define VIRTIO_NET_FF_MASK_TYPE_MAX  VIRTIO_NET_FF_MASK_TYPE_UDP
> +
> +struct virtio_net_ff_cap_mask_data {
> +	__u8 count;
> +	__u8 reserved[7];
> +	struct virtio_net_ff_selector selectors[];
> +};
> +#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
> +
> +#define VIRTIO_NET_FF_ACTION_DROP 1
> +#define VIRTIO_NET_FF_ACTION_RX_VQ 2
> +#define VIRTIO_NET_FF_ACTION_MAX  VIRTIO_NET_FF_ACTION_RX_VQ
> +struct virtio_net_ff_actions {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u8 actions[];
> +};
> +#endif
> -- 
> 2.45.0



* Re: [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules
  2025-09-23 14:19 ` [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
  2025-09-25 20:58   ` Michael S. Tsirkin
@ 2025-09-25 21:10   ` Michael S. Tsirkin
  2025-09-27  5:02     ` Dan Jurgens
  2025-09-26 20:48   ` Jakub Kicinski
  2 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 21:10 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:15AM -0500, Daniel Jurgens wrote:
> Filtering a flow requires a classifier to match the packets, and a rule
> to filter on the matches.
> 
> A classifier consists of one or more selectors. There is one selector
> per header type. A selector must only use fields set in the selector
> capability. If partial matching is supported, the classifier mask for a
> particular field can be a subset of the mask for that field in the
> capability.
> 
> The rule consists of a priority, an action and a key. The key is a byte
> array containing headers corresponding to the selectors in the
> classifier.
> 
> This patch implements ethtool rules for ethernet headers.
> 
> Example:
> $ ethtool -U ens9 flow-type ether dst 08:11:22:33:44:54 action 30
> Added rule with ID 1
> 
> The rule in the example directs received packets with the specified
> destination MAC address to rq 30.


As you are adding things to the UAPI header, please document this fact, too.


> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> ---
>  drivers/net/virtio_net/virtio_net_ff.c   | 423 +++++++++++++++++++++++
>  drivers/net/virtio_net/virtio_net_ff.h   |  14 +
>  drivers/net/virtio_net/virtio_net_main.c |  16 +
>  include/uapi/linux/virtio_net_ff.h       |  20 ++
>  4 files changed, 473 insertions(+)
> 
> diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
> index 0036c2db9f77..e3c34bfd1d55 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.c
> +++ b/drivers/net/virtio_net/virtio_net_ff.c
> @@ -9,6 +9,418 @@
>  #define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
>  #define VIRTNET_FF_MAX_GROUPS 1
>  
> +struct virtnet_ethtool_rule {
> +	struct ethtool_rx_flow_spec flow_spec;
> +	u32 classifier_id;
> +};
> +
> +/* New fields must be added before the classifier struct */
> +struct virtnet_classifier {
> +	size_t size;
> +	u32 id;
> +	struct virtio_net_resource_obj_ff_classifier classifier;
> +};
> +
> +static bool check_mask_vs_cap(const void *m, const void *c,
> +			      u16 len, bool partial)
> +{
> +	const u8 *mask = m;
> +	const u8 *cap = c;
> +	int i;
> +
> +	for (i = 0; i < len; i++) {
> +		if (partial && ((mask[i] & cap[i]) != mask[i]))
> +			return false;
> +		if (!partial && mask[i] != cap[i])
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
> +static
> +struct virtio_net_ff_selector *get_selector_cap(const struct virtnet_ff *ff,
> +						u8 selector_type)
> +{
> +	struct virtio_net_ff_selector *sel;
> +	u8 *buf;
> +	int i;
> +
> +	buf = (u8 *)&ff->ff_mask->selectors;
> +	sel = (struct virtio_net_ff_selector *)buf;
> +
> +	for (i = 0; i < ff->ff_mask->count; i++) {
> +		if (sel->type == selector_type)
> +			return sel;
> +
> +		buf += sizeof(struct virtio_net_ff_selector) + sel->length;
> +		sel = (struct virtio_net_ff_selector *)buf;
> +	}
> +
> +	return NULL;
> +}
> +
> +static bool validate_eth_mask(const struct virtnet_ff *ff,
> +			      const struct virtio_net_ff_selector *sel,
> +			      const struct virtio_net_ff_selector *sel_cap)
> +{
> +	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> +	struct ethhdr *cap, *mask;
> +	struct ethhdr zeros = {0};
> +
> +	cap = (struct ethhdr *)&sel_cap->mask;
> +	mask = (struct ethhdr *)&sel->mask;
> +
> +	if (memcmp(&zeros.h_dest, mask->h_dest, sizeof(zeros.h_dest)) &&
> +	    !check_mask_vs_cap(mask->h_dest, cap->h_dest,
> +			       sizeof(mask->h_dest), partial_mask))
> +		return false;
> +
> +	if (memcmp(&zeros.h_source, mask->h_source, sizeof(zeros.h_source)) &&
> +	    !check_mask_vs_cap(mask->h_source, cap->h_source,
> +			       sizeof(mask->h_source), partial_mask))
> +		return false;
> +
> +	if (mask->h_proto &&
> +	    !check_mask_vs_cap(&mask->h_proto, &cap->h_proto,
> +			       sizeof(__be16), partial_mask))
> +		return false;
> +
> +	return true;
> +}
> +
> +static bool validate_mask(const struct virtnet_ff *ff,
> +			  const struct virtio_net_ff_selector *sel)
> +{
> +	struct virtio_net_ff_selector *sel_cap = get_selector_cap(ff, sel->type);
> +
> +	if (!sel_cap)
> +		return false;
> +
> +	switch (sel->type) {
> +	case VIRTIO_NET_FF_MASK_TYPE_ETH:
> +		return validate_eth_mask(ff, sel, sel_cap);
> +	}
> +
> +	return false;
> +}
> +
> +static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> +{
> +	int err;
> +
> +	err = xa_alloc(&ff->classifiers, &c->id, c,
> +		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
> +		       GFP_KERNEL);
> +	if (err)
> +		return err;
> +
> +	err = virtio_device_object_create(ff->vdev,
> +					  VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> +					  c->id,
> +					  &c->classifier,
> +					  c->size);
> +	if (err)
> +		goto err_xarray;
> +
> +	return 0;
> +
> +err_xarray:
> +	xa_erase(&ff->classifiers, c->id);
> +
> +	return err;
> +}
> +
> +static void destroy_classifier(struct virtnet_ff *ff,
> +			       u32 classifier_id)
> +{
> +	struct virtnet_classifier *c;
> +
> +	c = xa_load(&ff->classifiers, classifier_id);
> +	if (c) {
> +		virtio_device_object_destroy(ff->vdev,
> +					     VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> +					     c->id);
> +
> +		xa_erase(&ff->classifiers, c->id);
> +		kfree(c);
> +	}
> +}
> +
> +static void destroy_ethtool_rule(struct virtnet_ff *ff,
> +				 struct virtnet_ethtool_rule *eth_rule)
> +{
> +	ff->ethtool.num_rules--;
> +
> +	virtio_device_object_destroy(ff->vdev,
> +				     VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
> +				     eth_rule->flow_spec.location);
> +
> +	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> +	destroy_classifier(ff, eth_rule->classifier_id);
> +	kfree(eth_rule);
> +}
> +
> +static int insert_rule(struct virtnet_ff *ff,
> +		       struct virtnet_ethtool_rule *eth_rule,
> +		       u32 classifier_id,
> +		       const u8 *key,
> +		       size_t key_size)
> +{
> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
> +	struct virtio_net_resource_obj_ff_rule *ff_rule;
> +	int err;
> +
> +	ff_rule = kzalloc(sizeof(*ff_rule) + key_size, GFP_KERNEL);
> +	if (!ff_rule) {
> +		err = -ENOMEM;
> +		goto err_eth_rule;
> +	}
> +	/*
> +	 * Intentionally leave the priority as 0. All rules have the same
> +	 * priority.
> +	 */
> +	ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
> +	ff_rule->classifier_id = cpu_to_le32(classifier_id);
> +	ff_rule->key_length = (u8)key_size;

Do we know that key size is <256?
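
In this patch key_size comes from calculate_flow_sizes() and is
sizeof(struct ethhdr), so it fits today, but nothing enforces that once more
header types are added. A minimal user-space sketch of an explicit bound check
before narrowing to the one-byte key_length field. The demo_ff_rule struct and
demo_set_key_len() are hypothetical stand-ins, not driver code:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the rule object: only the field used here. */
struct demo_ff_rule {
	uint8_t key_length;	/* length of key in bytes, one byte wide */
};

/*
 * Store key_size in the 8-bit field only if it fits. Anything larger is
 * rejected with -EINVAL instead of being silently truncated by the cast.
 */
static int demo_set_key_len(struct demo_ff_rule *rule, size_t key_size)
{
	if (key_size > UINT8_MAX)
		return -EINVAL;

	rule->key_length = (uint8_t)key_size;
	return 0;
}
```

A rejected size leaves key_length untouched, so a caller can bail out before
building the rest of the rule.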



> +	ff_rule->action = fs->ring_cookie == RX_CLS_FLOW_DISC ?
> +					     VIRTIO_NET_FF_ACTION_DROP :
> +					     VIRTIO_NET_FF_ACTION_RX_VQ;
> +	ff_rule->vq_index = fs->ring_cookie != RX_CLS_FLOW_DISC ?
> +					       cpu_to_le16(fs->ring_cookie) : 0;
> +	memcpy(&ff_rule->keys, key, key_size);
> +
> +	err = virtio_device_object_create(ff->vdev,
> +					  VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
> +					  fs->location,
> +					  ff_rule,
> +					  sizeof(*ff_rule) + key_size);
> +	if (err)
> +		goto err_ff_rule;
> +
> +	eth_rule->classifier_id = classifier_id;
> +	ff->ethtool.num_rules++;
> +	kfree(ff_rule);
> +
> +	return 0;
> +
> +err_ff_rule:
> +	kfree(ff_rule);
> +err_eth_rule:
> +	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> +	kfree(eth_rule);

This is a weird way to handle errors. You never added or allocated eth_rule
in this function, so why are you erasing and freeing it here?


Checking callers:

	> +     err = build_and_insert(ff, eth_rule);
	> +     if (err)
	> +             goto err_xa;
	> +
	> +     return err;
	> +
	> +err_xa:
	> +     xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
	> +
	> +err_rule:
	> +     fs->location = RX_CLS_LOC_ANY;
	> +     kfree(eth_rule);

looks like double free to me.
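
One common way to avoid this is to free an object only in the function that
allocated it. A minimal user-space sketch of that ownership rule, using
malloc-family calls in place of the kernel allocators. All demo_* names are
hypothetical and only illustrate the shape of the error paths:

```c
#include <errno.h>
#include <stdlib.h>

struct demo_rule {
	int location;
};

/* Callee: on failure it returns an error and leaves rule alone. */
static int demo_insert_rule(struct demo_rule *rule, int fail)
{
	(void)rule;
	return fail ? -ENOMEM : 0;
}

/* Caller: it allocates rule, so its error path is the only place freeing it. */
static int demo_flow_insert(int fail, struct demo_rule **out)
{
	struct demo_rule *rule;
	int err;

	*out = NULL;
	rule = calloc(1, sizeof(*rule));
	if (!rule)
		return -ENOMEM;

	err = demo_insert_rule(rule, fail);
	if (err)
		goto err_rule;

	*out = rule;	/* on success the rule is handed to the caller */
	return 0;

err_rule:
	free(rule);	/* freed exactly once, by the function that allocated it */
	return err;
}
```

With this split the callee's error labels only undo what the callee itself set
up, and the caller's labels undo the rest.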



> +
> +	return err;
> +}
> +
> +static u32 flow_type_mask(u32 flow_type)
> +{
> +	return flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
> +}
> +
> +static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
> +{
> +	switch (fs->flow_type) {
> +	case ETHER_FLOW:
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +static int validate_flow_input(struct virtnet_ff *ff,
> +			       const struct ethtool_rx_flow_spec *fs,
> +			       u16 curr_queue_pairs)
> +{
> +	/* Force users to use RX_CLS_LOC_ANY - don't allow specific locations */
> +	if (fs->location != RX_CLS_LOC_ANY)
> +		return -EOPNOTSUPP;
> +
> +	if (fs->ring_cookie != RX_CLS_FLOW_DISC &&
> +	    fs->ring_cookie >= curr_queue_pairs)
> +		return -EINVAL;
> +
> +	if (fs->flow_type != flow_type_mask(fs->flow_type))
> +		return -EOPNOTSUPP;
> +
> +	if (!supported_flow_type(fs))
> +		return -EOPNOTSUPP;
> +
> +	return 0;
> +}
> +
> +static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
> +				 size_t *key_size, size_t *classifier_size,
> +				 int *num_hdrs)
> +{
> +	*num_hdrs = 1;
> +	*key_size = sizeof(struct ethhdr);
> +	/*
> +	 * The classifier size is the size of the classifier header, a selector
> +	 * header for each type of header in the match criteria, and each header
> +	 * providing the mask for matching against.
> +	 */
> +	*classifier_size = *key_size +
> +			   sizeof(struct virtio_net_resource_obj_ff_classifier) +
> +			   sizeof(struct virtio_net_ff_selector) * (*num_hdrs);
> +}
> +
> +static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
> +				   u8 *key,
> +				   const struct ethtool_rx_flow_spec *fs)
> +{
> +	struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
> +	struct ethhdr *eth_k = (struct ethhdr *)key;
> +
> +	selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
> +	selector->length = sizeof(struct ethhdr);
> +
> +	memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
> +	memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
> +}
> +
> +static int
> +validate_classifier_selectors(struct virtnet_ff *ff,
> +			      struct virtio_net_resource_obj_ff_classifier *classifier,
> +			      int num_hdrs)
> +{
> +	struct virtio_net_ff_selector *selector = classifier->selectors;
> +
> +	for (int i = 0; i < num_hdrs; i++) {


Not sure kernel style allows these.
I think you should declare these at the beginning of the block.


> +		if (!validate_mask(ff, selector))
> +			return -EINVAL;
> +
> +		selector = (struct virtio_net_ff_selector *)(((u8 *)selector) +
> +			    sizeof(*selector) + selector->length);
> +	}
> +
> +	return 0;
> +}
> +
> +static int build_and_insert(struct virtnet_ff *ff,
> +			    struct virtnet_ethtool_rule *eth_rule)
> +{
> +	struct virtio_net_resource_obj_ff_classifier *classifier;
> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
> +	struct virtio_net_ff_selector *selector;
> +	struct virtnet_classifier *c;
> +	size_t classifier_size;
> +	size_t key_size;
> +	int num_hdrs;
> +	u8 *key;
> +	int err;
> +
> +	calculate_flow_sizes(fs, &key_size, &classifier_size, &num_hdrs);
> +
> +	key = kzalloc(key_size, GFP_KERNEL);
> +	if (!key)
> +		return -ENOMEM;
> +
> +	/*
> +	 * virtio_net_ff_obj_ff_classifier is already included in the
> +	 * classifier_size.
> +	 */
> +	c = kzalloc(classifier_size +
> +		    sizeof(struct virtnet_classifier) -
> +		    sizeof(struct virtio_net_resource_obj_ff_classifier),

do we know all this math does not overflow?
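
For illustration, a user-space sketch of the same size computation with every
step checked for wraparound. It uses the GCC/Clang builtins
__builtin_mul_overflow() and __builtin_add_overflow(); the kernel has helpers
such as check_add_overflow() in <linux/overflow.h> that play a similar role.
demo_classifier_size() and its parameters are hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute hdr_size + num_hdrs * sel_size + key_size.
 * Returns false, leaving *out untouched, if any step would wrap.
 */
static bool demo_classifier_size(size_t hdr_size, size_t sel_size,
				 size_t num_hdrs, size_t key_size,
				 size_t *out)
{
	size_t sels, total;

	if (__builtin_mul_overflow(sel_size, num_hdrs, &sels))
		return false;
	if (__builtin_add_overflow(hdr_size, sels, &total))
		return false;
	if (__builtin_add_overflow(total, key_size, &total))
		return false;

	*out = total;
	return true;
}
```

With one 8-byte classifier header, one 8-byte selector header and a 14-byte
Ethernet key, the checked result is 30 bytes.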

> +		    GFP_KERNEL);
> +	if (!c) {
> +		kfree(key);
> +		return -ENOMEM;
> +	}
> +
> +	c->size = classifier_size;
> +	classifier = &c->classifier;
> +	classifier->count = num_hdrs;
> +	selector = &classifier->selectors[0];
> +
> +	setup_eth_hdr_key_mask(selector, key, fs);
> +
> +	err = validate_classifier_selectors(ff, classifier, num_hdrs);
> +	if (err)
> +		goto err_key;
> +
> +	err = setup_classifier(ff, c);
> +	if (err)
> +		goto err_classifier;
> +
> +	err = insert_rule(ff, eth_rule, c->id, key, key_size);
> +	if (err) {
> +		destroy_classifier(ff, c->id);
> +		goto err_key;
> +	}
> +
> +	return 0;
> +
> +err_classifier:
> +	kfree(c);
> +err_key:
> +	kfree(key);
> +
> +	return err;
> +}
> +
> +int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
> +				struct ethtool_rx_flow_spec *fs,
> +				u16 curr_queue_pairs)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +	int err;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	err = validate_flow_input(ff, fs, curr_queue_pairs);
> +	if (err)
> +		return err;
> +
> +	eth_rule = kzalloc(sizeof(*eth_rule), GFP_KERNEL);
> +	if (!eth_rule)
> +		return -ENOMEM;
> +
> +	err = xa_alloc(&ff->ethtool.rules, &fs->location, eth_rule,
> +		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->rules_limit) - 1),
> +		       GFP_KERNEL);
> +	if (err)
> +		goto err_rule;
> +
> +	eth_rule->flow_spec = *fs;
> +
> +	err = build_and_insert(ff, eth_rule);
> +	if (err)
> +		goto err_xa;
> +
> +	return err;
> +
> +err_xa:
> +	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> +
> +err_rule:
> +	fs->location = RX_CLS_LOC_ANY;
> +	kfree(eth_rule);
> +
> +	return err;
> +}
> +
> +int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +	int err = 0;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	eth_rule = xa_load(&ff->ethtool.rules, location);
> +	if (!eth_rule) {
> +		err = -ENOENT;
> +		goto out;
> +	}
> +
> +	destroy_ethtool_rule(ff, eth_rule);
> +out:
> +	return err;
> +}
> +
>  static size_t get_mask_size(u16 type)
>  {
>  	switch (type) {
> @@ -142,6 +554,8 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  	if (err)
>  		goto err_ff_action;
>  
> +	xa_init_flags(&ff->classifiers, XA_FLAGS_ALLOC);
> +	xa_init_flags(&ff->ethtool.rules, XA_FLAGS_ALLOC);
>  	ff->vdev = vdev;
>  	ff->ff_supported = true;
>  
> @@ -157,9 +571,18 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  
>  void virtnet_ff_cleanup(struct virtnet_ff *ff)
>  {
> +	struct virtnet_ethtool_rule *eth_rule;
> +	unsigned long i;
> +
>  	if (!ff->ff_supported)
>  		return;
>  
> +	xa_for_each(&ff->ethtool.rules, i, eth_rule)
> +		destroy_ethtool_rule(ff, eth_rule);
> +
> +	xa_destroy(&ff->ethtool.rules);
> +	xa_destroy(&ff->classifiers);
> +
>  	virtio_device_object_destroy(ff->vdev,
>  				     VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
>  				     VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
> diff --git a/drivers/net/virtio_net/virtio_net_ff.h b/drivers/net/virtio_net/virtio_net_ff.h
> index 4aac0bd08b63..94b575fbd9ed 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.h
> +++ b/drivers/net/virtio_net/virtio_net_ff.h
> @@ -3,20 +3,34 @@
>   * Header file for virtio_net flow filters
>   */
>  #include <linux/virtio_admin.h>
> +#include <uapi/linux/ethtool.h>
>  
>  #ifndef _VIRTIO_NET_FF_H
>  #define _VIRTIO_NET_FF_H
>  
> +struct virtnet_ethtool_ff {
> +	struct xarray rules;
> +	int    num_rules;
> +};
> +
>  struct virtnet_ff {
>  	struct virtio_device *vdev;
>  	bool ff_supported;
>  	struct virtio_net_ff_cap_data *ff_caps;
>  	struct virtio_net_ff_cap_mask_data *ff_mask;
>  	struct virtio_net_ff_actions *ff_actions;
> +	struct xarray classifiers;
> +	int num_classifiers;
> +	struct virtnet_ethtool_ff ethtool;
>  };
>  
>  void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev);
>  
>  void virtnet_ff_cleanup(struct virtnet_ff *ff);
>  
> +int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
> +				struct ethtool_rx_flow_spec *fs,
> +				u16 curr_queue_pairs);
> +int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location);
> +
>  #endif /* _VIRTIO_NET_FF_H */
> diff --git a/drivers/net/virtio_net/virtio_net_main.c b/drivers/net/virtio_net/virtio_net_main.c
> index ebf3e5db0d64..808988cdf265 100644
> --- a/drivers/net/virtio_net/virtio_net_main.c
> +++ b/drivers/net/virtio_net/virtio_net_main.c
> @@ -5619,6 +5619,21 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
>  	return vi->curr_queue_pairs;
>  }
>  
> +static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
> +{
> +	struct virtnet_info *vi = netdev_priv(dev);
> +
> +	switch (info->cmd) {
> +	case ETHTOOL_SRXCLSRLINS:
> +		return virtnet_ethtool_flow_insert(&vi->ff, &info->fs,
> +						   vi->curr_queue_pairs);
> +	case ETHTOOL_SRXCLSRLDEL:
> +		return virtnet_ethtool_flow_remove(&vi->ff, info->fs.location);
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
>  static const struct ethtool_ops virtnet_ethtool_ops = {
>  	.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
>  		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
> @@ -5645,6 +5660,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
>  	.get_rxfh_fields = virtnet_get_hashflow,
>  	.set_rxfh_fields = virtnet_set_hashflow,
>  	.get_rx_ring_count = virtnet_get_rx_ring_count,
> +	.set_rxnfc = virtnet_set_rxnfc,
>  };
>  
>  static void virtnet_get_queue_stats_rx(struct net_device *dev, int i,
> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> index 662693e1fefd..f258964322f4 100644
> --- a/include/uapi/linux/virtio_net_ff.h
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -13,6 +13,8 @@
>  #define VIRTIO_NET_FF_ACTION_CAP 0x802
>  
>  #define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
> +#define VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER 0x0201
> +#define VIRTIO_NET_RESOURCE_OBJ_FF_RULE 0x0202
>  
>  struct virtio_net_ff_cap_data {
>  	__le32 groups_limit;
> @@ -59,4 +61,22 @@ struct virtio_net_resource_obj_ff_group {
>  	__le16 group_priority;
>  };
>  
> +struct virtio_net_resource_obj_ff_classifier {
> +	__u8 count;
> +	__u8 reserved[7];
> +	struct virtio_net_ff_selector selectors[];
> +};
> +
> +struct virtio_net_resource_obj_ff_rule {
> +	__le32 group_id;
> +	__le32 classifier_id;
> +	__u8 rule_priority;
> +	__u8 key_length; /* length of key in bytes */
> +	__u8 action;
> +	__u8 reserved;
> +	__le16 vq_index;
> +	__u8 reserved1[2];
> +	__u8 keys[];
> +};
> +
>  #endif
> -- 
> 2.45.0



* Re: [PATCH net-next v3 05/11] virtio_net: Create a FF group for ethtool steering
  2025-09-23 14:19 ` [PATCH net-next v3 05/11] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
@ 2025-09-25 21:13   ` Michael S. Tsirkin
  0 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 21:13 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:14AM -0500, Daniel Jurgens wrote:
> All ethtool steering rules will go in one group, create it during
> initialization.


Please document the UAPI changes.

> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> ---
>  drivers/net/virtio_net/virtio_net_ff.c | 25 +++++++++++++++++++++++++
>  include/uapi/linux/virtio_net_ff.h     |  7 +++++++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
> index 61cb45331c97..0036c2db9f77 100644
> --- a/drivers/net/virtio_net/virtio_net_ff.c
> +++ b/drivers/net/virtio_net/virtio_net_ff.c
> @@ -6,6 +6,9 @@
>  #include <net/ip.h>
>  #include "virtio_net_ff.h"
>  
> +#define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
> +#define VIRTNET_FF_MAX_GROUPS 1
> +
>  static size_t get_mask_size(u16 type)
>  {
>  	switch (type) {
> @@ -30,6 +33,7 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  	size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
>  			      sizeof(struct virtio_net_ff_selector) *
>  			      VIRTIO_NET_FF_MASK_TYPE_MAX;
> +	struct virtio_net_resource_obj_ff_group ethtool_group = {};
>  	struct virtio_net_ff_selector *sel;
>  	int err;
>  	int i;
> @@ -92,6 +96,12 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  	if (err)
>  		goto err_ff_action;
>  
> +	if (le32_to_cpu(ff->ff_caps->groups_limit) < VIRTNET_FF_MAX_GROUPS) {
> +		err = -ENOSPC;
> +		goto err_ff_action;
> +	}
> +	ff->ff_caps->groups_limit = cpu_to_le32(VIRTNET_FF_MAX_GROUPS);
> +
>  	err = virtio_device_cap_set(vdev,
>  				    VIRTIO_NET_FF_RESOURCE_CAP,
>  				    ff->ff_caps,
> @@ -121,6 +131,17 @@ void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  	if (err)
>  		goto err_ff_action;
>  
> +	ethtool_group.group_priority = cpu_to_le16(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
> +
> +	/* Use priority for the object ID. */
> +	err = virtio_device_object_create(vdev,
> +					  VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
> +					  VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
> +					  &ethtool_group,
> +					  sizeof(ethtool_group));
> +	if (err)
> +		goto err_ff_action;
> +
>  	ff->vdev = vdev;
>  	ff->ff_supported = true;
>  
> @@ -139,6 +160,10 @@ void virtnet_ff_cleanup(struct virtnet_ff *ff)
>  	if (!ff->ff_supported)
>  		return;
>  
> +	virtio_device_object_destroy(ff->vdev,
> +				     VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
> +				     VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
> +
>  	kfree(ff->ff_actions);
>  	kfree(ff->ff_mask);
>  	kfree(ff->ff_caps);
> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> index a35533bf8377..662693e1fefd 100644
> --- a/include/uapi/linux/virtio_net_ff.h
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -12,6 +12,8 @@
>  #define VIRTIO_NET_FF_SELECTOR_CAP 0x801
>  #define VIRTIO_NET_FF_ACTION_CAP 0x802
>  
> +#define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
> +
>  struct virtio_net_ff_cap_data {
>  	__le32 groups_limit;
>  	__le32 classifiers_limit;
> @@ -52,4 +54,9 @@ struct virtio_net_ff_actions {
>  	__u8 reserved[7];
>  	__u8 actions[];
>  };
> +
> +struct virtio_net_resource_obj_ff_group {
> +	__le16 group_priority;
> +};
> +
>  #endif
> -- 
> 2.45.0



* Re: [PATCH net-next v3 08/11] virtio_net: Implement IPv4 ethtool flow rules
  2025-09-25 20:53   ` Michael S. Tsirkin
@ 2025-09-25 21:13     ` Dan Jurgens
  2025-09-25 21:20       ` Michael S. Tsirkin
  2025-10-01 14:15     ` Dan Jurgens
  1 sibling, 1 reply; 58+ messages in thread
From: Dan Jurgens @ 2025-09-25 21:13 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On 9/25/25 3:53 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 23, 2025 at 09:19:17AM -0500, Daniel Jurgens wrote:
>> Add support for IP_USER type rules from ethtool.
>>
>> +static
>> +struct virtio_net_ff_selector *next_selector(struct virtio_net_ff_selector *sel)
>> +{
>> +	void *nextsel;
>> +
>> +	nextsel = (u8 *)sel + sizeof(struct virtio_net_ff_selector) +
>> +		  sel->length;
> 
> you do not need this variable. and cast to void* looks cleaner imho.

It's cast to u8* so we do pointer arithmetic in bytes, which is not
standard C on void*. GCC doesn't mind, but I think Clang does.

I saw you had a similar comment on a subsequent patch too.
>> +
>> +	return nextsel;
>> +}
>> +
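
To make the point about byte arithmetic concrete, here is a user-space sketch
of stepping over a variable-length selector without the temporary. Arithmetic
on void * is a GNU extension rather than ISO C, so the sketch goes through a
byte pointer. demo_selector mirrors the selector layout quoted in this thread
and demo_next_selector() is a hypothetical name:

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in mirroring the selector layout quoted in this thread. */
struct demo_selector {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;		/* number of mask bytes that follow */
	uint8_t reserved1[3];
	uint8_t mask[];
};

/* Step to the next variable-length selector using byte-sized arithmetic. */
static struct demo_selector *demo_next_selector(struct demo_selector *sel)
{
	return (struct demo_selector *)((uint8_t *)sel + sizeof(*sel) +
					sel->length);
}
```

The fixed part of the selector is 8 bytes, so a selector with a 6-byte mask is
followed by the next one 14 bytes later.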




* Re: [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps
  2025-09-23 14:19 ` [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps Daniel Jurgens
  2025-09-25 21:01   ` Michael S. Tsirkin
@ 2025-09-25 21:16   ` Michael S. Tsirkin
  2025-09-26  4:54     ` Dan Jurgens
  2025-09-26 16:01   ` Simon Horman
  2025-09-26 20:45   ` Jakub Kicinski
  3 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 21:16 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:13AM -0500, Daniel Jurgens wrote:
> When probing a virtnet device, attempt to read the flow filter
> capabilities. In order to use the feature the caps must also
> be set. For now setting what was read is sufficient.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> ---
>  drivers/net/virtio_net/Makefile          |   2 +-
>  drivers/net/virtio_net/virtio_net_ff.c   | 145 +++++++++++++++++++++++
>  drivers/net/virtio_net/virtio_net_ff.h   |  22 ++++
>  drivers/net/virtio_net/virtio_net_main.c |   7 ++
>  include/linux/virtio_admin.h             |   1 +
>  include/uapi/linux/virtio_net_ff.h       |  55 +++++++++
>  6 files changed, 231 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/virtio_net/virtio_net_ff.c
>  create mode 100644 drivers/net/virtio_net/virtio_net_ff.h
>  create mode 100644 include/uapi/linux/virtio_net_ff.h
> 
> diff --git a/drivers/net/virtio_net/Makefile b/drivers/net/virtio_net/Makefile
> index c0a4725ddd69..c41a587ffb5b 100644
> --- a/drivers/net/virtio_net/Makefile
> +++ b/drivers/net/virtio_net/Makefile
> @@ -5,4 +5,4 @@
>  
>  obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
>  
> -virtio_net-objs := virtio_net_main.o
> +virtio_net-objs := virtio_net_main.o virtio_net_ff.o
> diff --git a/drivers/net/virtio_net/virtio_net_ff.c b/drivers/net/virtio_net/virtio_net_ff.c
> new file mode 100644
> index 000000000000..61cb45331c97
> --- /dev/null
> +++ b/drivers/net/virtio_net/virtio_net_ff.c
> @@ -0,0 +1,145 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/virtio_admin.h>
> +#include <linux/virtio.h>
> +#include <net/ipv6.h>
> +#include <net/ip.h>
> +#include "virtio_net_ff.h"
> +
> +static size_t get_mask_size(u16 type)
> +{
> +	switch (type) {
> +	case VIRTIO_NET_FF_MASK_TYPE_ETH:
> +		return sizeof(struct ethhdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
> +		return sizeof(struct iphdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
> +		return sizeof(struct ipv6hdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_TCP:
> +		return sizeof(struct tcphdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_UDP:
> +		return sizeof(struct udphdr);
> +	}
> +
> +	return 0;
> +}
> +
> +void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
> +{
> +	struct virtio_admin_cmd_query_cap_id_result *cap_id_list __free(kfree) = NULL;
> +	size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
> +			      sizeof(struct virtio_net_ff_selector) *
> +			      VIRTIO_NET_FF_MASK_TYPE_MAX;
> +	struct virtio_net_ff_selector *sel;
> +	int err;
> +	int i;
> +
> +	cap_id_list = kzalloc(sizeof(*cap_id_list), GFP_KERNEL);
> +	if (!cap_id_list)
> +		return;
> +
> +	err = virtio_device_cap_id_list_query(vdev, cap_id_list);
> +	if (err)
> +		return;
> +
> +	if (!(VIRTIO_CAP_IN_LIST(cap_id_list,
> +				 VIRTIO_NET_FF_RESOURCE_CAP) &&
> +	      VIRTIO_CAP_IN_LIST(cap_id_list,
> +				 VIRTIO_NET_FF_SELECTOR_CAP) &&
> +	      VIRTIO_CAP_IN_LIST(cap_id_list,
> +				 VIRTIO_NET_FF_ACTION_CAP)))
> +		return;
> +
> +	ff->ff_caps = kzalloc(sizeof(*ff->ff_caps), GFP_KERNEL);
> +	if (!ff->ff_caps)
> +		return;
> +
> +	err = virtio_device_cap_get(vdev,
> +				    VIRTIO_NET_FF_RESOURCE_CAP,
> +				    ff->ff_caps,
> +				    sizeof(*ff->ff_caps));
> +
> +	if (err)
> +		goto err_ff;
> +
> +	/* VIRTIO_NET_FF_MASK_TYPE start at 1 */
> +	for (i = 1; i <= VIRTIO_NET_FF_MASK_TYPE_MAX; i++)
> +		ff_mask_size += get_mask_size(i);
> +
> +	ff->ff_mask = kzalloc(ff_mask_size, GFP_KERNEL);
> +	if (!ff->ff_mask)
> +		goto err_ff;
> +
> +	err = virtio_device_cap_get(vdev,
> +				    VIRTIO_NET_FF_SELECTOR_CAP,
> +				    ff->ff_mask,
> +				    ff_mask_size);
> +
> +	if (err)
> +		goto err_ff_mask;
> +
> +	ff->ff_actions = kzalloc(sizeof(*ff->ff_actions) +
> +					VIRTIO_NET_FF_ACTION_MAX,
> +					GFP_KERNEL);
> +	if (!ff->ff_actions)
> +		goto err_ff_mask;
> +
> +	err = virtio_device_cap_get(vdev,
> +				    VIRTIO_NET_FF_ACTION_CAP,
> +				    ff->ff_actions,
> +				    sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
> +
> +	if (err)
> +		goto err_ff_action;
> +
> +	err = virtio_device_cap_set(vdev,
> +				    VIRTIO_NET_FF_RESOURCE_CAP,
> +				    ff->ff_caps,
> +				    sizeof(*ff->ff_caps));
> +	if (err)
> +		goto err_ff_action;
> +
> +	ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
> +	sel = &ff->ff_mask->selectors[0];
> +
> +	for (int i = 0; i < ff->ff_mask->count; i++) {

i think kernel prefers variables at beginning of the block

> +		ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;

do we know this will not overflow?


> +		sel = (struct virtio_net_ff_selector *)((u8 *)sel + sizeof(*sel) + sel->length);
> +	}
> +
> +	err = virtio_device_cap_set(vdev,
> +				    VIRTIO_NET_FF_SELECTOR_CAP,
> +				    ff->ff_mask,
> +				    ff_mask_size);
> +	if (err)
> +		goto err_ff_action;
> +
> +	err = virtio_device_cap_set(vdev,
> +				    VIRTIO_NET_FF_ACTION_CAP,
> +				    ff->ff_actions,
> +				    sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
> +	if (err)
> +		goto err_ff_action;
> +
> +	ff->vdev = vdev;
> +	ff->ff_supported = true;
> +
> +	return;
> +
> +err_ff_action:
> +	kfree(ff->ff_actions);
> +err_ff_mask:
> +	kfree(ff->ff_mask);
> +err_ff:
> +	kfree(ff->ff_caps);
> +}
> +
> +void virtnet_ff_cleanup(struct virtnet_ff *ff)
> +{
> +	if (!ff->ff_supported)
> +		return;
> +
> +	kfree(ff->ff_actions);
> +	kfree(ff->ff_mask);
> +	kfree(ff->ff_caps);
> +}
> diff --git a/drivers/net/virtio_net/virtio_net_ff.h b/drivers/net/virtio_net/virtio_net_ff.h
> new file mode 100644
> index 000000000000..4aac0bd08b63
> --- /dev/null
> +++ b/drivers/net/virtio_net/virtio_net_ff.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0-only
> + *
> + * Header file for virtio_net flow filters
> + */
> +#include <linux/virtio_admin.h>
> +
> +#ifndef _VIRTIO_NET_FF_H
> +#define _VIRTIO_NET_FF_H
> +
> +struct virtnet_ff {
> +	struct virtio_device *vdev;
> +	bool ff_supported;
> +	struct virtio_net_ff_cap_data *ff_caps;
> +	struct virtio_net_ff_cap_mask_data *ff_mask;
> +	struct virtio_net_ff_actions *ff_actions;
> +};
> +
> +void virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev);
> +
> +void virtnet_ff_cleanup(struct virtnet_ff *ff);
> +
> +#endif /* _VIRTIO_NET_FF_H */
> diff --git a/drivers/net/virtio_net/virtio_net_main.c b/drivers/net/virtio_net/virtio_net_main.c
> index 7da5a37917e9..ebf3e5db0d64 100644
> --- a/drivers/net/virtio_net/virtio_net_main.c
> +++ b/drivers/net/virtio_net/virtio_net_main.c
> @@ -26,6 +26,7 @@
>  #include <net/netdev_rx_queue.h>
>  #include <net/netdev_queues.h>
>  #include <net/xdp_sock_drv.h>
> +#include "virtio_net_ff.h"
>  
>  static int napi_weight = NAPI_POLL_WEIGHT;
>  module_param(napi_weight, int, 0444);
> @@ -493,6 +494,8 @@ struct virtnet_info {
>  	struct failover *failover;
>  
>  	u64 device_stats_cap;
> +
> +	struct virtnet_ff ff;
>  };
>  
>  struct padded_vnet_hdr {
> @@ -7116,6 +7119,8 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	}
>  	vi->guest_offloads_capable = vi->guest_offloads;
>  
> +	virtnet_ff_init(&vi->ff, vi->vdev);
> +
>  	rtnl_unlock();
>  
>  	err = virtnet_cpu_notif_add(vi);
> @@ -7131,6 +7136,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>  
>  free_unregister_netdev:
>  	unregister_netdev(dev);
> +	virtnet_ff_cleanup(&vi->ff);
>  free_failover:
>  	net_failover_destroy(vi->failover);
>  free_vqs:
> @@ -7180,6 +7186,7 @@ static void virtnet_remove(struct virtio_device *vdev)
>  	virtnet_free_irq_moder(vi);
>  
>  	unregister_netdev(vi->dev);
> +	virtnet_ff_cleanup(&vi->ff);
>  
>  	net_failover_destroy(vi->failover);
>  
> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> index cc6b82461c9f..f8f1369d1175 100644
> --- a/include/linux/virtio_admin.h
> +++ b/include/linux/virtio_admin.h
> @@ -3,6 +3,7 @@
>   * Header file for virtio admin operations
>   */
>  #include <uapi/linux/virtio_pci.h>
> +#include <uapi/linux/virtio_net_ff.h>
>  
>  #ifndef _LINUX_VIRTIO_ADMIN_H
>  #define _LINUX_VIRTIO_ADMIN_H
> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> new file mode 100644
> index 000000000000..a35533bf8377
> --- /dev/null
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -0,0 +1,55 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
> + *
> + * Header file for virtio_net flow filters
> + */
> +#ifndef _LINUX_VIRTIO_NET_FF_H
> +#define _LINUX_VIRTIO_NET_FF_H
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +
> +#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
> +#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
> +#define VIRTIO_NET_FF_ACTION_CAP 0x802
> +
> +struct virtio_net_ff_cap_data {
> +	__le32 groups_limit;
> +	__le32 classifiers_limit;
> +	__le32 rules_limit;
> +	__le32 rules_per_group_limit;
> +	__u8 last_rule_priority;
> +	__u8 selectors_per_classifier_limit;
> +};
> +
> +struct virtio_net_ff_selector {
> +	__u8 type;
> +	__u8 flags;
> +	__u8 reserved[2];
> +	__u8 length;
> +	__u8 reserved1[3];
> +	__u8 mask[];
> +};
> +
> +#define VIRTIO_NET_FF_MASK_TYPE_ETH  1
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
> +#define VIRTIO_NET_FF_MASK_TYPE_TCP  4
> +#define VIRTIO_NET_FF_MASK_TYPE_UDP  5
> +#define VIRTIO_NET_FF_MASK_TYPE_MAX  VIRTIO_NET_FF_MASK_TYPE_UDP
> +
> +struct virtio_net_ff_cap_mask_data {
> +	__u8 count;
> +	__u8 reserved[7];
> +	struct virtio_net_ff_selector selectors[];
> +};
> +#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
> +
> +#define VIRTIO_NET_FF_ACTION_DROP 1
> +#define VIRTIO_NET_FF_ACTION_RX_VQ 2
> +#define VIRTIO_NET_FF_ACTION_MAX  VIRTIO_NET_FF_ACTION_RX_VQ
> +struct virtio_net_ff_actions {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u8 actions[];
> +};
> +#endif
> -- 
> 2.45.0


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory
  2025-09-23 14:19 ` [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory Daniel Jurgens
  2025-09-25  3:56   ` Xuan Zhuo
@ 2025-09-25 21:17   ` Michael S. Tsirkin
  1 sibling, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 21:17 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:12AM -0500, Daniel Jurgens wrote:
> The flow filter implementaion 

implementation

>requires minimal changes to the
> existing virtio_net implementation. It's cleaner to separate it into
> another file. In order to do so, move virtio_net.c into the new
> virtio_net directory, and create a makefile for it. Note the name is
> changed to virtio_net_main.c, so the module can retain the name
> virtio_net.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> ---
>  MAINTAINERS                                               | 2 +-
>  drivers/net/Makefile                                      | 2 +-
>  drivers/net/virtio_net/Makefile                           | 8 ++++++++
>  .../net/{virtio_net.c => virtio_net/virtio_net_main.c}    | 0
>  4 files changed, 10 insertions(+), 2 deletions(-)
>  create mode 100644 drivers/net/virtio_net/Makefile
>  rename drivers/net/{virtio_net.c => virtio_net/virtio_net_main.c} (100%)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a8a770714101..09d26c4225a9 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -26685,7 +26685,7 @@ F:	Documentation/devicetree/bindings/virtio/
>  F:	Documentation/driver-api/virtio/
>  F:	drivers/block/virtio_blk.c
>  F:	drivers/crypto/virtio/
> -F:	drivers/net/virtio_net.c
> +F:	drivers/net/virtio_net/
>  F:	drivers/vdpa/
>  F:	drivers/virtio/
>  F:	include/linux/vdpa.h
> diff --git a/drivers/net/Makefile b/drivers/net/Makefile
> index 73bc63ecd65f..cf28992658a6 100644
> --- a/drivers/net/Makefile
> +++ b/drivers/net/Makefile
> @@ -33,7 +33,7 @@ obj-$(CONFIG_NET_TEAM) += team/
>  obj-$(CONFIG_TUN) += tun.o
>  obj-$(CONFIG_TAP) += tap.o
>  obj-$(CONFIG_VETH) += veth.o
> -obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> +obj-$(CONFIG_VIRTIO_NET) += virtio_net/
>  obj-$(CONFIG_VXLAN) += vxlan/
>  obj-$(CONFIG_GENEVE) += geneve.o
>  obj-$(CONFIG_BAREUDP) += bareudp.o
> diff --git a/drivers/net/virtio_net/Makefile b/drivers/net/virtio_net/Makefile
> new file mode 100644
> index 000000000000..c0a4725ddd69
> --- /dev/null
> +++ b/drivers/net/virtio_net/Makefile
> @@ -0,0 +1,8 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Makefile for the VirtIO Net driver
> +#
> +
> +obj-$(CONFIG_VIRTIO_NET) += virtio_net.o
> +
> +virtio_net-objs := virtio_net_main.o
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net/virtio_net_main.c
> similarity index 100%
> rename from drivers/net/virtio_net.c
> rename to drivers/net/virtio_net/virtio_net_main.c
> -- 
> 2.45.0


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support
  2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (10 preceding siblings ...)
  2025-09-23 14:19 ` [PATCH net-next v3 11/11] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
@ 2025-09-25 21:19 ` Michael S. Tsirkin
  11 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 21:19 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Tue, Sep 23, 2025 at 09:19:09AM -0500, Daniel Jurgens wrote:
> This series implements ethtool flow rules support for virtio_net using the
> virtio flow filter (FF) specification. The implementation allows users to
> configure packet filtering rules through ethtool commands, directing
> packets to specific receive queues, or dropping them based on various
> header fields.
> 
> The series starts with infrastructure changes to expose virtio PCI admin
> capabilities and object management APIs. It then creates the virtio_net
> directory structure and implements the flow filter functionality with support
> for:



ok i took a quick look as you asked

main things:


	1. I am not sure device output is validated sufficiently
	to rule out all kinds of overflows.
	This can be an issue, especially for CoCo.


	2. avoid u8* just to do pointer math. void* is better for this.



	

> - Layer 2 (Ethernet) flow rules
> - IPv4 and IPv6 flow rules  
> - TCP and UDP flow rules (both IPv4 and IPv6)
> - Rule querying and management operations
> 
> Setting, deleting and viewing flow filters, -1 action is drop, positive
> integers steer to that RQ:
> 
> $ ethtool -u ens9
> 4 RX rings available
> Total 0 rules
> 
> $ ethtool -U ens9 flow-type ether src 1c:34:da:4a:33:dd action 0
> Added rule with ID 0
> $ ethtool -U ens9 flow-type udp4 dst-port 5001 action 3
> Added rule with ID 1
> $ ethtool -U ens9 flow-type tcp6 src-ip fc00::2 dst-port 5001 action 2
> Added rule with ID 2
> $ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action 1
> Added rule with ID 3
> $ ethtool -U ens9 flow-type ip6 dst-ip fc00::1 action -1
> Added rule with ID 4
> $ ethtool -U ens9 flow-type ip6 src-ip fc00::2 action -1
> Added rule with ID 5
> $ ethtool -U ens9 delete 4
> $ ethtool -u ens9
> 4 RX rings available
> Total 5 rules
> 
> Filter: 0
>         Flow Type: Raw Ethernet
>         Src MAC addr: 1C:34:DA:4A:33:DD mask: 00:00:00:00:00:00
>         Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
>         Ethertype: 0x0 mask: 0xFFFF
>         Action: Direct to queue 0
> 
> Filter: 1
>         Rule Type: UDP over IPv4
>         Src IP addr: 0.0.0.0 mask: 255.255.255.255
>         Dest IP addr: 0.0.0.0 mask: 255.255.255.255
>         TOS: 0x0 mask: 0xff
>         Src port: 0 mask: 0xffff
>         Dest port: 5001 mask: 0x0
>         Action: Direct to queue 3
> 
> Filter: 2
>         Rule Type: TCP over IPv6
>         Src IP addr: fc00::2 mask: ::
>         Dest IP addr: :: mask: ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
>         Traffic Class: 0x0 mask: 0xff
>         Src port: 0 mask: 0xffff
>         Dest port: 5001 mask: 0x0
>         Action: Direct to queue 2
> 
> Filter: 3
>         Rule Type: Raw IPv4
>         Src IP addr: 192.168.51.101 mask: 0.0.0.0
>         Dest IP addr: 0.0.0.0 mask: 255.255.255.255
>         TOS: 0x0 mask: 0xff
>         Protocol: 0 mask: 0xff
>         L4 bytes: 0x0 mask: 0xffffffff
>         Action: Direct to queue 1
> 
> Filter: 5
>         Rule Type: Raw IPv6
>         Src IP addr: fc00::2 mask: ::
>         Dest IP addr: :: mask: ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
>         Traffic Class: 0x0 mask: 0xff
>         Protocol: 0 mask: 0xff
>         L4 bytes: 0x0 mask: 0xffffffff
>         Action: Drop
> 
> v2:
>   - Fix sparse warnings
>   - Fix memory leak on subsequent failure to allocate
>   - Fix some Typos
> 
> v3:
>   - Rebased
>   - Added back get|set_rxnfc to virtio_net
>   - Added admin_ops to virtio_device kdoc.
> 
> Daniel Jurgens (11):
>   virtio-pci: Expose generic device capability operations
>   virtio-pci: Expose object create and destroy API
>   virtio_net: Create virtio_net directory
>   virtio_net: Query and set flow filter caps
>   virtio_net: Create a FF group for ethtool steering
>   virtio_net: Implement layer 2 ethtool flow rules
>   virtio_net: Use existing classifier if possible
>   virtio_net: Implement IPv4 ethtool flow rules
>   virtio_net: Add support for IPv6 ethtool steering
>   virtio_net: Add support for TCP and UDP ethtool rules
>   virtio_net: Add get ethtool flow rules ops
> 
>  MAINTAINERS                                   |    2 +-
>  drivers/net/Makefile                          |    2 +-
>  drivers/net/virtio_net/Makefile               |    8 +
>  drivers/net/virtio_net/virtio_net_ff.c        | 1029 +++++++++++++++++
>  drivers/net/virtio_net/virtio_net_ff.h        |   42 +
>  .../virtio_net_main.c}                        |   46 +
>  drivers/vfio/pci/virtio/migrate.c             |    8 +-
>  drivers/virtio/virtio.c                       |  141 +++
>  drivers/virtio/virtio_pci_common.h            |    1 -
>  drivers/virtio/virtio_pci_modern.c            |  320 ++---
>  include/linux/virtio.h                        |   22 +
>  include/linux/virtio_admin.h                  |  101 ++
>  include/linux/virtio_pci_admin.h              |    7 +-
>  include/uapi/linux/virtio_net_ff.h            |   82 ++
>  include/uapi/linux/virtio_pci.h               |    7 +-
>  15 files changed, 1677 insertions(+), 141 deletions(-)
>  create mode 100644 drivers/net/virtio_net/Makefile
>  create mode 100644 drivers/net/virtio_net/virtio_net_ff.c
>  create mode 100644 drivers/net/virtio_net/virtio_net_ff.h
>  rename drivers/net/{virtio_net.c => virtio_net/virtio_net_main.c} (99%)
>  create mode 100644 include/linux/virtio_admin.h
>  create mode 100644 include/uapi/linux/virtio_net_ff.h
> 
> -- 
> 2.45.0


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 08/11] virtio_net: Implement IPv4 ethtool flow rules
  2025-09-25 21:13     ` Dan Jurgens
@ 2025-09-25 21:20       ` Michael S. Tsirkin
  0 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-25 21:20 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Thu, Sep 25, 2025 at 04:13:42PM -0500, Dan Jurgens wrote:
> On 9/25/25 3:53 PM, Michael S. Tsirkin wrote:
> > On Tue, Sep 23, 2025 at 09:19:17AM -0500, Daniel Jurgens wrote:
> >> Add support for IP_USER type rules from ethtool.
> >>
> >> +static
> >> +struct virtio_net_ff_selector *next_selector(struct virtio_net_ff_selector *sel)
> >> +{
> >> +	void *nextsel;
> >> +
> >> +	nextsel = (u8 *)sel + sizeof(struct virtio_net_ff_selector) +
> >> +		  sel->length;
> > 
> > you do not need this variable. and cast to void* looks cleaner imho.
> 
> It's cast to u8* so we do pointer arithmetic in bytes, which is not
> standard C on void*. GCC doesn't mind, but I think Clang does.
> 
> I saw you had a similar comment on a subsequent patch too.

it's a known C standard bug.

kernel does void* math everywhere.

Linus hath spoken on this ;)
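
For readers following the void* point, here is a minimal userspace sketch of the helper written that way. The struct is a stand-in that mirrors the selector layout quoted in this thread, and the names are illustrative only, not the patch's final code. ISO C leaves arithmetic on void * undefined; GCC and Clang both accept it as a GNU extension that treats sizeof(void) as 1, and they warn only under -Wpointer-arith or -pedantic.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the uapi selector header quoted in this thread. */
struct ff_selector {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;
	uint8_t reserved1[3];
	uint8_t mask[];
};

/*
 * Step to the selector that follows @sel. Arithmetic on void * is a GNU C
 * extension that counts in bytes, so no u8 * cast or temporary is needed.
 */
static struct ff_selector *next_selector(struct ff_selector *sel)
{
	return (void *)sel + sizeof(*sel) + sel->length;
}
```

The void * result converts implicitly to the return type, which is why the intermediate variable can go away as well.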


> >> +
> >> +	return nextsel;
> >> +}
> >> +
> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps
  2025-09-25 21:01   ` Michael S. Tsirkin
@ 2025-09-26  2:12     ` Dan Jurgens
  2025-09-27  9:02       ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Dan Jurgens @ 2025-09-26  2:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On 9/25/25 4:01 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 23, 2025 at 09:19:13AM -0500, Daniel Jurgens wrote:
>> When probing a virtnet device, attempt to read the flow filter
>> capabilities. In order to use the feature the caps must also
>> be set. For now setting what was read is sufficient.
>>
>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>

>> +	ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
>> +	sel = &ff->ff_mask->selectors[0];
>> +
>> +	for (int i = 0; i < ff->ff_mask->count; i++) {
> 
> 
> I do not think kernel style allows this int inside loop.
> I think you need to declare it at the beginning of the block.
> 

checkpatch didn't mind and there are many other instances where it's done.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps
  2025-09-25 21:16   ` Michael S. Tsirkin
@ 2025-09-26  4:54     ` Dan Jurgens
  0 siblings, 0 replies; 58+ messages in thread
From: Dan Jurgens @ 2025-09-26  4:54 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On 9/25/25 4:16 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 23, 2025 at 09:19:13AM -0500, Daniel Jurgens wrote:
>> When probing a virtnet device, attempt to read the flow filter
>> capabilities. In order to use the feature the caps must also
>> be set. For now setting what was read is sufficient.
>> +	ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
>> +	sel = &ff->ff_mask->selectors[0];
>> +
>> +	for (int i = 0; i < ff->ff_mask->count; i++) {
> 
> i think kernel prefers variables at beginning of the block

The variable i was already declared, so I removed the redeclaration.

> 
>> +		ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
> 
> do we know this will not overflow?

length is u8, ff_mask_size is size_t, so probably not, but I added a
check of length against the max length we expect.
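
A rough userspace sketch of what such a check could look like follows. It is not the code from the next revision. The structs mirror the uapi layout quoted above, and FF_MAX_MASK_LEN (set to 40, the size of an IPv6 header, the largest mask type in this patch) and ff_mask_used_size() are assumed names for illustration. The idea is to bound every step of the walk by the size of the buffer that was actually allocated, so a device-reported count or length cannot move the cursor past it.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the uapi structs quoted in this thread. */
struct ff_selector {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;
	uint8_t reserved1[3];
	uint8_t mask[];
};

struct ff_cap_mask_data {
	uint8_t count;
	uint8_t reserved[7];
};

/* Assumption: largest mask we expect, sizeof(struct ipv6hdr). */
#define FF_MAX_MASK_LEN 40

/*
 * Walk the selectors reported by the device. Returns the number of bytes
 * they occupy, or 0 if any selector has an unexpected length or would
 * extend past @buf_len.
 */
static size_t ff_mask_used_size(const void *buf, size_t buf_len)
{
	const struct ff_cap_mask_data *mask = buf;
	size_t used = sizeof(*mask);
	unsigned int i;

	if (buf_len < used)
		return 0;

	for (i = 0; i < mask->count; i++) {
		const struct ff_selector *sel;

		/* The fixed selector header must fit before it is read. */
		if (buf_len - used < sizeof(*sel))
			return 0;

		sel = (const void *)((const uint8_t *)buf + used);
		if (sel->length > FF_MAX_MASK_LEN ||
		    buf_len - used - sizeof(*sel) < sel->length)
			return 0;

		used += sizeof(*sel) + sel->length;
	}

	return used;
}
```

Comparing against the remaining space (buf_len - used) rather than adding first keeps the arithmetic from wrapping, although with a u8 count and a u8 length the total stays far below SIZE_MAX anyway.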

> 
> 

> 


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-24  6:22     ` Michael S. Tsirkin
  2025-09-24 19:02       ` Dan Jurgens
@ 2025-09-26  4:55       ` Jason Wang
  2025-09-26 14:26         ` Michael S. Tsirkin
  1 sibling, 1 reply; 58+ messages in thread
From: Jason Wang @ 2025-09-26  4:55 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Daniel Jurgens, netdev, alex.williamson, pabeni, virtualization,
	parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas

On Wed, Sep 24, 2025 at 2:22 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Wed, Sep 24, 2025 at 09:16:32AM +0800, Jason Wang wrote:
> > On Tue, Sep 23, 2025 at 10:20 PM Daniel Jurgens <danielj@nvidia.com> wrote:
> > >
> > > Currently querying and setting capabilities is restricted to a single
> > > capability and contained within the virtio PCI driver. However, each
> > > device type has generic and device specific capabilities, that may be
> > > queried and set. In subsequent patches virtio_net will query and set
> > > flow filter capabilities.
> > >
> > > Move the admin related definitions to a new header file. It needs to be
> > > abstracted away from the PCI specifics to be used by upper layer
> > > drivers.
> > >
> > > Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> > > Reviewed-by: Parav Pandit <parav@nvidia.com>
> > > Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> > > Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
> > > ---
> >
> > [...]
> >
> > >
> > >  size_t virtio_max_dma_size(const struct virtio_device *vdev);
> > >
> > > diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> > > new file mode 100644
> > > index 000000000000..bbf543d20be4
> > > --- /dev/null
> > > +++ b/include/linux/virtio_admin.h
> > > @@ -0,0 +1,68 @@
> > > +/* SPDX-License-Identifier: GPL-2.0-only
> > > + *
> > > + * Header file for virtio admin operations
> > > + */
> > > +#include <uapi/linux/virtio_pci.h>
> > > +
> > > +#ifndef _LINUX_VIRTIO_ADMIN_H
> > > +#define _LINUX_VIRTIO_ADMIN_H
> > > +
> > > +struct virtio_device;
> > > +
> > > +/**
> > > + * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
> > > + * @cap_list: Pointer to capability list structure containing supported_caps array
> > > + * @cap: Capability ID to check
> > > + *
> > > + * The cap_list contains a supported_caps array of little-endian 64-bit integers
> > > + * where each bit represents a capability. Bit 0 of the first element represents
> > > + * capability ID 0, bit 1 represents capability ID 1, and so on.
> > > + *
> > > + * Return: 1 if capability is supported, 0 otherwise
> > > + */
> > > +#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
> > > +       (!!(1 & (le64_to_cpu(cap_list->supported_caps[cap / 64]) >> cap % 64)))
> > > +
> > > +/**
> > > + * struct virtio_admin_ops - Operations for virtio admin functionality
> > > + *
> > > + * This structure contains function pointers for performing administrative
> > > + * operations on virtio devices. All data and caps pointers must be allocated
> > > + * on the heap by the caller.
> > > + */
> > > +struct virtio_admin_ops {
> > > +       /**
> > > +        * @cap_id_list_query: Query the list of supported capability IDs
> > > +        * @vdev: The virtio device to query
> > > +        * @data: Pointer to result structure (must be heap allocated)
> > > +        * Return: 0 on success, negative error code on failure
> > > +        */
> > > +       int (*cap_id_list_query)(struct virtio_device *vdev,
> > > +                                struct virtio_admin_cmd_query_cap_id_result *data);
> > > +       /**
> > > +        * @cap_get: Get capability data for a specific capability ID
> > > +        * @vdev: The virtio device
> > > +        * @id: Capability ID to retrieve
> > > +        * @caps: Pointer to capability data structure (must be heap allocated)
> > > +        * @cap_size: Size of the capability data structure
> > > +        * Return: 0 on success, negative error code on failure
> > > +        */
> > > +       int (*cap_get)(struct virtio_device *vdev,
> > > +                      u16 id,
> > > +                      void *caps,
> > > +                      size_t cap_size);
> > > +       /**
> > > +        * @cap_set: Set capability data for a specific capability ID
> > > +        * @vdev: The virtio device
> > > +        * @id: Capability ID to set
> > > +        * @caps: Pointer to capability data structure (must be heap allocated)
> > > +        * @cap_size: Size of the capability data structure
> > > +        * Return: 0 on success, negative error code on failure
> > > +        */
> > > +       int (*cap_set)(struct virtio_device *vdev,
> > > +                      u16 id,
> > > +                      const void *caps,
> > > +                      size_t cap_size);
> > > +};
> >
> > Looking at this, it's nothing admin virtqueue specific, I wonder why
> > it is not part of virtio_config_ops.
> >
> > Thanks
>
> cap things are admin commands. But what I do not get is why they
> need to be callbacks.
>
> The only thing about admin commands that is pci specific is finding
> the admin vq.

I think we had a discussion to decide to separate admin commands from
the admin vq.

Thanks
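
As an aside on the VIRTIO_CAP_IN_LIST kdoc quoted above, the bitmap lookup it describes can be sketched in plain userspace C as below. This is an illustration, not the patch's code: load_le64() and cap_in_list() are made-up names, the bitmap is passed as raw bytes, and the nwords bound is an addition, since the size of supported_caps is not shown in the quoted hunk. Using a function also sidesteps the unparenthesized cap and cap_list arguments in the macro.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one little-endian 64-bit word regardless of host byte order. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];

	return v;
}

/*
 * @caps:   capability bitmap as raw bytes in device (little-endian) order
 * @nwords: number of 64-bit words in @caps (assumed to be known by the caller)
 * @cap:    capability ID; bit (cap % 64) of word (cap / 64)
 */
static bool cap_in_list(const uint8_t *caps, size_t nwords, unsigned int cap)
{
	if (cap / 64 >= nwords)
		return false;

	return (load_le64(caps + (cap / 64) * 8) >> (cap % 64)) & 1;
}
```

With the flow filter IDs from this series, 0x800 lands in word 32 at bit 0, 0x801 at bit 1, and 0x802 at bit 2.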


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-26  4:55       ` Jason Wang
@ 2025-09-26 14:26         ` Michael S. Tsirkin
  2025-09-26 15:08           ` Dan Jurgens
  0 siblings, 1 reply; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-26 14:26 UTC (permalink / raw)
  To: Jason Wang
  Cc: Daniel Jurgens, netdev, alex.williamson, pabeni, virtualization,
	parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet, Yishai Hadas

On Fri, Sep 26, 2025 at 12:55:11PM +0800, Jason Wang wrote:
> > > Looking at this, it's nothing admin virtqueue specific, I wonder why
> > > it is not part of virtio_config_ops.
> > >
> > > Thanks
> >
> > cap things are admin commands. But what I do not get is why they
> > need to be callbacks.
> >
> > The only thing about admin commands that is pci specific is finding
> > the admin vq.
> 
> I think we had a discussion to decide to separate admin commands from
> the admin vq.
> 
> Thanks

If what you are saying is that core should expose APIs to
submit admin commands, not to access admin vq, I think I agree.

-- 
MST


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations
  2025-09-26 14:26         ` Michael S. Tsirkin
@ 2025-09-26 15:08           ` Dan Jurgens
  0 siblings, 0 replies; 58+ messages in thread
From: Dan Jurgens @ 2025-09-26 15:08 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jason Wang
  Cc: netdev, alex.williamson, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Yishai Hadas

On 9/26/25 9:26 AM, Michael S. Tsirkin wrote:
> On Fri, Sep 26, 2025 at 12:55:11PM +0800, Jason Wang wrote:
>>>> Looking at this, it's nothing admin virtqueue specific, I wonder why
>>>> it is not part of virtio_config_ops.
>>>>
>>>> Thanks
>>>
>>> cap things are admin commands. But what I do not get is why they
>>> need to be callbacks.
>>>
>>> The only thing about admin commands that is pci specific is finding
>>> the admin vq.
>>
>> I think we had a discussion to decide to separate admin commands from
>> the admin vq.
>>
>> Thanks
> 
> If what you are saying is that core should expose APIs to
> submit admin commands, not to access admin vq, I think I agree.
> 

Quick overview of what I did, to not waste a v4 if you don't agree.
Added config_ops->admin_cmd_exec. virtio_pci_modern registers
virtio_pci_modern_admin_cmd_exec to it.

Moved the logic that had been in virtio_pci_modern for building the
commands and returning the data to a new file, virtio_admin_commands.c.

That file has the 5 functions needed for cap and object creation.
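
To make the proposed layering concrete, here is a toy userspace sketch. Every type and name in it (struct admin_cmd, struct config_ops, admin_cap_get, the fake transport, the opcode value) is hypothetical and stands in for whatever v4 actually defines. It only shows the shape: a transport supplies one exec hook, and generic helpers build commands on top of it.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Every name below is hypothetical and only illustrates the layering. */
struct admin_cmd {
	uint16_t opcode;
	const void *in;
	size_t in_len;
	void *out;
	size_t out_len;
};

struct vdev;

struct config_ops {
	/* Transport hook: submit one admin command and wait for the result. */
	int (*admin_cmd_exec)(struct vdev *vdev, struct admin_cmd *cmd);
};

struct vdev {
	const struct config_ops *config;
};

#define ADMIN_OP_CAP_GET 1 /* placeholder opcode, not taken from the spec */

/* Transport-independent helper: builds the command, then defers to the hook. */
static int admin_cap_get(struct vdev *vdev, uint16_t id, void *caps,
			 size_t cap_size)
{
	struct admin_cmd cmd = {
		.opcode = ADMIN_OP_CAP_GET,
		.in = &id,
		.in_len = sizeof(id),
		.out = caps,
		.out_len = cap_size,
	};

	if (!vdev->config || !vdev->config->admin_cmd_exec)
		return -EOPNOTSUPP;

	return vdev->config->admin_cmd_exec(vdev, &cmd);
}

/* Stand-in transport that answers every command with 0xAB bytes. */
static int fake_admin_cmd_exec(struct vdev *vdev, struct admin_cmd *cmd)
{
	(void)vdev;
	memset(cmd->out, 0xAB, cmd->out_len);
	return 0;
}

static const struct config_ops fake_ops = {
	.admin_cmd_exec = fake_admin_cmd_exec,
};
```

In this shape a driver such as virtio_net would call only the generic helper, and a transport without admin command support would simply leave the hook unset.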

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps
  2025-09-23 14:19 ` [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps Daniel Jurgens
  2025-09-25 21:01   ` Michael S. Tsirkin
  2025-09-25 21:16   ` Michael S. Tsirkin
@ 2025-09-26 16:01   ` Simon Horman
  2025-09-26 18:08     ` Dan Jurgens
  2025-09-26 20:45   ` Jakub Kicinski
  3 siblings, 1 reply; 58+ messages in thread
From: Simon Horman @ 2025-09-26 16:01 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, mst, jasowang, alex.williamson, pabeni, virtualization,
	parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Sep 23, 2025 at 09:19:13AM -0500, Daniel Jurgens wrote:
> When probing a virtnet device, attempt to read the flow filter
> capabilities. In order to use the feature the caps must also
> be set. For now setting what was read is sufficient.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>

...

> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> new file mode 100644
> index 000000000000..a35533bf8377
> --- /dev/null
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -0,0 +1,55 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
> + *
> + * Header file for virtio_net flow filters
> + */
> +#ifndef _LINUX_VIRTIO_NET_FF_H
> +#define _LINUX_VIRTIO_NET_FF_H
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +
> +#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
> +#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
> +#define VIRTIO_NET_FF_ACTION_CAP 0x802
> +
> +struct virtio_net_ff_cap_data {
> +	__le32 groups_limit;
> +	__le32 classifiers_limit;
> +	__le32 rules_limit;
> +	__le32 rules_per_group_limit;
> +	__u8 last_rule_priority;
> +	__u8 selectors_per_classifier_limit;
> +};
> +
> +struct virtio_net_ff_selector {
> +	__u8 type;
> +	__u8 flags;
> +	__u8 reserved[2];
> +	__u8 length;
> +	__u8 reserved1[3];
> +	__u8 mask[];
> +};
> +
> +#define VIRTIO_NET_FF_MASK_TYPE_ETH  1
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
> +#define VIRTIO_NET_FF_MASK_TYPE_TCP  4
> +#define VIRTIO_NET_FF_MASK_TYPE_UDP  5
> +#define VIRTIO_NET_FF_MASK_TYPE_MAX  VIRTIO_NET_FF_MASK_TYPE_UDP
> +
> +struct virtio_net_ff_cap_mask_data {
> +	__u8 count;
> +	__u8 reserved[7];
> +	struct virtio_net_ff_selector selectors[];

Hi Daniel,

Sparse warns that the line above is an array of flexible structures.
I wonder if that can be addressed somehow.

> +};
> +#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
> +
> +#define VIRTIO_NET_FF_ACTION_DROP 1
> +#define VIRTIO_NET_FF_ACTION_RX_VQ 2
> +#define VIRTIO_NET_FF_ACTION_MAX  VIRTIO_NET_FF_ACTION_RX_VQ
> +struct virtio_net_ff_actions {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u8 actions[];
> +};
> +#endif
> -- 
> 2.45.0
> 
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps
  2025-09-26 16:01   ` Simon Horman
@ 2025-09-26 18:08     ` Dan Jurgens
  0 siblings, 0 replies; 58+ messages in thread
From: Dan Jurgens @ 2025-09-26 18:08 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, mst, jasowang, alex.williamson, pabeni, virtualization,
	parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 9/26/25 11:01 AM, Simon Horman wrote:
> On Tue, Sep 23, 2025 at 09:19:13AM -0500, Daniel Jurgens wrote:
>> When probing a virtnet device, attempt to read the flow filter
>> capabilities. In order to use the feature the caps must also
>> be set. For now setting what was read is sufficient.
>>
>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> 
> ...
> 
>> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
>> new file mode 100644
>> index 000000000000..a35533bf8377
>> --- /dev/null
>> +++ b/include/uapi/linux/virtio_net_ff.h
>> @@ -0,0 +1,55 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
>> + *
>> + * Header file for virtio_net flow filters
>> + */
>> +#ifndef _LINUX_VIRTIO_NET_FF_H
>> +#define _LINUX_VIRTIO_NET_FF_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/kernel.h>
>> +
>> +#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
>> +#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
>> +#define VIRTIO_NET_FF_ACTION_CAP 0x802
>> +
>> +struct virtio_net_ff_cap_data {
>> +	__le32 groups_limit;
>> +	__le32 classifiers_limit;
>> +	__le32 rules_limit;
>> +	__le32 rules_per_group_limit;
>> +	__u8 last_rule_priority;
>> +	__u8 selectors_per_classifier_limit;
>> +};
>> +
>> +struct virtio_net_ff_selector {
>> +	__u8 type;
>> +	__u8 flags;
>> +	__u8 reserved[2];
>> +	__u8 length;
>> +	__u8 reserved1[3];
>> +	__u8 mask[];
>> +};
>> +
>> +#define VIRTIO_NET_FF_MASK_TYPE_ETH  1
>> +#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
>> +#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
>> +#define VIRTIO_NET_FF_MASK_TYPE_TCP  4
>> +#define VIRTIO_NET_FF_MASK_TYPE_UDP  5
>> +#define VIRTIO_NET_FF_MASK_TYPE_MAX  VIRTIO_NET_FF_MASK_TYPE_UDP
>> +
>> +struct virtio_net_ff_cap_mask_data {
>> +	__u8 count;
>> +	__u8 reserved[7];
>> +	struct virtio_net_ff_selector selectors[];
> 
> Hi Daniel,
> 
> Sparse warns that the line above is an array of flexible structures.
> I wonder if that can be addressed somehow.

Right now it's aligned with the VirtIO spec. Changing the type to bytes
would satisfy the tool, but that's all.
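
To illustrate what the byte-typed alternative would involve, here is a minimal userspace sketch. It assumes each selector entry is its 8-byte header followed immediately by `length` mask bytes; the `ff_*` names are hypothetical and are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct virtio_net_ff_selector from the patch, using stdint types. */
struct ff_selector {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;
	uint8_t reserved1[3];
	uint8_t mask[];
};

/* Hypothetical byte-typed variant of struct virtio_net_ff_cap_mask_data. */
struct ff_cap_mask_data {
	uint8_t count;
	uint8_t reserved[7];
	uint8_t selectors[];	/* packed, variable-length ff_selector entries */
};

/*
 * Return the idx-th selector, or NULL if idx is out of range or an entry
 * would run past buf_len bytes of selector data.
 */
const struct ff_selector *ff_selector_at(const struct ff_cap_mask_data *data,
					 size_t buf_len, unsigned int idx)
{
	size_t off = 0;

	if (idx >= data->count)
		return NULL;

	for (;;) {
		const struct ff_selector *sel;

		/* The fixed header must fit in the remaining bytes. */
		if (buf_len - off < sizeof(*sel))
			return NULL;
		sel = (const struct ff_selector *)(data->selectors + off);
		/* The mask bytes must fit as well. */
		if (buf_len - off - sizeof(*sel) < sel->length)
			return NULL;
		if (idx-- == 0)
			return sel;
		off += sizeof(*sel) + sel->length;
	}
}
```

With a byte array the element stride is no longer implied by the type, so every access has to go through a walker like this one.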

> 
>> +};
>> +#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
>> +
>> +#define VIRTIO_NET_FF_ACTION_DROP 1
>> +#define VIRTIO_NET_FF_ACTION_RX_VQ 2
>> +#define VIRTIO_NET_FF_ACTION_MAX  VIRTIO_NET_FF_ACTION_RX_VQ
>> +struct virtio_net_ff_actions {
>> +	__u8 count;
>> +	__u8 reserved[7];
>> +	__u8 actions[];
>> +};
>> +#endif
>> -- 
>> 2.45.0
>>
>>


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps
  2025-09-23 14:19 ` [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps Daniel Jurgens
                     ` (2 preceding siblings ...)
  2025-09-26 16:01   ` Simon Horman
@ 2025-09-26 20:45   ` Jakub Kicinski
  3 siblings, 0 replies; 58+ messages in thread
From: Jakub Kicinski @ 2025-09-26 20:45 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, mst, jasowang, alex.williamson, pabeni, virtualization,
	parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, andrew+netdev,
	edumazet

On Tue, 23 Sep 2025 09:19:13 -0500 Daniel Jurgens wrote:
> +	struct virtio_admin_cmd_query_cap_id_result *cap_id_list __free(kfree) = NULL;

Please don't use __free() here. You already have an error path in this
function, so what is the point?

Quoting the documentation:

  Using device-managed and cleanup.h constructs
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Netdev remains skeptical about promises of all "auto-cleanup" APIs,
  including even ``devm_`` helpers, historically. They are not the preferred
  style of implementation, merely an acceptable one.

  Use of ``guard()`` is discouraged within any function longer than 20 lines,
  ``scoped_guard()`` is considered more readable. Using normal lock/unlock is
  still (weakly) preferred.

  Low level cleanup constructs (such as ``__free()``) can be used when building
  APIs and helpers, especially scoped iterators. However, direct use of
  ``__free()`` within networking core and drivers is discouraged.
  Similar guidance applies to declaring variables mid-function.

See: https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#using-device-managed-and-cleanup-h-constructs
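
As a rough illustration of the explicit style the documentation prefers, here is a userspace sketch using `calloc()`/`free()` in place of the kernel allocators. The function `query_caps()` and its parameters are hypothetical stand-ins, not the code from the patch.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Allocate a temporary ID list and a result buffer. The temporary is freed
 * on every exit path through a single label, without __free().
 */
int query_caps(size_t list_len, size_t caps_len, unsigned char **caps_out)
{
	unsigned char *cap_id_list;
	unsigned char *caps;
	int err = 0;

	cap_id_list = calloc(1, list_len);
	if (!cap_id_list)
		return -ENOMEM;

	if (!caps_len) {
		err = -EINVAL;
		goto out_free_list;
	}

	caps = calloc(1, caps_len);
	if (!caps) {
		err = -ENOMEM;
		goto out_free_list;
	}

	*caps_out = caps;	/* ownership passes to the caller */

out_free_list:
	free(cap_id_list);	/* temporary: released on success and on error */
	return err;
}
```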

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules
  2025-09-23 14:19 ` [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
  2025-09-25 20:58   ` Michael S. Tsirkin
  2025-09-25 21:10   ` Michael S. Tsirkin
@ 2025-09-26 20:48   ` Jakub Kicinski
  2025-09-26 21:04     ` Dan Jurgens
  2 siblings, 1 reply; 58+ messages in thread
From: Jakub Kicinski @ 2025-09-26 20:48 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, mst, jasowang, alex.williamson, pabeni, virtualization,
	parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, andrew+netdev,
	edumazet

On Tue, 23 Sep 2025 09:19:15 -0500 Daniel Jurgens wrote:
> Filtering a flow requires a classifier to match the packets, and a rule
> to filter on the matches.
> 
> A classifier consists of one or more selectors. There is one selector
> per header type. A selector must only use fields set in the selector
> capability. If partial matching is supported, the classifier mask for a
> particular field can be a subset of the mask for that field in the
> capability.
> 
> The rule consists of a priority, an action and a key. The key is a byte
> array containing headers corresponding to the selectors in the
> classifier.
> 
> This patch implements ethtool rules for ethernet headers.

What does the spec say about ordering of the rules?
If the rules are not evaluated in an equivalent way to a linear
walk / match please support location == RX_CLS_LOC_ANY only

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules
  2025-09-26 20:48   ` Jakub Kicinski
@ 2025-09-26 21:04     ` Dan Jurgens
  0 siblings, 0 replies; 58+ messages in thread
From: Dan Jurgens @ 2025-09-26 21:04 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, mst, jasowang, alex.williamson, pabeni, virtualization,
	parav, shshitrit, yohadt, xuanzhuo, eperezma,
	shameerali.kolothum.thodi, jgg, kevin.tian, andrew+netdev,
	edumazet

On 9/26/25 3:48 PM, Jakub Kicinski wrote:
> On Tue, 23 Sep 2025 09:19:15 -0500 Daniel Jurgens wrote:
>> Filtering a flow requires a classifier to match the packets, and a rule
>> to filter on the matches.
>>
>> A classifier consists of one or more selectors. There is one selector
>> per header type. A selector must only use fields set in the selector
>> capability. If partial matching is supported, the classifier mask for a
>> particular field can be a subset of the mask for that field in the
>> capability.
>>
>> The rule consists of a priority, an action and a key. The key is a byte
>> array containing headers corresponding to the selectors in the
>> classifier.
>>
>> This patch implements ethtool rules for ethernet headers.
> 
> What does the spec say about ordering of the rules?
> If the rules are not evaluated in an equivalent way to a linear
> walk / match please support location == RX_CLS_LOC_ANY only

It does support RX_CLS_LOC_ANY. Specifying a specific location is not
supported.
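
A minimal sketch of that check, assuming the driver rejects anything other than RX_CLS_LOC_ANY at insert time. The helper name `check_rule_location()` is hypothetical; the constant value is the one from the ethtool uapi header.

```c
#include <errno.h>
#include <stdint.h>

#define RX_CLS_LOC_ANY 0xffffffffu	/* value from the ethtool uapi header */

/* Accept only "any location"; explicit slots are not supported. */
int check_rule_location(uint32_t location)
{
	if (location != RX_CLS_LOC_ANY)
		return -EOPNOTSUPP;
	return 0;
}
```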


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules
  2025-09-25 20:58   ` Michael S. Tsirkin
@ 2025-09-27  4:45     ` Dan Jurgens
  0 siblings, 0 replies; 58+ messages in thread
From: Dan Jurgens @ 2025-09-27  4:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On 9/25/25 3:58 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 23, 2025 at 09:19:15AM -0500, Daniel Jurgens wrote:
>> Filtering a flow requires a classifier to match the packets, and a rule
>> to filter on the matches.
>>
>> +
>> +	cap = (struct ethhdr *)&sel_cap->mask;
>> +	mask = (struct ethhdr *)&sel->mask;
> 
> do we know they are big enough?
> 

We know they are big enough: we allocate the memory for each based on
the size of the headers for the type. We don't use sel_cap->length, which
is the length provided by the controller.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules
  2025-09-25 21:10   ` Michael S. Tsirkin
@ 2025-09-27  5:02     ` Dan Jurgens
  0 siblings, 0 replies; 58+ messages in thread
From: Dan Jurgens @ 2025-09-27  5:02 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On 9/25/25 4:10 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 23, 2025 at 09:19:15AM -0500, Daniel Jurgens wrote:
>> Filtering a flow requires a classifier to match the packets, and a rule
>> to filter on the matches.

>> +	ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
>> +	ff_rule->classifier_id = cpu_to_le32(classifier_id);
>> +	ff_rule->key_length = (u8)key_size;
> 
> Do we know that key size is <256?

We set the key size based on the sizeof of the headers. Even if all 5
available headers were in the key, it would still be less than 256.
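
The arithmetic can be pinned down at compile time. The sketch below uses the fixed header sizes without options (the same values as sizeof of the kernel's ethhdr, iphdr, ipv6hdr, tcphdr and udphdr); the enum names are hypothetical. In the driver itself the equivalent would presumably be a BUILD_BUG_ON().

```c
#include <assert.h>

enum {
	ETH_HDR_LEN  = 14,
	IPV4_HDR_LEN = 20,	/* without options */
	IPV6_HDR_LEN = 40,
	TCP_HDR_LEN  = 20,	/* without options */
	UDP_HDR_LEN  = 8,
	/* Worst case: all five header types in a single key. */
	MAX_KEY_LEN  = ETH_HDR_LEN + IPV4_HDR_LEN + IPV6_HDR_LEN +
		       TCP_HDR_LEN + UDP_HDR_LEN,
};

/* 102 bytes in total, so the (u8) cast of key_size cannot truncate. */
static_assert(MAX_KEY_LEN <= 255, "flow key must fit in a u8 key_length");
```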

> 
> 
>> +err_ff_rule:
>> +	kfree(ff_rule);
>> +err_eth_rule:
>> +	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
>> +	kfree(eth_rule);
> 
> This is a weird way to handle errors. You never added or allocated eth_rule,
> which are you erasing and freeing here?
> 
> 

Yes, it was left behind during some refactoring. Thanks.


>> +	c = kzalloc(classifier_size +
>> +		    sizeof(struct virtnet_classifier) -
>> +		    sizeof(struct virtio_net_resource_obj_ff_classifier),
> 
> do we know all this math does not overflow?
> 

Yes, classifier size is based on sizeofs.
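
If the sizes ever stopped being compile-time constants, the same expression could be guarded explicitly. This is a userspace sketch using the GCC/Clang `__builtin_add_overflow()` builtin; the helper name and parameters are hypothetical. In the kernel, the helpers in linux/overflow.h such as size_add() serve the same purpose.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Compute classifier_size + wrapper_size - embedded_size, where the wrapper
 * structure embeds the device-visible one. Returns false on wrap-around.
 */
bool classifier_alloc_size(size_t classifier_size, size_t wrapper_size,
			   size_t embedded_size, size_t *out)
{
	size_t total;

	if (embedded_size > wrapper_size)
		return false;
	if (__builtin_add_overflow(classifier_size,
				   wrapper_size - embedded_size, &total))
		return false;

	*out = total;
	return true;
}
```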



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps
  2025-09-26  2:12     ` Dan Jurgens
@ 2025-09-27  9:02       ` Michael S. Tsirkin
  0 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-27  9:02 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Thu, Sep 25, 2025 at 09:12:48PM -0500, Dan Jurgens wrote:
> On 9/25/25 4:01 PM, Michael S. Tsirkin wrote:
> > On Tue, Sep 23, 2025 at 09:19:13AM -0500, Daniel Jurgens wrote:
> >> When probing a virtnet device, attempt to read the flow filter
> >> capabilities. In order to use the feature the caps must also
> >> be set. For now setting what was read is sufficient.
> >>
> >> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> >> Reviewed-by: Parav Pandit <parav@nvidia.com>
> >> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> 
> >> +	ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
> >> +	sel = &ff->ff_mask->selectors[0];
> >> +
> >> +	for (int i = 0; i < ff->ff_mask->count; i++) {
> > 
> > 
> > I do not think kernel style allows this int inside loop.
> > I think you need to declare it at the beginning of the block.
> > 
> 
> checkpatch didn't mind and there are many other instances where it's done.

That's because checkpatch is kernel-wide. But netdev has some special rules:

https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#using-device-managed-and-cleanup-h-constructs

says:

"... is discouraged. Similar guidance applies to declaring variables mid-function."




^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 11/11] virtio_net: Add get ethtool flow rules ops
  2025-09-25 20:44   ` Michael S. Tsirkin
@ 2025-09-28  4:39     ` Dan Jurgens
  2025-09-28  6:19       ` Michael S. Tsirkin
  0 siblings, 1 reply; 58+ messages in thread
From: Dan Jurgens @ 2025-09-28  4:39 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On 9/25/25 3:44 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 23, 2025 at 09:19:20AM -0500, Daniel Jurgens wrote:
>> - Get total number of rules. There's no user interface for this. It is

>> +int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
>> +				   struct ethtool_rxnfc *info)
>> +{
>> +	if (!ff->ff_supported)
>> +		return -EOPNOTSUPP;
>> +
>> +	info->rule_cnt = ff->ethtool.num_rules;
>> +	info->data = le32_to_cpu(ff->ff_caps->rules_limit) | RX_CLS_LOC_SPECIAL;
> 
> hmm. what if rules_limit has the high bit set?
> or matches any of
> #define RX_CLS_LOC_ANY          0xffffffff
> #define RX_CLS_LOC_FIRST        0xfffffffe
> #define RX_CLS_LOC_LAST         0xfffffffd
> by chance?
> 

FIRST, LAST, and ANY are only used in the insert rule flows (in the
userspace tool, not the kernel).

SPECIAL is used by get flow count to advertise the capability that we
support ANY on insert.

Since we do support ANY on insert there's no harm if rules limit has the
high bit set.

As a practical matter I can't imagine a rules limit within 3 orders of
magnitude of 2B.

If you'd like I can mask off that bit setting the caps, but I don't
think it's needed.
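
For reference, masking the limit could look like the sketch below. It assumes the table size is carried in the low 31 bits and RX_CLS_LOC_SPECIAL (0x80000000 in the ethtool uapi header) is the flag bit; the helper name `flow_count_data()` is hypothetical.

```c
#include <stdint.h>

#define RX_CLS_LOC_SPECIAL 0x80000000u	/* flag bit from the ethtool uapi header */

/*
 * Build the value reported in info->data: clamp the device-provided limit
 * so it cannot spill into the flag bit, then set the flag.
 */
uint32_t flow_count_data(uint32_t rules_limit)
{
	if (rules_limit >= RX_CLS_LOC_SPECIAL)
		rules_limit = RX_CLS_LOC_SPECIAL - 1;

	return rules_limit | RX_CLS_LOC_SPECIAL;
}
```

A limit clamped to the maximum still yields 0xffffffff, so this only keeps the size field well defined; it does not make the reported value distinct from the RX_CLS_LOC_ANY constant.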



^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 11/11] virtio_net: Add get ethtool flow rules ops
  2025-09-28  4:39     ` Dan Jurgens
@ 2025-09-28  6:19       ` Michael S. Tsirkin
  0 siblings, 0 replies; 58+ messages in thread
From: Michael S. Tsirkin @ 2025-09-28  6:19 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On Sat, Sep 27, 2025 at 11:39:28PM -0500, Dan Jurgens wrote:
> As a practical matter I can't imagine a rules limit within 3 orders of
> magnitude of 2B.

I'm thinking about malicious devices here: virtio is commonly used with
CoCo. I'm not specifically asking you to validate everything if the bugs
are not exploitable, just that you keep this use case in mind.


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH net-next v3 08/11] virtio_net: Implement IPv4 ethtool flow rules
  2025-09-25 20:53   ` Michael S. Tsirkin
  2025-09-25 21:13     ` Dan Jurgens
@ 2025-10-01 14:15     ` Dan Jurgens
  1 sibling, 0 replies; 58+ messages in thread
From: Dan Jurgens @ 2025-10-01 14:15 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, alex.williamson, pabeni, virtualization, parav,
	shshitrit, yohadt, xuanzhuo, eperezma, shameerali.kolothum.thodi,
	jgg, kevin.tian, kuba, andrew+netdev, edumazet

On 9/25/25 3:53 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 23, 2025 at 09:19:17AM -0500, Daniel Jurgens wrote:
>> Add support for IP_USER type rules from ethtool.

>> +static bool validate_ip4_mask(const struct virtnet_ff *ff,
>> +			      const struct virtio_net_ff_selector *sel,
>> +			      const struct virtio_net_ff_selector *sel_cap)
> 
> I'd prefer that all functions have virtnet prefix,
> avoid polluting the global namespace.
> 
How do static functions pollute the global namespace?


>>  
>> +static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>> +		      const struct ethtool_rx_flow_spec *fs)

>> +	key->daddr = l3_val->ip4dst;
>> +
>> +	if (mask->protocol) {
>> +		mask->protocol = l3_mask->proto;
> 
> Is this right? You just checked mask->protocol and are
> now overriding it?
> 

Right, the check should be on l3_mask->proto. Our controller was setting
the mask based on types, so this wasn't exposed as a bug.
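
A sketch of the corrected logic, using simplified stand-in structures rather than the real struct iphdr and struct ethtool_usrip4_spec. The function name and the assignment to key->protocol are assumptions, since that part of the patch is not quoted here.

```c
#include <stdint.h>

struct ip4_hdr_fields { uint8_t protocol; };	/* stand-in for struct iphdr */
struct usrip4_spec { uint8_t proto; };		/* stand-in for ethtool_usrip4_spec */

/* Copy the protocol match only when the user-supplied mask selects it. */
void parse_ip4_proto(struct ip4_hdr_fields *mask, struct ip4_hdr_fields *key,
		     const struct usrip4_spec *l3_mask,
		     const struct usrip4_spec *l3_val)
{
	if (l3_mask->proto) {	/* test the ethtool mask, not the output mask */
		mask->protocol = l3_mask->proto;
		key->protocol = l3_val->proto;
	}
}
```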



^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2025-10-01 14:15 UTC | newest]

Thread overview: 58+ messages
2025-09-23 14:19 [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Daniel Jurgens
2025-09-23 14:19 ` [PATCH net-next v3 01/11] virtio-pci: Expose generic device capability operations Daniel Jurgens
2025-09-24  1:16   ` Jason Wang
2025-09-24  6:22     ` Michael S. Tsirkin
2025-09-24 19:02       ` Dan Jurgens
2025-09-25  6:16         ` Michael S. Tsirkin
2025-09-25  9:51           ` Parav Pandit
2025-09-25 10:35             ` Michael S. Tsirkin
2025-09-25 10:45               ` Parav Pandit
2025-09-25 11:49                 ` Michael S. Tsirkin
2025-09-25 12:09                   ` Parav Pandit
2025-09-25 13:08                     ` Michael S. Tsirkin
2025-09-25 16:53                       ` Dan Jurgens
2025-09-25 16:55                         ` Michael S. Tsirkin
2025-09-26  4:55       ` Jason Wang
2025-09-26 14:26         ` Michael S. Tsirkin
2025-09-26 15:08           ` Dan Jurgens
2025-09-24  6:16   ` Michael S. Tsirkin
2025-09-23 14:19 ` [PATCH net-next v3 02/11] virtio-pci: Expose object create and destroy API Daniel Jurgens
2025-09-23 14:19 ` [PATCH net-next v3 03/11] virtio_net: Create virtio_net directory Daniel Jurgens
2025-09-25  3:56   ` Xuan Zhuo
2025-09-25  6:13     ` Michael S. Tsirkin
2025-09-25 15:48       ` Dan Jurgens
2025-09-25 20:10         ` Michael S. Tsirkin
2025-09-25 21:17   ` Michael S. Tsirkin
2025-09-23 14:19 ` [PATCH net-next v3 04/11] virtio_net: Query and set flow filter caps Daniel Jurgens
2025-09-25 21:01   ` Michael S. Tsirkin
2025-09-26  2:12     ` Dan Jurgens
2025-09-27  9:02       ` Michael S. Tsirkin
2025-09-25 21:16   ` Michael S. Tsirkin
2025-09-26  4:54     ` Dan Jurgens
2025-09-26 16:01   ` Simon Horman
2025-09-26 18:08     ` Dan Jurgens
2025-09-26 20:45   ` Jakub Kicinski
2025-09-23 14:19 ` [PATCH net-next v3 05/11] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
2025-09-25 21:13   ` Michael S. Tsirkin
2025-09-23 14:19 ` [PATCH net-next v3 06/11] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
2025-09-25 20:58   ` Michael S. Tsirkin
2025-09-27  4:45     ` Dan Jurgens
2025-09-25 21:10   ` Michael S. Tsirkin
2025-09-27  5:02     ` Dan Jurgens
2025-09-26 20:48   ` Jakub Kicinski
2025-09-26 21:04     ` Dan Jurgens
2025-09-23 14:19 ` [PATCH net-next v3 07/11] virtio_net: Use existing classifier if possible Daniel Jurgens
2025-09-25 20:53   ` Michael S. Tsirkin
2025-09-23 14:19 ` [PATCH net-next v3 08/11] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
2025-09-25 20:53   ` Michael S. Tsirkin
2025-09-25 21:13     ` Dan Jurgens
2025-09-25 21:20       ` Michael S. Tsirkin
2025-10-01 14:15     ` Dan Jurgens
2025-09-23 14:19 ` [PATCH net-next v3 09/11] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
2025-09-25 20:47   ` Michael S. Tsirkin
2025-09-23 14:19 ` [PATCH net-next v3 10/11] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
2025-09-23 14:19 ` [PATCH net-next v3 11/11] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
2025-09-25 20:44   ` Michael S. Tsirkin
2025-09-28  4:39     ` Dan Jurgens
2025-09-28  6:19       ` Michael S. Tsirkin
2025-09-25 21:19 ` [PATCH net-next v3 00/11] virtio_net: Add ethtool flow rules support Michael S. Tsirkin
