[PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support

* [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support
@ 2025-11-18 14:38 Daniel Jurgens
  2025-11-18 14:38 ` [PATCH net-next v11 01/12] virtio_pci: Remove supported_cap size build assert Daniel Jurgens
                   ` (11 more replies)
  0 siblings, 12 replies; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

This series implements ethtool flow rules support for virtio_net using the
virtio flow filter (FF) specification. The implementation allows users to
configure packet filtering rules through ethtool commands, directing
packets to specific receive queues or dropping them, based on various
header fields.

The series starts with infrastructure changes to expose virtio PCI admin
capabilities and object management APIs. It then implements the flow filter
functionality in virtio_net.c with support for:

- Layer 2 (Ethernet) flow rules
- IPv4 and IPv6 flow rules  
- TCP and UDP flow rules (both IPv4 and IPv6)
- Rule querying and management operations

Setting, deleting and viewing flow filters. An action of -1 drops matching
packets; a non-negative integer steers them to that RQ:

$ ethtool -u ens9
4 RX rings available
Total 0 rules

$ ethtool -U ens9 flow-type ether src 1c:34:da:4a:33:dd action 0
Added rule with ID 0
$ ethtool -U ens9 flow-type udp4 dst-port 5001 action 3
Added rule with ID 1
$ ethtool -U ens9 flow-type tcp6 src-ip fc00::2 dst-port 5001 action 2
Added rule with ID 2
$ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action 1
Added rule with ID 3
$ ethtool -U ens9 flow-type ip6 dst-ip fc00::1 action -1
Added rule with ID 4
$ ethtool -U ens9 flow-type ip6 src-ip fc00::2 action -1
Added rule with ID 5
$ ethtool -U ens9 delete 4
$ ethtool -u ens9
4 RX rings available
Total 5 rules

Filter: 0
        Flow Type: Raw Ethernet
        Src MAC addr: 1C:34:DA:4A:33:DD mask: 00:00:00:00:00:00
        Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Ethertype: 0x0 mask: 0xFFFF
        Action: Direct to queue 0

Filter: 1
        Rule Type: UDP over IPv4
        Src IP addr: 0.0.0.0 mask: 255.255.255.255
        Dest IP addr: 0.0.0.0 mask: 255.255.255.255
        TOS: 0x0 mask: 0xff
        Src port: 0 mask: 0xffff
        Dest port: 5001 mask: 0x0
        Action: Direct to queue 3

Filter: 2
        Rule Type: TCP over IPv6
        Src IP addr: fc00::2 mask: ::
        Dest IP addr: :: mask: ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
        Traffic Class: 0x0 mask: 0xff
        Src port: 0 mask: 0xffff
        Dest port: 5001 mask: 0x0
        Action: Direct to queue 2

Filter: 3
        Rule Type: Raw IPv4
        Src IP addr: 192.168.51.101 mask: 0.0.0.0
        Dest IP addr: 0.0.0.0 mask: 255.255.255.255
        TOS: 0x0 mask: 0xff
        Protocol: 0 mask: 0xff
        L4 bytes: 0x0 mask: 0xffffffff
        Action: Direct to queue 1

Filter: 5
        Rule Type: Raw IPv6
        Src IP addr: fc00::2 mask: ::
        Dest IP addr: :: mask: ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
        Traffic Class: 0x0 mask: 0xff
        Protocol: 0 mask: 0xff
        L4 bytes: 0x0 mask: 0xffffffff
        Action: Drop
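
For readers less familiar with the ethtool action encoding, here is a minimal
userspace sketch of how an action value could be classified as drop or
steer-to-queue. It is not code from this series. All sketch_* names are
hypothetical; the drop sentinel mirrors RX_CLS_FLOW_DISC from
include/uapi/linux/ethtool.h, and the sketch assumes the action carries only a
queue index (no VF bits).

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as RX_CLS_FLOW_DISC in include/uapi/linux/ethtool.h;
 * "action -1" on the ethtool command line ends up as this value.
 */
#define SKETCH_FLOW_DISC 0xffffffffffffffffULL

/* Hypothetical decoded form of an ethtool action. */
struct sketch_action {
	bool drop;
	uint32_t queue;
};

/* Classify an action: drop sentinel, valid RX queue, or out of range.
 * Returns 0 on success and -1 if the queue index is not below
 * num_rx_queues.
 */
static int sketch_decode_action(uint64_t ring_cookie, uint32_t num_rx_queues,
				struct sketch_action *out)
{
	if (ring_cookie == SKETCH_FLOW_DISC) {
		out->drop = true;
		out->queue = 0;
		return 0;
	}

	if (ring_cookie >= num_rx_queues)
		return -1;

	out->drop = false;
	out->queue = (uint32_t)ring_cookie;
	return 0;
}
```

With 4 RX rings as in the output above, actions 0 through 3 steer to a queue,
-1 drops, and anything else is rejected by this sketch.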

---
v2: https://lore.kernel.org/netdev/20250908164046.25051-1-danielj@nvidia.com/
  - Fix sparse warnings
  - Fix memory leak on subsequent failure to allocate
  - Fix some typos

v3: https://lore.kernel.org/netdev/20250923141920.283862-1-danielj@nvidia.com/
  - Added admin_ops to virtio_device kdoc.

v4:
  - Fixed double free bug inserting flows
  - Fixed incorrect protocol field check parsing ip4 headers.
  - (u8 *) changed to (void *)
  - Added kdoc comments to UAPI changes.
  - No longer split up virtio_net.c
  - Added config op to execute admin commands.
      - virtio_pci assigns vp_modern_admin_cmd_exec to this callback.
  - Moved admin command API to new core file virtio_admin_commands.c

v5:
  - Fixed compile error
  - Fixed static analysis warning on () after macro
  - Added missing fields to kdoc comments
  - Aligned parameter name between prototype and kdoc

v6:
  - Fix sparse warning "array of flexible structures" Jakub K/Simon H
  - Use new variable and validate ff_mask_size before set_cap. MST

v7:
  - Change virtnet_ff_init to return a value. Allow -EOPNOTSUPP. Xuan
  - Set ff->ff_{caps, mask, actions} NULL in error path. Paolo Abeni
  - Move for (int i removal hunk back a patch. Paolo Abeni

v8:
  - Removed unused num_classifiers. Jason Wang
  - Use real_ff_mask_size when setting the selector caps. Jason Wang

v9:
  - Set err to -ENOMEM after alloc failures in virtnet_ff_init. Simon H

v10:
  - Return -EOPNOTSUPP in virtnet_ff_init before allocing any memory.
    Jason Wang/Paolo Abeni

v11:
  - Return -EINVAL if any resource limit is 0. Simon Horman
  - Ensure we don't overrun alloced space of ff->ff_mask by moving the
    real_ff_mask_size > ff_mask_size check into the loop. Simon Horman

Daniel Jurgens (12):
  virtio_pci: Remove supported_cap size build assert
  virtio: Add config_op for admin commands
  virtio: Expose generic device capability operations
  virtio: Expose object create and destroy API
  virtio_net: Query and set flow filter caps
  virtio_net: Create a FF group for ethtool steering
  virtio_net: Implement layer 2 ethtool flow rules
  virtio_net: Use existing classifier if possible
  virtio_net: Implement IPv4 ethtool flow rules
  virtio_net: Add support for IPv6 ethtool steering
  virtio_net: Add support for TCP and UDP ethtool rules
  virtio_net: Add get ethtool flow rules ops

 drivers/net/virtio_net.c               | 1154 ++++++++++++++++++++++++
 drivers/virtio/Makefile                |    2 +-
 drivers/virtio/virtio_admin_commands.c |  165 ++++
 drivers/virtio/virtio_pci_common.h     |    1 -
 drivers/virtio/virtio_pci_modern.c     |   10 +-
 include/linux/virtio_admin.h           |  125 +++
 include/linux/virtio_config.h          |    6 +
 include/uapi/linux/virtio_net_ff.h     |  156 ++++
 include/uapi/linux/virtio_pci.h        |    7 +-
 9 files changed, 1615 insertions(+), 11 deletions(-)
 create mode 100644 drivers/virtio/virtio_admin_commands.c
 create mode 100644 include/linux/virtio_admin.h
 create mode 100644 include/uapi/linux/virtio_net_ff.h

-- 
2.50.1


* [PATCH net-next v11 01/12] virtio_pci: Remove supported_cap size build assert
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
@ 2025-11-18 14:38 ` Daniel Jurgens
  2025-11-19  7:38   ` Michael S. Tsirkin
  2025-11-18 14:38 ` [PATCH net-next v11 02/12] virtio: Add config_op for admin commands Daniel Jurgens
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

The cap ID list can be more than 64 bits. Remove the build assert. Also
remove caching of the supported caps, since it wasn't used.
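
To illustrate why a single u64 is not enough once capability IDs exceed 63,
here is a small userspace sketch of a lookup in a multi-word little-endian
bitmap. It is not taken from the patch. The sketch_* names are hypothetical,
and the byte-wise load stands in for le64_to_cpu() so the example runs on any
host.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one little-endian 64-bit word from raw bytes, independent of
 * host endianness (userspace stand-in for le64_to_cpu()).
 */
static uint64_t sketch_le64_load(const uint8_t *p)
{
	uint64_t v = 0;

	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];

	return v;
}

/* Bit N of the bitmap represents capability ID N; word N / 64 holds it
 * at bit position N % 64. IDs beyond the bitmap are reported as
 * unsupported instead of being read out of bounds.
 */
static bool sketch_cap_supported(const uint8_t *bitmap, size_t nwords,
				 unsigned int cap_id)
{
	size_t word = cap_id / 64;

	if (word >= nwords)
		return false;

	return (sketch_le64_load(bitmap + word * 8) >> (cap_id % 64)) & 1;
}
```

A capability ID of 65, for example, lives in the second word at bit 1, which
a cached u64 of only the first word could never report.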

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

---
v4: New patch for V4
v5:
   - support_caps -> supported_caps (Alok Tiwari)
   - removed unused variable (test robot)
---
 drivers/virtio/virtio_pci_common.h | 1 -
 drivers/virtio/virtio_pci_modern.c | 8 +-------
 2 files changed, 1 insertion(+), 8 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index 8cd01de27baf..fc26e035e7a6 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -48,7 +48,6 @@ struct virtio_pci_admin_vq {
 	/* Protects virtqueue access. */
 	spinlock_t lock;
 	u64 supported_cmds;
-	u64 supported_caps;
 	u8 max_dev_parts_objects;
 	struct ida dev_parts_ida;
 	/* Name of the admin queue: avq.$vq_index. */
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index dd0e65f71d41..ff11de5b3d69 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -304,7 +304,6 @@ virtio_pci_admin_cmd_dev_parts_objects_enable(struct virtio_device *virtio_dev)
 
 static void virtio_pci_admin_cmd_cap_init(struct virtio_device *virtio_dev)
 {
-	struct virtio_pci_device *vp_dev = to_vp_device(virtio_dev);
 	struct virtio_admin_cmd_query_cap_id_result *data;
 	struct virtio_admin_cmd cmd = {};
 	struct scatterlist result_sg;
@@ -323,12 +322,7 @@ static void virtio_pci_admin_cmd_cap_init(struct virtio_device *virtio_dev)
 	if (ret)
 		goto end;
 
-	/* Max number of caps fits into a single u64 */
-	BUILD_BUG_ON(sizeof(data->supported_caps) > sizeof(u64));
-
-	vp_dev->admin_vq.supported_caps = le64_to_cpu(data->supported_caps[0]);
-
-	if (!(vp_dev->admin_vq.supported_caps & (1 << VIRTIO_DEV_PARTS_CAP)))
+	if (!(le64_to_cpu(data->supported_caps[0]) & (1 << VIRTIO_DEV_PARTS_CAP)))
 		goto end;
 
 	virtio_pci_admin_cmd_dev_parts_objects_enable(virtio_dev);
-- 
2.50.1


* [PATCH net-next v11 02/12] virtio: Add config_op for admin commands
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
  2025-11-18 14:38 ` [PATCH net-next v11 01/12] virtio_pci: Remove supported_cap size build assert Daniel Jurgens
@ 2025-11-18 14:38 ` Daniel Jurgens
  2025-11-19  7:36   ` Michael S. Tsirkin
  2025-11-18 14:38 ` [PATCH net-next v11 03/12] virtio: Expose generic device capability operations Daniel Jurgens
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

This will allow device drivers to issue administration commands.
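
As a rough illustration of the optional-callback pattern this relies on,
here is a userspace sketch in which a generic helper dispatches through an
ops table and reports -EOPNOTSUPP when the transport did not provide the
callback. Every sketch_* type and function is hypothetical and greatly
simplified; only the shape of the check is meant to match.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, minimal stand-ins for the command and device types. */
struct sketch_cmd {
	unsigned int opcode;
};

struct sketch_dev;

struct sketch_config_ops {
	/* Optional: transports without an admin queue leave this NULL. */
	int (*admin_cmd_exec)(struct sketch_dev *dev, struct sketch_cmd *cmd);
};

struct sketch_dev {
	const struct sketch_config_ops *config;
	unsigned int last_opcode;
};

/* Generic helper: callers need not know which transport backs the device. */
static int sketch_admin_exec(struct sketch_dev *dev, struct sketch_cmd *cmd)
{
	if (!dev->config->admin_cmd_exec)
		return -EOPNOTSUPP;

	return dev->config->admin_cmd_exec(dev, cmd);
}

/* Example transport implementation that just records the opcode. */
static int sketch_pci_exec(struct sketch_dev *dev, struct sketch_cmd *cmd)
{
	dev->last_opcode = cmd->opcode;
	return 0;
}
```

The benefit of this shape is that an upper-layer driver can treat a missing
callback as "feature not available" rather than as a hard failure.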

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

---
v4: New patch for v4
---
 drivers/virtio/virtio_pci_modern.c | 2 ++
 include/linux/virtio_config.h      | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index ff11de5b3d69..acc3f958f96a 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -1236,6 +1236,7 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = {
 	.get_shm_region  = vp_get_shm_region,
 	.disable_vq_and_reset = vp_modern_disable_vq_and_reset,
 	.enable_vq_after_reset = vp_modern_enable_vq_after_reset,
+	.admin_cmd_exec = vp_modern_admin_cmd_exec,
 };
 
 static const struct virtio_config_ops virtio_pci_config_ops = {
@@ -1256,6 +1257,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
 	.get_shm_region  = vp_get_shm_region,
 	.disable_vq_and_reset = vp_modern_disable_vq_and_reset,
 	.enable_vq_after_reset = vp_modern_enable_vq_after_reset,
+	.admin_cmd_exec = vp_modern_admin_cmd_exec,
 };
 
 /* the PCI probing function */
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 16001e9f9b39..19606609254e 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -108,6 +108,10 @@ struct virtqueue_info {
  *	Returns 0 on success or error status
  *	If disable_vq_and_reset is set, then enable_vq_after_reset must also be
  *	set.
+ * @admin_cmd_exec: Execute an admin VQ command.
+ *	vdev: the virtio_device
+ *	cmd: the command to execute
+ *	Returns 0 on success or error status
  */
 struct virtio_config_ops {
 	void (*get)(struct virtio_device *vdev, unsigned offset,
@@ -137,6 +141,8 @@ struct virtio_config_ops {
 			       struct virtio_shm_region *region, u8 id);
 	int (*disable_vq_and_reset)(struct virtqueue *vq);
 	int (*enable_vq_after_reset)(struct virtqueue *vq);
+	int (*admin_cmd_exec)(struct virtio_device *vdev,
+			      struct virtio_admin_cmd *cmd);
 };
 
 /**
-- 
2.50.1


* [PATCH net-next v11 03/12] virtio: Expose generic device capability operations
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
  2025-11-18 14:38 ` [PATCH net-next v11 01/12] virtio_pci: Remove supported_cap size build assert Daniel Jurgens
  2025-11-18 14:38 ` [PATCH net-next v11 02/12] virtio: Add config_op for admin commands Daniel Jurgens
@ 2025-11-18 14:38 ` Daniel Jurgens
  2025-11-18 21:42   ` Michael S. Tsirkin
  2025-11-18 14:38 ` [PATCH net-next v11 04/12] virtio: Expose object create and destroy API Daniel Jurgens
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

Currently querying and setting capabilities is restricted to a single
capability and contained within the virtio PCI driver. However, each
device type has generic and device-specific capabilities that may be
queried and set. Expose generic helpers for these operations. In
subsequent patches virtio_net will use them to query and set flow filter
capabilities.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

---
v4: Moved this logic from virtio_pci_modern to new file
    virtio_admin_commands.
---
 drivers/virtio/Makefile                |  2 +-
 drivers/virtio/virtio_admin_commands.c | 90 ++++++++++++++++++++++++++
 include/linux/virtio_admin.h           | 80 +++++++++++++++++++++++
 include/uapi/linux/virtio_pci.h        |  7 +-
 4 files changed, 176 insertions(+), 3 deletions(-)
 create mode 100644 drivers/virtio/virtio_admin_commands.c
 create mode 100644 include/linux/virtio_admin.h

diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index eefcfe90d6b8..2b4a204dde33 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
+obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o virtio_admin_commands.o
 obj-$(CONFIG_VIRTIO_ANCHOR) += virtio_anchor.o
 obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
 obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
new file mode 100644
index 000000000000..94751d16b3c4
--- /dev/null
+++ b/drivers/virtio/virtio_admin_commands.c
@@ -0,0 +1,90 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_admin.h>
+
+int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
+				   struct virtio_admin_cmd_query_cap_id_result *data)
+{
+	struct virtio_admin_cmd cmd = {};
+	struct scatterlist result_sg;
+
+	if (!vdev->config->admin_cmd_exec)
+		return -EOPNOTSUPP;
+
+	sg_init_one(&result_sg, data, sizeof(*data));
+	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_CAP_ID_LIST_QUERY);
+	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
+	cmd.result_sg = &result_sg;
+
+	return vdev->config->admin_cmd_exec(vdev, &cmd);
+}
+EXPORT_SYMBOL_GPL(virtio_admin_cap_id_list_query);
+
+int virtio_admin_cap_get(struct virtio_device *vdev,
+			 u16 id,
+			 void *caps,
+			 size_t cap_size)
+{
+	struct virtio_admin_cmd_cap_get_data *data;
+	struct virtio_admin_cmd cmd = {};
+	struct scatterlist result_sg;
+	struct scatterlist data_sg;
+	int err;
+
+	if (!vdev->config->admin_cmd_exec)
+		return -EOPNOTSUPP;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->id = cpu_to_le16(id);
+	sg_init_one(&data_sg, data, sizeof(*data));
+	sg_init_one(&result_sg, caps, cap_size);
+	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DEVICE_CAP_GET);
+	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
+	cmd.data_sg = &data_sg;
+	cmd.result_sg = &result_sg;
+
+	err = vdev->config->admin_cmd_exec(vdev, &cmd);
+	kfree(data);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(virtio_admin_cap_get);
+
+int virtio_admin_cap_set(struct virtio_device *vdev,
+			 u16 id,
+			 const void *caps,
+			 size_t cap_size)
+{
+	struct virtio_admin_cmd_cap_set_data *data;
+	struct virtio_admin_cmd cmd = {};
+	struct scatterlist data_sg;
+	size_t data_size;
+	int err;
+
+	if (!vdev->config->admin_cmd_exec)
+		return -EOPNOTSUPP;
+
+	data_size = sizeof(*data) + cap_size;
+	data = kzalloc(data_size, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->id = cpu_to_le16(id);
+	memcpy(data->cap_specific_data, caps, cap_size);
+	sg_init_one(&data_sg, data, data_size);
+	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DRIVER_CAP_SET);
+	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
+	cmd.data_sg = &data_sg;
+	cmd.result_sg = NULL;
+
+	err = vdev->config->admin_cmd_exec(vdev, &cmd);
+	kfree(data);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(virtio_admin_cap_set);
diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
new file mode 100644
index 000000000000..36df97b6487a
--- /dev/null
+++ b/include/linux/virtio_admin.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ *
+ * Header file for virtio admin operations
+ */
+#include <uapi/linux/virtio_pci.h>
+
+#ifndef _LINUX_VIRTIO_ADMIN_H
+#define _LINUX_VIRTIO_ADMIN_H
+
+struct virtio_device;
+
+/**
+ * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
+ * @cap_list: Pointer to capability list structure containing supported_caps array
+ * @cap: Capability ID to check
+ *
+ * The cap_list contains a supported_caps array of little-endian 64-bit integers
+ * where each bit represents a capability. Bit 0 of the first element represents
+ * capability ID 0, bit 1 represents capability ID 1, and so on.
+ *
+ * Return: 1 if capability is supported, 0 otherwise
+ */
+#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
+	(!!(1 & (le64_to_cpu((cap_list)->supported_caps[(cap) / 64]) >> ((cap) % 64))))
+
+/**
+ * virtio_admin_cap_id_list_query - Query the list of available capability IDs
+ * @vdev: The virtio device to query
+ * @data: Pointer to result structure (must be heap allocated)
+ *
+ * This function queries the virtio device for the list of available capability
+ * IDs that can be used with virtio_admin_cap_get() and virtio_admin_cap_set().
+ * The result is stored in the provided data structure.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or capability queries, or a negative error code on other failures.
+ */
+int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
+				   struct virtio_admin_cmd_query_cap_id_result *data);
+
+/**
+ * virtio_admin_cap_get - Get capability data for a specific capability ID
+ * @vdev: The virtio device
+ * @id: Capability ID to retrieve
+ * @caps: Pointer to capability data structure (must be heap allocated)
+ * @cap_size: Size of the capability data structure
+ *
+ * This function retrieves a specific capability from the virtio device.
+ * The capability data is stored in the provided buffer. The caller must
+ * ensure the buffer is large enough to hold the capability data.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or capability retrieval, or a negative error code on other failures.
+ */
+int virtio_admin_cap_get(struct virtio_device *vdev,
+			 u16 id,
+			 void *caps,
+			 size_t cap_size);
+
+/**
+ * virtio_admin_cap_set - Set capability data for a specific capability ID
+ * @vdev: The virtio device
+ * @id: Capability ID to set
+ * @caps: Pointer to capability data structure (must be heap allocated)
+ * @cap_size: Size of the capability data structure
+ *
+ * This function sets a specific capability on the virtio device.
+ * The capability data is read from the provided buffer and applied
+ * to the device. The device may validate the capability data before
+ * applying it.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or capability setting, or a negative error code on other failures.
+ */
+int virtio_admin_cap_set(struct virtio_device *vdev,
+			 u16 id,
+			 const void *caps,
+			 size_t cap_size);
+
+#endif /* _LINUX_VIRTIO_ADMIN_H */
diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index c691ac210ce2..0d5ca0cff629 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -315,15 +315,18 @@ struct virtio_admin_cmd_notify_info_result {
 
 #define VIRTIO_DEV_PARTS_CAP 0x0000
 
+/* Update this value to largest implemented cap number. */
+#define VIRTIO_ADMIN_MAX_CAP 0x0fff
+
 struct virtio_dev_parts_cap {
 	__u8 get_parts_resource_objects_limit;
 	__u8 set_parts_resource_objects_limit;
 };
 
-#define MAX_CAP_ID __KERNEL_DIV_ROUND_UP(VIRTIO_DEV_PARTS_CAP + 1, 64)
+#define VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE __KERNEL_DIV_ROUND_UP(VIRTIO_ADMIN_MAX_CAP + 1, 64)
 
 struct virtio_admin_cmd_query_cap_id_result {
-	__le64 supported_caps[MAX_CAP_ID];
+	__le64 supported_caps[VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE];
 };
 
 struct virtio_admin_cmd_cap_get_data {
-- 
2.50.1


* [PATCH net-next v11 04/12] virtio: Expose object create and destroy API
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (2 preceding siblings ...)
  2025-11-18 14:38 ` [PATCH net-next v11 03/12] virtio: Expose generic device capability operations Daniel Jurgens
@ 2025-11-18 14:38 ` Daniel Jurgens
  2025-11-18 22:14   ` Michael S. Tsirkin
  2025-11-18 14:38 ` [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

Object create and destroy were implemented specifically for dev parts
device objects. Create general purpose APIs for use by upper layer
drivers.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

---
v4: Moved this logic from virtio_pci_modern to new file
    virtio_admin_commands.
v5: Added missing params, and synced names in comments (Alok Tiwari)
---
 drivers/virtio/virtio_admin_commands.c | 75 ++++++++++++++++++++++++++
 include/linux/virtio_admin.h           | 44 +++++++++++++++
 2 files changed, 119 insertions(+)

diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
index 94751d16b3c4..2b80548ba3bc 100644
--- a/drivers/virtio/virtio_admin_commands.c
+++ b/drivers/virtio/virtio_admin_commands.c
@@ -88,3 +88,78 @@ int virtio_admin_cap_set(struct virtio_device *vdev,
 	return err;
 }
 EXPORT_SYMBOL_GPL(virtio_admin_cap_set);
+
+int virtio_admin_obj_create(struct virtio_device *vdev,
+			    u16 obj_type,
+			    u32 obj_id,
+			    u16 group_type,
+			    u64 group_member_id,
+			    const void *obj_specific_data,
+			    size_t obj_specific_data_size)
+{
+	size_t data_size = sizeof(struct virtio_admin_cmd_resource_obj_create_data);
+	struct virtio_admin_cmd_resource_obj_create_data *obj_create_data;
+	struct virtio_admin_cmd cmd = {};
+	struct scatterlist data_sg;
+	void *data;
+	int err;
+
+	if (!vdev->config->admin_cmd_exec)
+		return -EOPNOTSUPP;
+
+	data_size += obj_specific_data_size;
+	data = kzalloc(data_size, GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	obj_create_data = data;
+	obj_create_data->hdr.type = cpu_to_le16(obj_type);
+	obj_create_data->hdr.id = cpu_to_le32(obj_id);
+	memcpy(obj_create_data->resource_obj_specific_data, obj_specific_data,
+	       obj_specific_data_size);
+	sg_init_one(&data_sg, data, data_size);
+
+	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_CREATE);
+	cmd.group_type = cpu_to_le16(group_type);
+	cmd.group_member_id = cpu_to_le64(group_member_id);
+	cmd.data_sg = &data_sg;
+
+	err = vdev->config->admin_cmd_exec(vdev, &cmd);
+	kfree(data);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(virtio_admin_obj_create);
+
+int virtio_admin_obj_destroy(struct virtio_device *vdev,
+			     u16 obj_type,
+			     u32 obj_id,
+			     u16 group_type,
+			     u64 group_member_id)
+{
+	struct virtio_admin_cmd_resource_obj_cmd_hdr *data;
+	struct virtio_admin_cmd cmd = {};
+	struct scatterlist data_sg;
+	int err;
+
+	if (!vdev->config->admin_cmd_exec)
+		return -EOPNOTSUPP;
+
+	data = kzalloc(sizeof(*data), GFP_KERNEL);
+	if (!data)
+		return -ENOMEM;
+
+	data->type = cpu_to_le16(obj_type);
+	data->id = cpu_to_le32(obj_id);
+	sg_init_one(&data_sg, data, sizeof(*data));
+	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_DESTROY);
+	cmd.group_type = cpu_to_le16(group_type);
+	cmd.group_member_id = cpu_to_le64(group_member_id);
+	cmd.data_sg = &data_sg;
+
+	err = vdev->config->admin_cmd_exec(vdev, &cmd);
+	kfree(data);
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(virtio_admin_obj_destroy);
diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
index 36df97b6487a..039b996f73ec 100644
--- a/include/linux/virtio_admin.h
+++ b/include/linux/virtio_admin.h
@@ -77,4 +77,48 @@ int virtio_admin_cap_set(struct virtio_device *vdev,
 			 const void *caps,
 			 size_t cap_size);
 
+/**
+ * virtio_admin_obj_create - Create an object on a virtio device
+ * @vdev: the virtio device
+ * @obj_type: type of object to create
+ * @obj_id: ID for the new object
+ * @group_type: administrative group type for the operation
+ * @group_member_id: member identifier within the administrative group
+ * @obj_specific_data: object-specific data for creation
+ * @obj_specific_data_size: size of the object-specific data in bytes
+ *
+ * Creates a new object on the virtio device with the specified type and ID.
+ * The object may require object-specific data for proper initialization.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or object creation, or a negative error code on other failures.
+ */
+int virtio_admin_obj_create(struct virtio_device *vdev,
+			    u16 obj_type,
+			    u32 obj_id,
+			    u16 group_type,
+			    u64 group_member_id,
+			    const void *obj_specific_data,
+			    size_t obj_specific_data_size);
+
+/**
+ * virtio_admin_obj_destroy - Destroy an object on a virtio device
+ * @vdev: the virtio device
+ * @obj_type: type of object to destroy
+ * @obj_id: ID of the object to destroy
+ * @group_type: administrative group type for the operation
+ * @group_member_id: member identifier within the administrative group
+ *
+ * Destroys an existing object on the virtio device with the specified type
+ * and ID.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or object destruction, or a negative error code on other failures.
+ */
+int virtio_admin_obj_destroy(struct virtio_device *vdev,
+			     u16 obj_type,
+			     u32 obj_id,
+			     u16 group_type,
+			     u64 group_member_id);
+
 #endif /* _LINUX_VIRTIO_ADMIN_H */
-- 
2.50.1


* [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (3 preceding siblings ...)
  2025-11-18 14:38 ` [PATCH net-next v11 04/12] virtio: Expose object create and destroy API Daniel Jurgens
@ 2025-11-18 14:38 ` Daniel Jurgens
  2025-11-18 22:06   ` Michael S. Tsirkin
                     ` (3 more replies)
  2025-11-18 14:38 ` [PATCH net-next v11 06/12] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
                   ` (6 subsequent siblings)
  11 siblings, 4 replies; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

When probing a virtnet device, attempt to read the flow filter
capabilities. In order to use the feature the caps must also
be set. For now setting what was read is sufficient.
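
Reading the selector capability needs a buffer sized for the worst case. The
following userspace sketch shows one way that size could be computed: a
fixed header, one selector descriptor per mask type, and the largest mask
each type can carry. It is only an approximation of the sizing logic in
virtnet_ff_init. The sketch_* names are hypothetical, the enum values assume
the mask types are numbered 1 through 5 in the order ETH, IPv4, IPv6, TCP,
UDP, and the header and selector sizes are passed in rather than taken from
the UAPI structs. The byte counts are the standard option-less header sizes.

```c
#include <stddef.h>

/* Assumed numbering: mask types start at 1, in this order. */
enum sketch_mask_type {
	SKETCH_ETH = 1,
	SKETCH_IPV4,
	SKETCH_IPV6,
	SKETCH_TCP,
	SKETCH_UDP,
	SKETCH_TYPE_MAX = SKETCH_UDP,
};

/* Largest mask per type: the size of the corresponding protocol header
 * without options or extension headers.
 */
static size_t sketch_mask_size(int type)
{
	switch (type) {
	case SKETCH_ETH:
		return 14;
	case SKETCH_IPV4:
		return 20;
	case SKETCH_IPV6:
		return 40;
	case SKETCH_TCP:
		return 20;
	case SKETCH_UDP:
		return 8;
	}

	return 0;
}

/* Worst-case buffer: header + one selector per type + every mask. */
static size_t sketch_selector_buf_size(size_t hdr_size, size_t selector_size)
{
	size_t total = hdr_size + selector_size * SKETCH_TYPE_MAX;

	for (int i = 1; i <= SKETCH_TYPE_MAX; i++)
		total += sketch_mask_size(i);

	return total;
}
```

Because the device reports its own selector lengths, the driver still has to
validate the size actually returned against this upper bound before using it.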

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>

---
v4:
    - Validate the length in the selector caps
    - Removed __free usage.
    - Removed for(int.
v5:
    - Remove unneeded () after MAX_SEL_LEN macro (test bot)
v6:
    - Fix sparse warning "array of flexible structures" Jakub K/Simon H
    - Use new variable and validate ff_mask_size before set_cap. MST
v7:
    - Set ff->ff_{caps, mask, actions} NULL in error path. Paolo Abeni
    - Return errors from virtnet_ff_init, -EOPNOTSUPP is not fatal. Xuan

v8:
    - Use real_ff_mask_size when setting the selector caps. Jason Wang

v9:
    - Set err after failed memory allocations. Simon Horman

v10:
    - Return -EOPNOTSUPP in virtnet_ff_init before allocing any memory.
      Jason/Paolo.

v11:
    - Return -EINVAL if any resource limit is 0. Simon Horman
    - Ensure we don't overrun alloced space of ff->ff_mask by moving the
      real_ff_mask_size > ff_mask_size check into the loop. Simon Horman
---
 drivers/net/virtio_net.c           | 201 +++++++++++++++++++++++++++++
 include/linux/virtio_admin.h       |   1 +
 include/uapi/linux/virtio_net_ff.h |  91 +++++++++++++
 3 files changed, 293 insertions(+)
 create mode 100644 include/uapi/linux/virtio_net_ff.h

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index cfa006b88688..3615f45ac358 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -26,6 +26,9 @@
 #include <net/netdev_rx_queue.h>
 #include <net/netdev_queues.h>
 #include <net/xdp_sock_drv.h>
+#include <linux/virtio_admin.h>
+#include <net/ipv6.h>
+#include <net/ip.h>
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -281,6 +284,14 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
 	VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
 };
 
+struct virtnet_ff {
+	struct virtio_device *vdev;
+	bool ff_supported;
+	struct virtio_net_ff_cap_data *ff_caps;
+	struct virtio_net_ff_cap_mask_data *ff_mask;
+	struct virtio_net_ff_actions *ff_actions;
+};
+
 #define VIRTNET_Q_TYPE_RX 0
 #define VIRTNET_Q_TYPE_TX 1
 #define VIRTNET_Q_TYPE_CQ 2
@@ -493,6 +504,8 @@ struct virtnet_info {
 	struct failover *failover;
 
 	u64 device_stats_cap;
+
+	struct virtnet_ff ff;
 };
 
 struct padded_vnet_hdr {
@@ -6774,6 +6787,183 @@ static const struct xdp_metadata_ops virtnet_xdp_metadata_ops = {
 	.xmo_rx_hash			= virtnet_xdp_rx_hash,
 };
 
+static size_t get_mask_size(u16 type)
+{
+	switch (type) {
+	case VIRTIO_NET_FF_MASK_TYPE_ETH:
+		return sizeof(struct ethhdr);
+	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
+		return sizeof(struct iphdr);
+	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
+		return sizeof(struct ipv6hdr);
+	case VIRTIO_NET_FF_MASK_TYPE_TCP:
+		return sizeof(struct tcphdr);
+	case VIRTIO_NET_FF_MASK_TYPE_UDP:
+		return sizeof(struct udphdr);
+	}
+
+	return 0;
+}
+
+#define MAX_SEL_LEN (sizeof(struct ipv6hdr))
+
+static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
+{
+	size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
+			      sizeof(struct virtio_net_ff_selector) *
+			      VIRTIO_NET_FF_MASK_TYPE_MAX;
+	struct virtio_admin_cmd_query_cap_id_result *cap_id_list;
+	struct virtio_net_ff_selector *sel;
+	size_t real_ff_mask_size;
+	int err;
+	int i;
+
+	if (!vdev->config->admin_cmd_exec)
+		return -EOPNOTSUPP;
+
+	cap_id_list = kzalloc(sizeof(*cap_id_list), GFP_KERNEL);
+	if (!cap_id_list)
+		return -ENOMEM;
+
+	err = virtio_admin_cap_id_list_query(vdev, cap_id_list);
+	if (err)
+		goto err_cap_list;
+
+	if (!(VIRTIO_CAP_IN_LIST(cap_id_list,
+				 VIRTIO_NET_FF_RESOURCE_CAP) &&
+	      VIRTIO_CAP_IN_LIST(cap_id_list,
+				 VIRTIO_NET_FF_SELECTOR_CAP) &&
+	      VIRTIO_CAP_IN_LIST(cap_id_list,
+				 VIRTIO_NET_FF_ACTION_CAP))) {
+		err = -EOPNOTSUPP;
+		goto err_cap_list;
+	}
+
+	ff->ff_caps = kzalloc(sizeof(*ff->ff_caps), GFP_KERNEL);
+	if (!ff->ff_caps) {
+		err = -ENOMEM;
+		goto err_cap_list;
+	}
+
+	err = virtio_admin_cap_get(vdev,
+				   VIRTIO_NET_FF_RESOURCE_CAP,
+				   ff->ff_caps,
+				   sizeof(*ff->ff_caps));
+
+	if (err)
+		goto err_ff;
+
+	if (!ff->ff_caps->groups_limit ||
+	    !ff->ff_caps->classifiers_limit ||
+	    !ff->ff_caps->rules_limit ||
+	    !ff->ff_caps->rules_per_group_limit) {
+		err = -EINVAL;
+		goto err_ff;
+	}
+
+	/* VIRTIO_NET_FF_MASK_TYPE start at 1 */
+	for (i = 1; i <= VIRTIO_NET_FF_MASK_TYPE_MAX; i++)
+		ff_mask_size += get_mask_size(i);
+
+	ff->ff_mask = kzalloc(ff_mask_size, GFP_KERNEL);
+	if (!ff->ff_mask) {
+		err = -ENOMEM;
+		goto err_ff;
+	}
+
+	err = virtio_admin_cap_get(vdev,
+				   VIRTIO_NET_FF_SELECTOR_CAP,
+				   ff->ff_mask,
+				   ff_mask_size);
+
+	if (err)
+		goto err_ff_mask;
+
+	ff->ff_actions = kzalloc(sizeof(*ff->ff_actions) +
+					VIRTIO_NET_FF_ACTION_MAX,
+					GFP_KERNEL);
+	if (!ff->ff_actions) {
+		err = -ENOMEM;
+		goto err_ff_mask;
+	}
+
+	err = virtio_admin_cap_get(vdev,
+				   VIRTIO_NET_FF_ACTION_CAP,
+				   ff->ff_actions,
+				   sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
+
+	if (err)
+		goto err_ff_action;
+
+	err = virtio_admin_cap_set(vdev,
+				   VIRTIO_NET_FF_RESOURCE_CAP,
+				   ff->ff_caps,
+				   sizeof(*ff->ff_caps));
+	if (err)
+		goto err_ff_action;
+
+	real_ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
+	sel = (void *)&ff->ff_mask->selectors[0];
+
+	for (i = 0; i < ff->ff_mask->count; i++) {
+		if (sel->length > MAX_SEL_LEN) {
+			err = -EINVAL;
+			goto err_ff_action;
+		}
+		real_ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
+		if (real_ff_mask_size > ff_mask_size) {
+			err = -EINVAL;
+			goto err_ff_action;
+		}
+		sel = (void *)sel + sizeof(*sel) + sel->length;
+	}
+
+	err = virtio_admin_cap_set(vdev,
+				   VIRTIO_NET_FF_SELECTOR_CAP,
+				   ff->ff_mask,
+				   real_ff_mask_size);
+	if (err)
+		goto err_ff_action;
+
+	err = virtio_admin_cap_set(vdev,
+				   VIRTIO_NET_FF_ACTION_CAP,
+				   ff->ff_actions,
+				   sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
+	if (err)
+		goto err_ff_action;
+
+	ff->vdev = vdev;
+	ff->ff_supported = true;
+
+	kfree(cap_id_list);
+
+	return 0;
+
+err_ff_action:
+	kfree(ff->ff_actions);
+	ff->ff_actions = NULL;
+err_ff_mask:
+	kfree(ff->ff_mask);
+	ff->ff_mask = NULL;
+err_ff:
+	kfree(ff->ff_caps);
+	ff->ff_caps = NULL;
+err_cap_list:
+	kfree(cap_id_list);
+
+	return err;
+}
+
+static void virtnet_ff_cleanup(struct virtnet_ff *ff)
+{
+	if (!ff->ff_supported)
+		return;
+
+	kfree(ff->ff_actions);
+	kfree(ff->ff_mask);
+	kfree(ff->ff_caps);
+}
+
 static int virtnet_probe(struct virtio_device *vdev)
 {
 	int i, err = -ENOMEM;
@@ -7137,6 +7327,15 @@ static int virtnet_probe(struct virtio_device *vdev)
 	}
 	vi->guest_offloads_capable = vi->guest_offloads;
 
+	/* Initialize flow filters. Not supported is an acceptable and common
+	 * return code
+	 */
+	err = virtnet_ff_init(&vi->ff, vi->vdev);
+	if (err && err != -EOPNOTSUPP) {
+		rtnl_unlock();
+		goto free_unregister_netdev;
+	}
+
 	rtnl_unlock();
 
 	err = virtnet_cpu_notif_add(vi);
@@ -7152,6 +7351,7 @@ static int virtnet_probe(struct virtio_device *vdev)
 
 free_unregister_netdev:
 	unregister_netdev(dev);
+	virtnet_ff_cleanup(&vi->ff);
 free_failover:
 	net_failover_destroy(vi->failover);
 free_vqs:
@@ -7201,6 +7401,7 @@ static void virtnet_remove(struct virtio_device *vdev)
 	virtnet_free_irq_moder(vi);
 
 	unregister_netdev(vi->dev);
+	virtnet_ff_cleanup(&vi->ff);
 
 	net_failover_destroy(vi->failover);
 
diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
index 039b996f73ec..db0f42346ca9 100644
--- a/include/linux/virtio_admin.h
+++ b/include/linux/virtio_admin.h
@@ -3,6 +3,7 @@
  * Header file for virtio admin operations
  */
 #include <uapi/linux/virtio_pci.h>
+#include <uapi/linux/virtio_net_ff.h>
 
 #ifndef _LINUX_VIRTIO_ADMIN_H
 #define _LINUX_VIRTIO_ADMIN_H
diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
new file mode 100644
index 000000000000..bd7a194a9959
--- /dev/null
+++ b/include/uapi/linux/virtio_net_ff.h
@@ -0,0 +1,91 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+ *
+ * Header file for virtio_net flow filters
+ */
+#ifndef _LINUX_VIRTIO_NET_FF_H
+#define _LINUX_VIRTIO_NET_FF_H
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+
+#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
+#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
+#define VIRTIO_NET_FF_ACTION_CAP 0x802
+
+/**
+ * struct virtio_net_ff_cap_data - Flow filter resource capability limits
+ * @groups_limit: maximum number of flow filter groups supported by the device
+ * @classifiers_limit: maximum number of classifiers supported by the device
+ * @rules_limit: maximum number of rules supported device-wide across all groups
+ * @rules_per_group_limit: maximum number of rules allowed in a single group
+ * @last_rule_priority: priority value associated with the lowest-priority rule
+ * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
+ *
+ * The limits are reported by the device and describe resource capacities for
+ * flow filters. Multi-byte fields are little-endian.
+ */
+struct virtio_net_ff_cap_data {
+	__le32 groups_limit;
+	__le32 classifiers_limit;
+	__le32 rules_limit;
+	__le32 rules_per_group_limit;
+	__u8 last_rule_priority;
+	__u8 selectors_per_classifier_limit;
+};
+
+/**
+ * struct virtio_net_ff_selector - Selector mask descriptor
+ * @type: selector type, one of VIRTIO_NET_FF_MASK_TYPE_* constants
+ * @flags: selector flags, see VIRTIO_NET_FF_MASK_F_* constants
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @length: size in bytes of @mask
+ * @reserved1: must be set to 0 by the driver and ignored by the device
+ * @mask: variable-length mask payload for @type, length given by @length
+ *
+ * A selector describes a header mask that a classifier can apply. The format
+ * of @mask depends on @type.
+ */
+struct virtio_net_ff_selector {
+	__u8 type;
+	__u8 flags;
+	__u8 reserved[2];
+	__u8 length;
+	__u8 reserved1[3];
+	__u8 mask[];
+};
+
+#define VIRTIO_NET_FF_MASK_TYPE_ETH  1
+#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
+#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
+#define VIRTIO_NET_FF_MASK_TYPE_TCP  4
+#define VIRTIO_NET_FF_MASK_TYPE_UDP  5
+#define VIRTIO_NET_FF_MASK_TYPE_MAX  VIRTIO_NET_FF_MASK_TYPE_UDP
+
+/**
+ * struct virtio_net_ff_cap_mask_data - Supported selector mask formats
+ * @count: number of entries in @selectors
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @selectors: array of supported selector descriptors
+ */
+struct virtio_net_ff_cap_mask_data {
+	__u8 count;
+	__u8 reserved[7];
+	__u8 selectors[];
+};
+#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
+
+#define VIRTIO_NET_FF_ACTION_DROP 1
+#define VIRTIO_NET_FF_ACTION_RX_VQ 2
+#define VIRTIO_NET_FF_ACTION_MAX  VIRTIO_NET_FF_ACTION_RX_VQ
+/**
+ * struct virtio_net_ff_actions - Supported flow actions
+ * @count: number of supported actions in @actions
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @actions: array of action identifiers (VIRTIO_NET_FF_ACTION_*)
+ */
+struct virtio_net_ff_actions {
+	__u8 count;
+	__u8 reserved[7];
+	__u8 actions[];
+};
+#endif
-- 
2.50.1



* [PATCH net-next v11 06/12] virtio_net: Create a FF group for ethtool steering
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (4 preceding siblings ...)
  2025-11-18 14:38 ` [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
@ 2025-11-18 14:38 ` Daniel Jurgens
  2025-11-19  9:36   ` Michael S. Tsirkin
  2025-11-18 14:38 ` [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

All ethtool steering rules go in a single flow filter group. Create that
group during initialization and destroy it during cleanup.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: Documented UAPI
---
 drivers/net/virtio_net.c           | 29 +++++++++++++++++++++++++++++
 include/uapi/linux/virtio_net_ff.h | 15 +++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 3615f45ac358..900d597726f7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -284,6 +284,9 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
 	VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
 };
 
+#define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
+#define VIRTNET_FF_MAX_GROUPS 1
+
 struct virtnet_ff {
 	struct virtio_device *vdev;
 	bool ff_supported;
@@ -6812,6 +6815,7 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 	size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
 			      sizeof(struct virtio_net_ff_selector) *
 			      VIRTIO_NET_FF_MASK_TYPE_MAX;
+	struct virtio_net_resource_obj_ff_group ethtool_group = {};
 	struct virtio_admin_cmd_query_cap_id_result *cap_id_list;
 	struct virtio_net_ff_selector *sel;
 	size_t real_ff_mask_size;
@@ -6895,6 +6899,12 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 	if (err)
 		goto err_ff_action;
 
+	if (le32_to_cpu(ff->ff_caps->groups_limit) < VIRTNET_FF_MAX_GROUPS) {
+		err = -ENOSPC;
+		goto err_ff_action;
+	}
+	ff->ff_caps->groups_limit = cpu_to_le32(VIRTNET_FF_MAX_GROUPS);
+
 	err = virtio_admin_cap_set(vdev,
 				   VIRTIO_NET_FF_RESOURCE_CAP,
 				   ff->ff_caps,
@@ -6932,6 +6942,19 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 	if (err)
 		goto err_ff_action;
 
+	ethtool_group.group_priority = cpu_to_le16(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
+
+	/* Use priority for the object ID. */
+	err = virtio_admin_obj_create(vdev,
+				      VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
+				      VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
+				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
+				      0,
+				      &ethtool_group,
+				      sizeof(ethtool_group));
+	if (err)
+		goto err_ff_action;
+
 	ff->vdev = vdev;
 	ff->ff_supported = true;
 
@@ -6959,6 +6982,12 @@ static void virtnet_ff_cleanup(struct virtnet_ff *ff)
 	if (!ff->ff_supported)
 		return;
 
+	virtio_admin_obj_destroy(ff->vdev,
+				 VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
+				 VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
+				 VIRTIO_ADMIN_GROUP_TYPE_SELF,
+				 0);
+
 	kfree(ff->ff_actions);
 	kfree(ff->ff_mask);
 	kfree(ff->ff_caps);
diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
index bd7a194a9959..6d1f953c2b46 100644
--- a/include/uapi/linux/virtio_net_ff.h
+++ b/include/uapi/linux/virtio_net_ff.h
@@ -12,6 +12,8 @@
 #define VIRTIO_NET_FF_SELECTOR_CAP 0x801
 #define VIRTIO_NET_FF_ACTION_CAP 0x802
 
+#define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
+
 /**
  * struct virtio_net_ff_cap_data - Flow filter resource capability limits
  * @groups_limit: maximum number of flow filter groups supported by the device
@@ -88,4 +90,17 @@ struct virtio_net_ff_actions {
 	__u8 reserved[7];
 	__u8 actions[];
 };
+
+/**
+ * struct virtio_net_resource_obj_ff_group - Flow filter group object
+ * @group_priority: priority of the group used to order evaluation
+ *
+ * This structure is the payload for the VIRTIO_NET_RESOURCE_OBJ_FF_GROUP
+ * administrative object. Devices use @group_priority to order flow filter
+ * groups. Multi-byte fields are little-endian.
+ */
+struct virtio_net_resource_obj_ff_group {
+	__le16 group_priority;
+};
+
 #endif
-- 
2.50.1



* [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (5 preceding siblings ...)
  2025-11-18 14:38 ` [PATCH net-next v11 06/12] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
@ 2025-11-18 14:38 ` Daniel Jurgens
  2025-11-18 19:01   ` Michael S. Tsirkin
  2025-11-19  9:26   ` Michael S. Tsirkin
  2025-11-18 14:38 ` [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
                   ` (4 subsequent siblings)
  11 siblings, 2 replies; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

Filtering a flow requires a classifier to match the packets, and a rule
to filter on the matches.

A classifier consists of one or more selectors. There is one selector
per header type. A selector must only use fields set in the selector
capability. If partial matching is supported, the classifier mask for a
particular field can be a subset of the mask for that field in the
capability.
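
The mask rule above can be sketched in plain userspace C. This is only an
illustration of the check described in this message, not the driver code;
the name mask_allowed() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch: returns true when the requested mask for one header
 * field is permitted by the capability mask the device reported.
 */
static bool mask_allowed(const uint8_t *mask, const uint8_t *cap,
			 size_t len, bool partial)
{
	size_t i;

	assert(mask && cap);

	for (i = 0; i < len; i++) {
		if (partial) {
			/* Partial matching: every requested bit must be in cap. */
			if ((mask[i] & cap[i]) != mask[i])
				return false;
		} else if (mask[i] != cap[i]) {
			/* Otherwise the mask must equal the capability exactly. */
			return false;
		}
	}

	return true;
}
```

With partial matching a mask of ff:00 is accepted against a capability of
ff:ff; without it, only ff:ff itself is accepted.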

The rule consists of a priority, an action and a key. The key is a byte
array containing headers corresponding to the selectors in the
classifier.
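
As a rough sketch of that layout, the following userspace C builds a
classifier with one Ethernet selector and the matching key for an exact
destination MAC match. The sketch_* names are hypothetical stand-ins that
mirror the UAPI structs added by this patch; the real driver builds these
buffers differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MASK_TYPE_ETH 1	/* mirrors VIRTIO_NET_FF_MASK_TYPE_ETH */
#define SKETCH_ETH_HLEN 14	/* dst MAC (6) + src MAC (6) + ethertype (2) */

/* Hypothetical stand-in for struct virtio_net_ff_selector. */
struct sketch_selector {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;
	uint8_t reserved1[3];
	uint8_t mask[];
};

/* Hypothetical stand-in for struct virtio_net_resource_obj_ff_classifier. */
struct sketch_classifier {
	uint8_t count;
	uint8_t reserved[7];
	uint8_t selectors[];
};

/* Bytes needed for a classifier holding a single Ethernet selector. */
static size_t sketch_eth_classifier_size(void)
{
	return sizeof(struct sketch_classifier) +
	       sizeof(struct sketch_selector) + SKETCH_ETH_HLEN;
}

/*
 * Fill buf (at least sketch_eth_classifier_size() bytes) with the classifier
 * and key (SKETCH_ETH_HLEN bytes) with the header values to match.
 */
static void sketch_match_dst_mac(uint8_t *buf, uint8_t *key,
				 const uint8_t dst[6])
{
	struct sketch_classifier *c = (void *)buf;
	struct sketch_selector *sel = (void *)c->selectors;

	assert(buf && key && dst);

	memset(buf, 0, sketch_eth_classifier_size());
	memset(key, 0, SKETCH_ETH_HLEN);

	c->count = 1;
	sel->type = SKETCH_MASK_TYPE_ETH;
	sel->length = SKETCH_ETH_HLEN;

	/* Mask: compare all 48 bits of the destination MAC, nothing else. */
	memset(sel->mask, 0xff, 6);
	/* Key: an Ethernet header image carrying the value to compare. */
	memcpy(key, dst, 6);
}
```

The key has the same length as the selector's mask, so each key byte lines
up with the mask byte that decides whether it takes part in the match.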

This patch implements ethtool rules for Ethernet headers.

Example:
$ ethtool -U ens9 flow-type ether dst 08:11:22:33:44:54 action 30
Added rule with ID 1

The rule in the example directs received packets with the specified
destination MAC address to receive queue 30.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4:
    - Fixed double free bug in error flows
    - Build bug on for classifier struct ordering.
    - (u8 *) to (void *) casting.
    - Documentation in UAPI
    - Answered questions about overflow with no changes.
v6:
    - Fix sparse warning "array of flexible structures" Jakub K/Simon H
v7:
    - Move for (int i -> for (i hunk from next patch. Paolo Abeni
---
 drivers/net/virtio_net.c           | 462 +++++++++++++++++++++++++++++
 include/uapi/linux/virtio_net_ff.h |  50 ++++
 2 files changed, 512 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 900d597726f7..de1a23c71449 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -284,6 +284,11 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
 	VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
 };
 
+struct virtnet_ethtool_ff {
+	struct xarray rules;
+	int    num_rules;
+};
+
 #define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
 #define VIRTNET_FF_MAX_GROUPS 1
 
@@ -293,8 +298,16 @@ struct virtnet_ff {
 	struct virtio_net_ff_cap_data *ff_caps;
 	struct virtio_net_ff_cap_mask_data *ff_mask;
 	struct virtio_net_ff_actions *ff_actions;
+	struct xarray classifiers;
+	int num_classifiers;
+	struct virtnet_ethtool_ff ethtool;
 };
 
+static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
+				       struct ethtool_rx_flow_spec *fs,
+				       u16 curr_queue_pairs);
+static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location);
+
 #define VIRTNET_Q_TYPE_RX 0
 #define VIRTNET_Q_TYPE_TX 1
 #define VIRTNET_Q_TYPE_CQ 2
@@ -5653,6 +5666,21 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
 	return vi->curr_queue_pairs;
 }
 
+static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+
+	switch (info->cmd) {
+	case ETHTOOL_SRXCLSRLINS:
+		return virtnet_ethtool_flow_insert(&vi->ff, &info->fs,
+						   vi->curr_queue_pairs);
+	case ETHTOOL_SRXCLSRLDEL:
+		return virtnet_ethtool_flow_remove(&vi->ff, info->fs.location);
+	}
+
+	return -EOPNOTSUPP;
+}
+
 static const struct ethtool_ops virtnet_ethtool_ops = {
 	.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
 		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
@@ -5679,6 +5707,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
 	.get_rxfh_fields = virtnet_get_hashflow,
 	.set_rxfh_fields = virtnet_set_hashflow,
 	.get_rx_ring_count = virtnet_get_rx_ring_count,
+	.set_rxnfc = virtnet_set_rxnfc,
 };
 
 static void virtnet_get_queue_stats_rx(struct net_device *dev, int i,
@@ -6790,6 +6819,428 @@ static const struct xdp_metadata_ops virtnet_xdp_metadata_ops = {
 	.xmo_rx_hash			= virtnet_xdp_rx_hash,
 };
 
+struct virtnet_ethtool_rule {
+	struct ethtool_rx_flow_spec flow_spec;
+	u32 classifier_id;
+};
+
+/* The classifier struct must be the last field in this struct */
+struct virtnet_classifier {
+	size_t size;
+	u32 id;
+	struct virtio_net_resource_obj_ff_classifier classifier;
+};
+
+static_assert(sizeof(struct virtnet_classifier) ==
+	      ALIGN(offsetofend(struct virtnet_classifier, classifier),
+		    __alignof__(struct virtnet_classifier)),
+	      "virtnet_classifier: classifier must be the last member");
+
+static bool check_mask_vs_cap(const void *m, const void *c,
+			      u16 len, bool partial)
+{
+	const u8 *mask = m;
+	const u8 *cap = c;
+	int i;
+
+	for (i = 0; i < len; i++) {
+		if (partial && ((mask[i] & cap[i]) != mask[i]))
+			return false;
+		if (!partial && mask[i] != cap[i])
+			return false;
+	}
+
+	return true;
+}
+
+static
+struct virtio_net_ff_selector *get_selector_cap(const struct virtnet_ff *ff,
+						u8 selector_type)
+{
+	struct virtio_net_ff_selector *sel;
+	void *buf;
+	int i;
+
+	buf = &ff->ff_mask->selectors;
+	sel = buf;
+
+	for (i = 0; i < ff->ff_mask->count; i++) {
+		if (sel->type == selector_type)
+			return sel;
+
+		buf += sizeof(struct virtio_net_ff_selector) + sel->length;
+		sel = buf;
+	}
+
+	return NULL;
+}
+
+static bool validate_eth_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct ethhdr *cap, *mask;
+	struct ethhdr zeros = {};
+
+	cap = (struct ethhdr *)&sel_cap->mask;
+	mask = (struct ethhdr *)&sel->mask;
+
+	if (memcmp(&zeros.h_dest, mask->h_dest, sizeof(zeros.h_dest)) &&
+	    !check_mask_vs_cap(mask->h_dest, cap->h_dest,
+			       sizeof(mask->h_dest), partial_mask))
+		return false;
+
+	if (memcmp(&zeros.h_source, mask->h_source, sizeof(zeros.h_source)) &&
+	    !check_mask_vs_cap(mask->h_source, cap->h_source,
+			       sizeof(mask->h_source), partial_mask))
+		return false;
+
+	if (mask->h_proto &&
+	    !check_mask_vs_cap(&mask->h_proto, &cap->h_proto,
+			       sizeof(__be16), partial_mask))
+		return false;
+
+	return true;
+}
+
+static bool validate_mask(const struct virtnet_ff *ff,
+			  const struct virtio_net_ff_selector *sel)
+{
+	struct virtio_net_ff_selector *sel_cap = get_selector_cap(ff, sel->type);
+
+	if (!sel_cap)
+		return false;
+
+	switch (sel->type) {
+	case VIRTIO_NET_FF_MASK_TYPE_ETH:
+		return validate_eth_mask(ff, sel, sel_cap);
+	}
+
+	return false;
+}
+
+static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
+{
+	int err;
+
+	err = xa_alloc(&ff->classifiers, &c->id, c,
+		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
+		       GFP_KERNEL);
+	if (err)
+		return err;
+
+	err = virtio_admin_obj_create(ff->vdev,
+				      VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
+				      c->id,
+				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
+				      0,
+				      &c->classifier,
+				      c->size);
+	if (err)
+		goto err_xarray;
+
+	return 0;
+
+err_xarray:
+	xa_erase(&ff->classifiers, c->id);
+
+	return err;
+}
+
+static void destroy_classifier(struct virtnet_ff *ff,
+			       u32 classifier_id)
+{
+	struct virtnet_classifier *c;
+
+	c = xa_load(&ff->classifiers, classifier_id);
+	if (c) {
+		virtio_admin_obj_destroy(ff->vdev,
+					 VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
+					 c->id,
+					 VIRTIO_ADMIN_GROUP_TYPE_SELF,
+					 0);
+
+		xa_erase(&ff->classifiers, c->id);
+		kfree(c);
+	}
+}
+
+static void destroy_ethtool_rule(struct virtnet_ff *ff,
+				 struct virtnet_ethtool_rule *eth_rule)
+{
+	ff->ethtool.num_rules--;
+
+	virtio_admin_obj_destroy(ff->vdev,
+				 VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
+				 eth_rule->flow_spec.location,
+				 VIRTIO_ADMIN_GROUP_TYPE_SELF,
+				 0);
+
+	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
+	destroy_classifier(ff, eth_rule->classifier_id);
+	kfree(eth_rule);
+}
+
+static int insert_rule(struct virtnet_ff *ff,
+		       struct virtnet_ethtool_rule *eth_rule,
+		       u32 classifier_id,
+		       const u8 *key,
+		       size_t key_size)
+{
+	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
+	struct virtio_net_resource_obj_ff_rule *ff_rule;
+	int err;
+
+	ff_rule = kzalloc(sizeof(*ff_rule) + key_size, GFP_KERNEL);
+	if (!ff_rule)
+		return -ENOMEM;
+
+	/* Intentionally leave the priority as 0. All rules have the same
+	 * priority.
+	 */
+	ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
+	ff_rule->classifier_id = cpu_to_le32(classifier_id);
+	ff_rule->key_length = (u8)key_size;
+	ff_rule->action = fs->ring_cookie == RX_CLS_FLOW_DISC ?
+					     VIRTIO_NET_FF_ACTION_DROP :
+					     VIRTIO_NET_FF_ACTION_RX_VQ;
+	ff_rule->vq_index = fs->ring_cookie != RX_CLS_FLOW_DISC ?
+					       cpu_to_le16(fs->ring_cookie) : 0;
+	memcpy(&ff_rule->keys, key, key_size);
+
+	err = virtio_admin_obj_create(ff->vdev,
+				      VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
+				      fs->location,
+				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
+				      0,
+				      ff_rule,
+				      sizeof(*ff_rule) + key_size);
+	if (err)
+		goto err_ff_rule;
+
+	eth_rule->classifier_id = classifier_id;
+	ff->ethtool.num_rules++;
+	kfree(ff_rule);
+
+	return 0;
+
+err_ff_rule:
+	kfree(ff_rule);
+
+	return err;
+}
+
+static u32 flow_type_mask(u32 flow_type)
+{
+	return flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
+}
+
+static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
+{
+	switch (fs->flow_type) {
+	case ETHER_FLOW:
+		return true;
+	}
+
+	return false;
+}
+
+static int validate_flow_input(struct virtnet_ff *ff,
+			       const struct ethtool_rx_flow_spec *fs,
+			       u16 curr_queue_pairs)
+{
+	/* Force users to use RX_CLS_LOC_ANY - don't allow specific locations */
+	if (fs->location != RX_CLS_LOC_ANY)
+		return -EOPNOTSUPP;
+
+	if (fs->ring_cookie != RX_CLS_FLOW_DISC &&
+	    fs->ring_cookie >= curr_queue_pairs)
+		return -EINVAL;
+
+	if (fs->flow_type != flow_type_mask(fs->flow_type))
+		return -EOPNOTSUPP;
+
+	if (!supported_flow_type(fs))
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
+static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
+				 size_t *key_size, size_t *classifier_size,
+				 int *num_hdrs)
+{
+	*num_hdrs = 1;
+	*key_size = sizeof(struct ethhdr);
+	/*
+	 * The classifier size is the size of the classifier header, a selector
+	 * header for each type of header in the match criteria, and each header
+	 * providing the mask for matching against.
+	 */
+	*classifier_size = *key_size +
+			   sizeof(struct virtio_net_resource_obj_ff_classifier) +
+			   sizeof(struct virtio_net_ff_selector) * (*num_hdrs);
+}
+
+static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
+				   u8 *key,
+				   const struct ethtool_rx_flow_spec *fs)
+{
+	struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
+	struct ethhdr *eth_k = (struct ethhdr *)key;
+
+	selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
+	selector->length = sizeof(struct ethhdr);
+
+	memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
+	memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
+}
+
+static int
+validate_classifier_selectors(struct virtnet_ff *ff,
+			      struct virtio_net_resource_obj_ff_classifier *classifier,
+			      int num_hdrs)
+{
+	struct virtio_net_ff_selector *selector = (void *)classifier->selectors;
+	int i;
+
+	for (i = 0; i < num_hdrs; i++) {
+		if (!validate_mask(ff, selector))
+			return -EINVAL;
+
+		selector = (((void *)selector) + sizeof(*selector) +
+					selector->length);
+	}
+
+	return 0;
+}
+
+static int build_and_insert(struct virtnet_ff *ff,
+			    struct virtnet_ethtool_rule *eth_rule)
+{
+	struct virtio_net_resource_obj_ff_classifier *classifier;
+	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
+	struct virtio_net_ff_selector *selector;
+	struct virtnet_classifier *c;
+	size_t classifier_size;
+	size_t key_size;
+	int num_hdrs;
+	u8 *key;
+	int err;
+
+	calculate_flow_sizes(fs, &key_size, &classifier_size, &num_hdrs);
+
+	key = kzalloc(key_size, GFP_KERNEL);
+	if (!key)
+		return -ENOMEM;
+
+	/*
+	 * virtio_net_ff_obj_ff_classifier is already included in the
+	 * classifier_size.
+	 */
+	c = kzalloc(classifier_size +
+		    sizeof(struct virtnet_classifier) -
+		    sizeof(struct virtio_net_resource_obj_ff_classifier),
+		    GFP_KERNEL);
+	if (!c) {
+		kfree(key);
+		return -ENOMEM;
+	}
+
+	c->size = classifier_size;
+	classifier = &c->classifier;
+	classifier->count = num_hdrs;
+	selector = (void *)&classifier->selectors[0];
+
+	setup_eth_hdr_key_mask(selector, key, fs);
+
+	err = validate_classifier_selectors(ff, classifier, num_hdrs);
+	if (err)
+		goto err_key;
+
+	err = setup_classifier(ff, c);
+	if (err)
+		goto err_classifier;
+
+	err = insert_rule(ff, eth_rule, c->id, key, key_size);
+	if (err) {
+		/* destroy_classifier will free the classifier */
+		destroy_classifier(ff, c->id);
+		goto err_key;
+	}
+
+	return 0;
+
+err_classifier:
+	kfree(c);
+err_key:
+	kfree(key);
+
+	return err;
+}
+
+static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
+				       struct ethtool_rx_flow_spec *fs,
+				       u16 curr_queue_pairs)
+{
+	struct virtnet_ethtool_rule *eth_rule;
+	int err;
+
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	err = validate_flow_input(ff, fs, curr_queue_pairs);
+	if (err)
+		return err;
+
+	eth_rule = kzalloc(sizeof(*eth_rule), GFP_KERNEL);
+	if (!eth_rule)
+		return -ENOMEM;
+
+	err = xa_alloc(&ff->ethtool.rules, &fs->location, eth_rule,
+		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->rules_limit) - 1),
+		       GFP_KERNEL);
+	if (err)
+		goto err_rule;
+
+	eth_rule->flow_spec = *fs;
+
+	err = build_and_insert(ff, eth_rule);
+	if (err)
+		goto err_xa;
+
+	return err;
+
+err_xa:
+	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
+
+err_rule:
+	fs->location = RX_CLS_LOC_ANY;
+	kfree(eth_rule);
+
+	return err;
+}
+
+static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
+{
+	struct virtnet_ethtool_rule *eth_rule;
+	int err = 0;
+
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	eth_rule = xa_load(&ff->ethtool.rules, location);
+	if (!eth_rule) {
+		err = -ENOENT;
+		goto out;
+	}
+
+	destroy_ethtool_rule(ff, eth_rule);
+out:
+	return err;
+}
+
 static size_t get_mask_size(u16 type)
 {
 	switch (type) {
@@ -6955,6 +7406,8 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 	if (err)
 		goto err_ff_action;
 
+	xa_init_flags(&ff->classifiers, XA_FLAGS_ALLOC);
+	xa_init_flags(&ff->ethtool.rules, XA_FLAGS_ALLOC);
 	ff->vdev = vdev;
 	ff->ff_supported = true;
 
@@ -6979,9 +7432,18 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
 
 static void virtnet_ff_cleanup(struct virtnet_ff *ff)
 {
+	struct virtnet_ethtool_rule *eth_rule;
+	unsigned long i;
+
 	if (!ff->ff_supported)
 		return;
 
+	xa_for_each(&ff->ethtool.rules, i, eth_rule)
+		destroy_ethtool_rule(ff, eth_rule);
+
+	xa_destroy(&ff->ethtool.rules);
+	xa_destroy(&ff->classifiers);
+
 	virtio_admin_obj_destroy(ff->vdev,
 				 VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
 				 VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
index 6d1f953c2b46..c98aa4942bee 100644
--- a/include/uapi/linux/virtio_net_ff.h
+++ b/include/uapi/linux/virtio_net_ff.h
@@ -13,6 +13,8 @@
 #define VIRTIO_NET_FF_ACTION_CAP 0x802
 
 #define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
+#define VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER 0x0201
+#define VIRTIO_NET_RESOURCE_OBJ_FF_RULE 0x0202
 
 /**
  * struct virtio_net_ff_cap_data - Flow filter resource capability limits
@@ -103,4 +105,52 @@ struct virtio_net_resource_obj_ff_group {
 	__le16 group_priority;
 };
 
+/**
+ * struct virtio_net_resource_obj_ff_classifier - Flow filter classifier object
+ * @count: number of selector entries in @selectors
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @selectors: array of selector descriptors that define match masks
+ *
+ * Payload for the VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER administrative object.
+ * Each selector describes a header mask used to match packets
+ * (see struct virtio_net_ff_selector). Selectors appear in the order they are
+ * to be applied.
+ */
+struct virtio_net_resource_obj_ff_classifier {
+	__u8 count;
+	__u8 reserved[7];
+	__u8 selectors[];
+};
+
+/**
+ * struct virtio_net_resource_obj_ff_rule - Flow filter rule object
+ * @group_id: identifier of the target flow filter group
+ * @classifier_id: identifier of the classifier referenced by this rule
+ * @rule_priority: relative priority of this rule within the group
+ * @key_length: number of bytes in @keys
+ * @action: action to perform, one of VIRTIO_NET_FF_ACTION_*
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @vq_index: RX virtqueue index for VIRTIO_NET_FF_ACTION_RX_VQ, 0 otherwise
+ * @reserved1: must be set to 0 by the driver and ignored by the device
+ * @keys: concatenated key bytes matching the classifier's selectors order
+ *
+ * Payload for the VIRTIO_NET_RESOURCE_OBJ_FF_RULE administrative object.
+ * @group_id and @classifier_id refer to previously created objects of types
+ * VIRTIO_NET_RESOURCE_OBJ_FF_GROUP and VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER
+ * respectively. The key bytes are compared against packet headers using the
+ * masks provided by the classifier's selectors. Multi-byte fields are
+ * little-endian.
+ */
+struct virtio_net_resource_obj_ff_rule {
+	__le32 group_id;
+	__le32 classifier_id;
+	__u8 rule_priority;
+	__u8 key_length; /* length of key in bytes */
+	__u8 action;
+	__u8 reserved;
+	__le16 vq_index;
+	__u8 reserved1[2];
+	__u8 keys[];
+};
+
 #endif
-- 
2.50.1



* [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (6 preceding siblings ...)
  2025-11-18 14:38 ` [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
@ 2025-11-18 14:38 ` Daniel Jurgens
  2025-11-18 21:55   ` Michael S. Tsirkin
                     ` (2 more replies)
  2025-11-18 14:38 ` [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
                   ` (3 subsequent siblings)
  11 siblings, 3 replies; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

Classifiers can be used by more than one rule. If there is an existing
classifier, use it instead of creating a new one.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4:
    - Fixed typo in commit message
    - for (int -> for (

v8:
    - Removed unused num_classifiers. Jason Wang
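
For readers following along, the reuse logic can be sketched in plain
userspace C. This is an illustration only, not the driver code: the fixed
table, the 32-byte blob, and the names classifier_get() and
classifier_put() are all hypothetical stand-ins for the xarray, the
selector payload, and setup_classifier()/try_destroy_classifier().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CLASSIFIERS  8	/* hypothetical table size */
#define CLASSIFIER_BYTES 32	/* hypothetical upper bound on blob size */

struct classifier {
	size_t size;				/* bytes used in data[] */
	unsigned int refcount;			/* rules using this entry */
	unsigned char data[CLASSIFIER_BYTES];	/* stand-in for the selectors */
};

static struct classifier *table[MAX_CLASSIFIERS];

/* Return the id of a matching entry or of a newly stored one; -1 on failure. */
static int classifier_get(const void *data, size_t size)
{
	struct classifier *c;
	int free_slot = -1;

	if (size > CLASSIFIER_BYTES)
		return -1;

	for (int i = 0; i < MAX_CLASSIFIERS; i++) {
		c = table[i];
		if (!c) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		/* Same size and same bytes: share the existing entry. */
		if (c->size == size && !memcmp(c->data, data, size)) {
			c->refcount++;
			return i;
		}
	}

	if (free_slot < 0)
		return -1;

	c = calloc(1, sizeof(*c));
	if (!c)
		return -1;

	c->size = size;
	c->refcount = 1;
	memcpy(c->data, data, size);
	table[free_slot] = c;

	return free_slot;
}

/* Drop one reference; free the entry when the last user goes away. */
static void classifier_put(int id)
{
	struct classifier *c = table[id];

	if (c && --c->refcount == 0) {
		free(c);
		table[id] = NULL;
	}
}
```

Two rules with identical selectors end up sharing one entry, and the entry
only disappears once both rules have released it.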
---
 drivers/net/virtio_net.c | 40 +++++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index de1a23c71449..f392ea30f2c7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -299,7 +299,6 @@ struct virtnet_ff {
 	struct virtio_net_ff_cap_mask_data *ff_mask;
 	struct virtio_net_ff_actions *ff_actions;
 	struct xarray classifiers;
-	int num_classifiers;
 	struct virtnet_ethtool_ff ethtool;
 };
 
@@ -6827,6 +6826,7 @@ struct virtnet_ethtool_rule {
 /* The classifier struct must be the last field in this struct */
 struct virtnet_classifier {
 	size_t size;
+	refcount_t refcount;
 	u32 id;
 	struct virtio_net_resource_obj_ff_classifier classifier;
 };
@@ -6920,11 +6920,24 @@ static bool validate_mask(const struct virtnet_ff *ff,
 	return false;
 }
 
-static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
+static int setup_classifier(struct virtnet_ff *ff,
+			    struct virtnet_classifier **c)
 {
+	struct virtnet_classifier *tmp;
+	unsigned long i;
 	int err;
 
-	err = xa_alloc(&ff->classifiers, &c->id, c,
+	xa_for_each(&ff->classifiers, i, tmp) {
+		if ((*c)->size == tmp->size &&
+		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
+			refcount_inc(&tmp->refcount);
+			kfree(*c);
+			*c = tmp;
+			goto out;
+		}
+	}
+
+	err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
 		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
 		       GFP_KERNEL);
 	if (err)
@@ -6932,29 +6945,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
 
 	err = virtio_admin_obj_create(ff->vdev,
 				      VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
-				      c->id,
+				      (*c)->id,
 				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
 				      0,
-				      &c->classifier,
-				      c->size);
+				      &(*c)->classifier,
+				      (*c)->size);
 	if (err)
 		goto err_xarray;
 
+	refcount_set(&(*c)->refcount, 1);
+out:
 	return 0;
 
 err_xarray:
-	xa_erase(&ff->classifiers, c->id);
+	xa_erase(&ff->classifiers, (*c)->id);
 
 	return err;
 }
 
-static void destroy_classifier(struct virtnet_ff *ff,
-			       u32 classifier_id)
+static void try_destroy_classifier(struct virtnet_ff *ff, u32 classifier_id)
 {
 	struct virtnet_classifier *c;
 
 	c = xa_load(&ff->classifiers, classifier_id);
-	if (c) {
+	if (c && refcount_dec_and_test(&c->refcount)) {
 		virtio_admin_obj_destroy(ff->vdev,
 					 VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
 					 c->id,
@@ -6978,7 +6992,7 @@ static void destroy_ethtool_rule(struct virtnet_ff *ff,
 				 0);
 
 	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
-	destroy_classifier(ff, eth_rule->classifier_id);
+	try_destroy_classifier(ff, eth_rule->classifier_id);
 	kfree(eth_rule);
 }
 
@@ -7159,14 +7173,14 @@ static int build_and_insert(struct virtnet_ff *ff,
 	if (err)
 		goto err_key;
 
-	err = setup_classifier(ff, c);
+	err = setup_classifier(ff, &c);
 	if (err)
 		goto err_classifier;
 
 	err = insert_rule(ff, eth_rule, c->id, key, key_size);
 	if (err) {
 		/* destroy_classifier will free the classifier */
-		destroy_classifier(ff, c->id);
+		try_destroy_classifier(ff, c->id);
 		goto err_key;
 	}
 
-- 
2.50.1



* [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (7 preceding siblings ...)
  2025-11-18 14:38 ` [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
@ 2025-11-18 14:38 ` Daniel Jurgens
  2025-11-18 21:31   ` Michael S. Tsirkin
  2025-11-18 14:39 ` [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:38 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

Add support for ethtool IP_USER_FLOW (flow-type ip4) rules.

Example:
$ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action -1
Added rule with ID 1

The example rule drops packets with the specified source IP address.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4:
    - Fixed bug in protocol check of parse_ip4
    - (u8 *) to (void *) casting.
    - Alignment issues.
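
The selector walk this patch introduces with next_selector() can be
sketched in userspace as follows. The struct below is a simplified
stand-in assumed for illustration (an 8-byte header followed by length
mask bytes); put_selector() and selectors_size() are hypothetical helpers
that do not exist in the driver.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in: a fixed header followed by length bytes of mask. */
struct selector {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[5];
	uint8_t length;		/* number of bytes in mask[] */
	uint8_t mask[];
};

/* Step over the header and the variable-length mask that follows it. */
static struct selector *next_selector(struct selector *sel)
{
	return (struct selector *)((uint8_t *)sel + sizeof(*sel) + sel->length);
}

/* Write a selector with an all-zero mask at sel; return the next free slot. */
static struct selector *put_selector(struct selector *sel, uint8_t type,
				     uint8_t length)
{
	memset(sel, 0, sizeof(*sel) + length);
	sel->type = type;
	sel->length = length;

	return next_selector(sel);
}

/* Total bytes occupied by count consecutive selectors starting at first. */
static size_t selectors_size(struct selector *first, int count)
{
	struct selector *sel = first;

	for (int i = 0; i < count; i++)
		sel = next_selector(sel);

	return (size_t)((uint8_t *)sel - (uint8_t *)first);
}
```

With a 14-byte Ethernet mask followed by a 20-byte IPv4 mask, the two
selectors occupy 8 + 14 + 8 + 20 = 50 bytes under this assumed layout.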
---
 drivers/net/virtio_net.c | 122 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 115 insertions(+), 7 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f392ea30f2c7..c1adba60b6a8 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -6904,6 +6904,34 @@ static bool validate_eth_mask(const struct virtnet_ff *ff,
 	return true;
 }
 
+static bool validate_ip4_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct iphdr *cap, *mask;
+
+	cap = (struct iphdr *)&sel_cap->mask;
+	mask = (struct iphdr *)&sel->mask;
+
+	if (mask->saddr &&
+	    !check_mask_vs_cap(&mask->saddr, &cap->saddr,
+			       sizeof(__be32), partial_mask))
+		return false;
+
+	if (mask->daddr &&
+	    !check_mask_vs_cap(&mask->daddr, &cap->daddr,
+			       sizeof(__be32), partial_mask))
+		return false;
+
+	if (mask->protocol &&
+	    !check_mask_vs_cap(&mask->protocol, &cap->protocol,
+			       sizeof(u8), partial_mask))
+		return false;
+
+	return true;
+}
+
 static bool validate_mask(const struct virtnet_ff *ff,
 			  const struct virtio_net_ff_selector *sel)
 {
@@ -6915,11 +6943,36 @@ static bool validate_mask(const struct virtnet_ff *ff,
 	switch (sel->type) {
 	case VIRTIO_NET_FF_MASK_TYPE_ETH:
 		return validate_eth_mask(ff, sel, sel_cap);
+
+	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
+		return validate_ip4_mask(ff, sel, sel_cap);
 	}
 
 	return false;
 }
 
+static void parse_ip4(struct iphdr *mask, struct iphdr *key,
+		      const struct ethtool_rx_flow_spec *fs)
+{
+	const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
+	const struct ethtool_usrip4_spec *l3_val  = &fs->h_u.usr_ip4_spec;
+
+	mask->saddr = l3_mask->ip4src;
+	mask->daddr = l3_mask->ip4dst;
+	key->saddr = l3_val->ip4src;
+	key->daddr = l3_val->ip4dst;
+
+	if (l3_mask->proto) {
+		mask->protocol = l3_mask->proto;
+		key->protocol = l3_val->proto;
+	}
+}
+
+static bool has_ipv4(u32 flow_type)
+{
+	return flow_type == IP_USER_FLOW;
+}
+
 static int setup_classifier(struct virtnet_ff *ff,
 			    struct virtnet_classifier **c)
 {
@@ -7054,6 +7107,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
 {
 	switch (fs->flow_type) {
 	case ETHER_FLOW:
+	case IP_USER_FLOW:
 		return true;
 	}
 
@@ -7082,11 +7136,23 @@ static int validate_flow_input(struct virtnet_ff *ff,
 }
 
 static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
-				 size_t *key_size, size_t *classifier_size,
-				 int *num_hdrs)
+				size_t *key_size, size_t *classifier_size,
+				int *num_hdrs)
 {
+	size_t size = sizeof(struct ethhdr);
+
 	*num_hdrs = 1;
 	*key_size = sizeof(struct ethhdr);
+
+	if (fs->flow_type == ETHER_FLOW)
+		goto done;
+
+	++(*num_hdrs);
+	if (has_ipv4(fs->flow_type))
+		size += sizeof(struct iphdr);
+
+done:
+	*key_size = size;
 	/*
 	 * The classifier size is the size of the classifier header, a selector
 	 * header for each type of header in the match criteria, and each header
@@ -7098,8 +7164,9 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
 }
 
 static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
-				   u8 *key,
-				   const struct ethtool_rx_flow_spec *fs)
+				  u8 *key,
+				  const struct ethtool_rx_flow_spec *fs,
+				  int num_hdrs)
 {
 	struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
 	struct ethhdr *eth_k = (struct ethhdr *)key;
@@ -7107,8 +7174,33 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
 	selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
 	selector->length = sizeof(struct ethhdr);
 
-	memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
-	memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
+	if (num_hdrs > 1) {
+		eth_m->h_proto = cpu_to_be16(0xffff);
+		eth_k->h_proto = cpu_to_be16(ETH_P_IP);
+	} else {
+		memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
+		memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
+	}
+}
+
+static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
+			     u8 *key,
+			     const struct ethtool_rx_flow_spec *fs)
+{
+	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
+	struct iphdr *v4_k = (struct iphdr *)key;
+
+	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
+	selector->length = sizeof(struct iphdr);
+
+	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
+	    fs->h_u.usr_ip4_spec.tos ||
+	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
+		return -EOPNOTSUPP;
+
+	parse_ip4(v4_m, v4_k, fs);
+
+	return 0;
 }
 
 static int
@@ -7130,6 +7222,13 @@ validate_classifier_selectors(struct virtnet_ff *ff,
 	return 0;
 }
 
+static
+struct virtio_net_ff_selector *next_selector(struct virtio_net_ff_selector *sel)
+{
+	return (void *)sel + sizeof(struct virtio_net_ff_selector) +
+		sel->length;
+}
+
 static int build_and_insert(struct virtnet_ff *ff,
 			    struct virtnet_ethtool_rule *eth_rule)
 {
@@ -7167,8 +7266,17 @@ static int build_and_insert(struct virtnet_ff *ff,
 	classifier->count = num_hdrs;
 	selector = (void *)&classifier->selectors[0];
 
-	setup_eth_hdr_key_mask(selector, key, fs);
+	setup_eth_hdr_key_mask(selector, key, fs, num_hdrs);
+	if (num_hdrs == 1)
+		goto validate;
+
+	selector = next_selector(selector);
+
+	err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
+	if (err)
+		goto err_classifier;
 
+validate:
 	err = validate_classifier_selectors(ff, classifier, num_hdrs);
 	if (err)
 		goto err_key;
-- 
2.50.1



* [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (8 preceding siblings ...)
  2025-11-18 14:38 ` [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
@ 2025-11-18 14:39 ` Daniel Jurgens
  2025-11-18 21:45   ` Michael S. Tsirkin
  2025-11-18 21:48   ` Michael S. Tsirkin
  2025-11-18 14:39 ` [PATCH net-next v11 11/12] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
  2025-11-18 14:39 ` [PATCH net-next v11 12/12] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
  11 siblings, 2 replies; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:39 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

Implement support for IPV6_USER_FLOW type rules.

Example:
$ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
Added rule with ID 0

The example rule will forward packets with the specified source and
destination IP addresses to RX ring 3.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: commit message typo
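
The per-field checks in validate_ip6_mask() follow one pattern: a field
whose mask is all zero is skipped, anything else is compared against the
device capability. A userspace sketch of that pattern is below.
check_mask_vs_cap() is not shown in this part of the thread, so the rule
used here is an assumption (subset of the capability bits when partial
masks are allowed, exact match otherwise), and all function names are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when every byte of the mask is zero, i.e. the field is not matched. */
static bool field_unused(const uint8_t *mask, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (mask[i])
			return false;

	return true;
}

/*
 * Assumed rule: with partial masks allowed, the requested bits must be a
 * subset of the capability bits; otherwise the two must match exactly.
 */
static bool mask_allowed(const uint8_t *mask, const uint8_t *cap,
			 size_t len, bool partial)
{
	for (size_t i = 0; i < len; i++) {
		if (partial ? (mask[i] & ~cap[i]) != 0 : mask[i] != cap[i])
			return false;
	}

	return true;
}

/* A field passes when it is unused or its mask is allowed by the device. */
static bool field_ok(const uint8_t *mask, const uint8_t *cap,
		     size_t len, bool partial)
{
	return field_unused(mask, len) || mask_allowed(mask, cap, len, partial);
}
```

Under this assumed rule, a prefix mask on a 16-byte address is accepted
only when the device advertises partial-mask support.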
---
 drivers/net/virtio_net.c | 89 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 81 insertions(+), 8 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index c1adba60b6a8..78fc8f01b6c4 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -6932,6 +6932,34 @@ static bool validate_ip4_mask(const struct virtnet_ff *ff,
 	return true;
 }
 
+static bool validate_ip6_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct ipv6hdr *cap, *mask;
+
+	cap = (struct ipv6hdr *)&sel_cap->mask;
+	mask = (struct ipv6hdr *)&sel->mask;
+
+	if (!ipv6_addr_any(&mask->saddr) &&
+	    !check_mask_vs_cap(&mask->saddr, &cap->saddr,
+			       sizeof(cap->saddr), partial_mask))
+		return false;
+
+	if (!ipv6_addr_any(&mask->daddr) &&
+	    !check_mask_vs_cap(&mask->daddr, &cap->daddr,
+			       sizeof(cap->daddr), partial_mask))
+		return false;
+
+	if (mask->nexthdr &&
+	    !check_mask_vs_cap(&mask->nexthdr, &cap->nexthdr,
+			       sizeof(cap->nexthdr), partial_mask))
+		return false;
+
+	return true;
+}
+
 static bool validate_mask(const struct virtnet_ff *ff,
 			  const struct virtio_net_ff_selector *sel)
 {
@@ -6946,6 +6974,9 @@ static bool validate_mask(const struct virtnet_ff *ff,
 
 	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
 		return validate_ip4_mask(ff, sel, sel_cap);
+
+	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
+		return validate_ip6_mask(ff, sel, sel_cap);
 	}
 
 	return false;
@@ -6968,11 +6999,38 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
 	}
 }
 
+static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
+		      const struct ethtool_rx_flow_spec *fs)
+{
+	const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
+	const struct ethtool_usrip6_spec *l3_val  = &fs->h_u.usr_ip6_spec;
+
+	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
+		memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
+		memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
+	}
+
+	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
+		memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
+		memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
+	}
+
+	if (l3_mask->l4_proto) {
+		mask->nexthdr = l3_mask->l4_proto;
+		key->nexthdr = l3_val->l4_proto;
+	}
+}
+
 static bool has_ipv4(u32 flow_type)
 {
 	return flow_type == IP_USER_FLOW;
 }
 
+static bool has_ipv6(u32 flow_type)
+{
+	return flow_type == IPV6_USER_FLOW;
+}
+
 static int setup_classifier(struct virtnet_ff *ff,
 			    struct virtnet_classifier **c)
 {
@@ -7108,6 +7166,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
 	switch (fs->flow_type) {
 	case ETHER_FLOW:
 	case IP_USER_FLOW:
+	case IPV6_USER_FLOW:
 		return true;
 	}
 
@@ -7150,7 +7209,8 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
 	++(*num_hdrs);
 	if (has_ipv4(fs->flow_type))
 		size += sizeof(struct iphdr);
-
+	else if (has_ipv6(fs->flow_type))
+		size += sizeof(struct ipv6hdr);
 done:
 	*key_size = size;
 	/*
@@ -7187,18 +7247,31 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
 			     u8 *key,
 			     const struct ethtool_rx_flow_spec *fs)
 {
+	struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
 	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
+	struct ipv6hdr *v6_k = (struct ipv6hdr *)key;
 	struct iphdr *v4_k = (struct iphdr *)key;
 
-	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
-	selector->length = sizeof(struct iphdr);
+	if (has_ipv6(fs->flow_type)) {
+		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
+		selector->length = sizeof(struct ipv6hdr);
 
-	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
-	    fs->h_u.usr_ip4_spec.tos ||
-	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
-		return -EOPNOTSUPP;
+		if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
+		    fs->h_u.usr_ip6_spec.tclass)
+			return -EOPNOTSUPP;
 
-	parse_ip4(v4_m, v4_k, fs);
+		parse_ip6(v6_m, v6_k, fs);
+	} else {
+		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
+		selector->length = sizeof(struct iphdr);
+
+		if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
+		    fs->h_u.usr_ip4_spec.tos ||
+		    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
+			return -EOPNOTSUPP;
+
+		parse_ip4(v4_m, v4_k, fs);
+	}
 
 	return 0;
 }
-- 
2.50.1



* [PATCH net-next v11 11/12] virtio_net: Add support for TCP and UDP ethtool rules
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (9 preceding siblings ...)
  2025-11-18 14:39 ` [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
@ 2025-11-18 14:39 ` Daniel Jurgens
  2025-11-19  9:14   ` Michael S. Tsirkin
  2025-11-18 14:39 ` [PATCH net-next v11 12/12] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
  11 siblings, 1 reply; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:39 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

Implement TCP and UDP V4/V6 ethtool flow types.

Examples:
$ ethtool -U ens9 flow-type udp4 dst-ip 192.168.5.2 dst-port \
    4321 action 20
Added rule with ID 4

This example directs IPv4 UDP traffic with the specified address and
port to queue 20.

$ ethtool -U ens9 flow-type tcp6 src-ip 2001:db8::1 src-port 1234 dst-ip \
    2001:db8::2 dst-port 4321 action 12
Added rule with ID 5

This example directs IPv6 TCP traffic with the specified source and
destination addresses and ports to queue 12.

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: (*num_hdrs)++ to ++(*num_hdrs)
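
The key is the concatenation of one header image per selector, so the
offsets used in build_and_insert() follow from the header sizes. A
userspace sketch of that arithmetic is below; the enum, the struct, and
key_layout() are hypothetical, and the constants are the standard
option-less header sizes (Ethernet 14, IPv4 20, IPv6 40, TCP 20, UDP 8).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flow kinds standing in for the ethtool flow types. */
enum flow {
	FLOW_ETHER, FLOW_IP4, FLOW_IP6,
	FLOW_TCP4, FLOW_UDP4, FLOW_TCP6, FLOW_UDP6,
};

struct key_layout {
	int num_hdrs;		/* selectors in the classifier */
	size_t l3_off;		/* offset of the IP header image in the key */
	size_t l4_off;		/* offset of the TCP/UDP header image */
	size_t key_size;	/* total key bytes */
};

static bool flow_is_v6(enum flow f)
{
	return f == FLOW_IP6 || f == FLOW_TCP6 || f == FLOW_UDP6;
}

static bool flow_is_tcp(enum flow f)
{
	return f == FLOW_TCP4 || f == FLOW_TCP6;
}

static bool flow_is_udp(enum flow f)
{
	return f == FLOW_UDP4 || f == FLOW_UDP6;
}

/* Every key starts with the Ethernet header; L3 and L4 are appended. */
static struct key_layout key_layout(enum flow f)
{
	struct key_layout l = { .num_hdrs = 1, .key_size = 14 };

	if (f == FLOW_ETHER)
		return l;

	l.l3_off = l.key_size;
	l.key_size += flow_is_v6(f) ? 40 : 20;
	l.num_hdrs++;

	if (flow_is_tcp(f) || flow_is_udp(f)) {
		l.l4_off = l.key_size;
		l.key_size += flow_is_tcp(f) ? 20 : 8;
		l.num_hdrs++;
	}

	return l;
}
```

For a udp4 rule this gives three headers with the UDP image at offset 34
and a 42-byte key; for tcp6 the TCP image sits at offset 54 in a 74-byte
key.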
---
 drivers/net/virtio_net.c | 207 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 198 insertions(+), 9 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 78fc8f01b6c4..17e33927f434 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -6960,6 +6960,52 @@ static bool validate_ip6_mask(const struct virtnet_ff *ff,
 	return true;
 }
 
+static bool validate_tcp_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct tcphdr *cap, *mask;
+
+	cap = (struct tcphdr *)&sel_cap->mask;
+	mask = (struct tcphdr *)&sel->mask;
+
+	if (mask->source &&
+	    !check_mask_vs_cap(&mask->source, &cap->source,
+			       sizeof(cap->source), partial_mask))
+		return false;
+
+	if (mask->dest &&
+	    !check_mask_vs_cap(&mask->dest, &cap->dest,
+			       sizeof(cap->dest), partial_mask))
+		return false;
+
+	return true;
+}
+
+static bool validate_udp_mask(const struct virtnet_ff *ff,
+			      const struct virtio_net_ff_selector *sel,
+			      const struct virtio_net_ff_selector *sel_cap)
+{
+	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+	struct udphdr *cap, *mask;
+
+	cap = (struct udphdr *)&sel_cap->mask;
+	mask = (struct udphdr *)&sel->mask;
+
+	if (mask->source &&
+	    !check_mask_vs_cap(&mask->source, &cap->source,
+			       sizeof(cap->source), partial_mask))
+		return false;
+
+	if (mask->dest &&
+	    !check_mask_vs_cap(&mask->dest, &cap->dest,
+			       sizeof(cap->dest), partial_mask))
+		return false;
+
+	return true;
+}
+
 static bool validate_mask(const struct virtnet_ff *ff,
 			  const struct virtio_net_ff_selector *sel)
 {
@@ -6977,11 +7023,45 @@ static bool validate_mask(const struct virtnet_ff *ff,
 
 	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
 		return validate_ip6_mask(ff, sel, sel_cap);
+
+	case VIRTIO_NET_FF_MASK_TYPE_TCP:
+		return validate_tcp_mask(ff, sel, sel_cap);
+
+	case VIRTIO_NET_FF_MASK_TYPE_UDP:
+		return validate_udp_mask(ff, sel, sel_cap);
 	}
 
 	return false;
 }
 
+static void set_tcp(struct tcphdr *mask, struct tcphdr *key,
+		    __be16 psrc_m, __be16 psrc_k,
+		    __be16 pdst_m, __be16 pdst_k)
+{
+	if (psrc_m) {
+		mask->source = psrc_m;
+		key->source = psrc_k;
+	}
+	if (pdst_m) {
+		mask->dest = pdst_m;
+		key->dest = pdst_k;
+	}
+}
+
+static void set_udp(struct udphdr *mask, struct udphdr *key,
+		    __be16 psrc_m, __be16 psrc_k,
+		    __be16 pdst_m, __be16 pdst_k)
+{
+	if (psrc_m) {
+		mask->source = psrc_m;
+		key->source = psrc_k;
+	}
+	if (pdst_m) {
+		mask->dest = pdst_m;
+		key->dest = pdst_k;
+	}
+}
+
 static void parse_ip4(struct iphdr *mask, struct iphdr *key,
 		      const struct ethtool_rx_flow_spec *fs)
 {
@@ -7023,12 +7103,26 @@ static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
 
 static bool has_ipv4(u32 flow_type)
 {
-	return flow_type == IP_USER_FLOW;
+	return flow_type == TCP_V4_FLOW ||
+	       flow_type == UDP_V4_FLOW ||
+	       flow_type == IP_USER_FLOW;
 }
 
 static bool has_ipv6(u32 flow_type)
 {
-	return flow_type == IPV6_USER_FLOW;
+	return flow_type == TCP_V6_FLOW ||
+	       flow_type == UDP_V6_FLOW ||
+	       flow_type == IPV6_USER_FLOW;
+}
+
+static bool has_tcp(u32 flow_type)
+{
+	return flow_type == TCP_V4_FLOW || flow_type == TCP_V6_FLOW;
+}
+
+static bool has_udp(u32 flow_type)
+{
+	return flow_type == UDP_V4_FLOW || flow_type == UDP_V6_FLOW;
 }
 
 static int setup_classifier(struct virtnet_ff *ff,
@@ -7167,6 +7261,10 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
 	case ETHER_FLOW:
 	case IP_USER_FLOW:
 	case IPV6_USER_FLOW:
+	case TCP_V4_FLOW:
+	case TCP_V6_FLOW:
+	case UDP_V4_FLOW:
+	case UDP_V6_FLOW:
 		return true;
 	}
 
@@ -7211,6 +7309,12 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
 		size += sizeof(struct iphdr);
 	else if (has_ipv6(fs->flow_type))
 		size += sizeof(struct ipv6hdr);
+
+	if (has_tcp(fs->flow_type) || has_udp(fs->flow_type)) {
+		++(*num_hdrs);
+		size += has_tcp(fs->flow_type) ? sizeof(struct tcphdr) :
+						 sizeof(struct udphdr);
+	}
 done:
 	*key_size = size;
 	/*
@@ -7245,7 +7349,8 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
 
 static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
 			     u8 *key,
-			     const struct ethtool_rx_flow_spec *fs)
+			     const struct ethtool_rx_flow_spec *fs,
+			     int num_hdrs)
 {
 	struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
 	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
@@ -7256,21 +7361,93 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
 		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
 		selector->length = sizeof(struct ipv6hdr);
 
-		if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
-		    fs->h_u.usr_ip6_spec.tclass)
+		if (num_hdrs == 2 && (fs->h_u.usr_ip6_spec.l4_4_bytes ||
+				      fs->h_u.usr_ip6_spec.tclass))
 			return -EOPNOTSUPP;
 
 		parse_ip6(v6_m, v6_k, fs);
+
+		if (num_hdrs > 2) {
+			v6_m->nexthdr = 0xff;
+			if (has_tcp(fs->flow_type))
+				v6_k->nexthdr = IPPROTO_TCP;
+			else
+				v6_k->nexthdr = IPPROTO_UDP;
+		}
 	} else {
 		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
 		selector->length = sizeof(struct iphdr);
 
-		if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
-		    fs->h_u.usr_ip4_spec.tos ||
-		    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
+		if (num_hdrs == 2 &&
+		    (fs->h_u.usr_ip4_spec.l4_4_bytes ||
+		     fs->h_u.usr_ip4_spec.tos ||
+		     fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4))
 			return -EOPNOTSUPP;
 
 		parse_ip4(v4_m, v4_k, fs);
+
+		if (num_hdrs > 2) {
+			v4_m->protocol = 0xff;
+			if (has_tcp(fs->flow_type))
+				v4_k->protocol = IPPROTO_TCP;
+			else
+				v4_k->protocol = IPPROTO_UDP;
+		}
+	}
+
+	return 0;
+}
+
+static int setup_transport_key_mask(struct virtio_net_ff_selector *selector,
+				    u8 *key,
+				    struct ethtool_rx_flow_spec *fs)
+{
+	struct tcphdr *tcp_m = (struct tcphdr *)&selector->mask;
+	struct udphdr *udp_m = (struct udphdr *)&selector->mask;
+	const struct ethtool_tcpip6_spec *v6_l4_mask;
+	const struct ethtool_tcpip4_spec *v4_l4_mask;
+	const struct ethtool_tcpip6_spec *v6_l4_key;
+	const struct ethtool_tcpip4_spec *v4_l4_key;
+	struct tcphdr *tcp_k = (struct tcphdr *)key;
+	struct udphdr *udp_k = (struct udphdr *)key;
+
+	if (has_tcp(fs->flow_type)) {
+		selector->type = VIRTIO_NET_FF_MASK_TYPE_TCP;
+		selector->length = sizeof(struct tcphdr);
+
+		if (has_ipv6(fs->flow_type)) {
+			v6_l4_mask = &fs->m_u.tcp_ip6_spec;
+			v6_l4_key = &fs->h_u.tcp_ip6_spec;
+
+			set_tcp(tcp_m, tcp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
+				v6_l4_mask->pdst, v6_l4_key->pdst);
+		} else {
+			v4_l4_mask = &fs->m_u.tcp_ip4_spec;
+			v4_l4_key = &fs->h_u.tcp_ip4_spec;
+
+			set_tcp(tcp_m, tcp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
+				v4_l4_mask->pdst, v4_l4_key->pdst);
+		}
+
+	} else if (has_udp(fs->flow_type)) {
+		selector->type = VIRTIO_NET_FF_MASK_TYPE_UDP;
+		selector->length = sizeof(struct udphdr);
+
+		if (has_ipv6(fs->flow_type)) {
+			v6_l4_mask = &fs->m_u.udp_ip6_spec;
+			v6_l4_key = &fs->h_u.udp_ip6_spec;
+
+			set_udp(udp_m, udp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
+				v6_l4_mask->pdst, v6_l4_key->pdst);
+		} else {
+			v4_l4_mask = &fs->m_u.udp_ip4_spec;
+			v4_l4_key = &fs->h_u.udp_ip4_spec;
+
+			set_udp(udp_m, udp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
+				v4_l4_mask->pdst, v4_l4_key->pdst);
+		}
+	} else {
+		return -EOPNOTSUPP;
 	}
 
 	return 0;
@@ -7310,6 +7487,7 @@ static int build_and_insert(struct virtnet_ff *ff,
 	struct virtio_net_ff_selector *selector;
 	struct virtnet_classifier *c;
 	size_t classifier_size;
+	size_t key_offset;
 	size_t key_size;
 	int num_hdrs;
 	u8 *key;
@@ -7343,9 +7521,20 @@ static int build_and_insert(struct virtnet_ff *ff,
 	if (num_hdrs == 1)
 		goto validate;
 
+	key_offset = selector->length;
+	selector = next_selector(selector);
+
+	err = setup_ip_key_mask(selector, key + key_offset, fs, num_hdrs);
+	if (err)
+		goto err_classifier;
+
+	if (num_hdrs == 2)
+		goto validate;
+
+	key_offset += selector->length;
 	selector = next_selector(selector);
 
-	err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
+	err = setup_transport_key_mask(selector, key + key_offset, fs);
 	if (err)
 		goto err_classifier;
 
-- 
2.50.1



* [PATCH net-next v11 12/12] virtio_net: Add get ethtool flow rules ops
  2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
                   ` (10 preceding siblings ...)
  2025-11-18 14:39 ` [PATCH net-next v11 11/12] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
@ 2025-11-18 14:39 ` Daniel Jurgens
  2025-11-18 18:49   ` Michael S. Tsirkin
  2025-11-18 22:39   ` Michael S. Tsirkin
  11 siblings, 2 replies; 66+ messages in thread
From: Daniel Jurgens @ 2025-11-18 14:39 UTC (permalink / raw)
  To: netdev, mst, jasowang, pabeni
  Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
	kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens

- Get total number of rules. There's no user interface for this. It is
  used to allocate an appropriately sized buffer for getting all the
  rules.

- Get specific rule:
$ ethtool -u ens9 rule 0
	Filter: 0
		Rule Type: UDP over IPv4
		Src IP addr: 0.0.0.0 mask: 255.255.255.255
		Dest IP addr: 192.168.5.2 mask: 0.0.0.0
		TOS: 0x0 mask: 0xff
		Src port: 0 mask: 0xffff
		Dest port: 4321 mask: 0x0
		Action: Direct to queue 16

- Get all rules:
$ ethtool -u ens9
31 RX rings available
Total 2 rules

Filter: 0
        Rule Type: UDP over IPv4
        Src IP addr: 0.0.0.0 mask: 255.255.255.255
        Dest IP addr: 192.168.5.2 mask: 0.0.0.0
...

Filter: 1
        Flow Type: Raw Ethernet
        Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
        Dest MAC addr: 08:11:22:33:44:54 mask: 00:00:00:00:00:00

Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: Answered questions about rules_limit overflow with no changes.
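
The get-all handler in this patch writes every installed location into
rule_locs without consulting info->rule_cnt. A bounded variant, in the
style several other drivers use for ETHTOOL_GRXCLSRLALL, might look like
the userspace sketch below. It is not part of this patch;
collect_rule_locs() and its parameters are hypothetical, and whether the
extra check is needed here is left open.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy installed rule locations into rule_locs. On entry *rule_cnt is the
 * capacity of rule_locs; on success it is updated to the number written.
 */
static int collect_rule_locs(const uint32_t *installed, size_t num_installed,
			     uint32_t *rule_locs, uint32_t *rule_cnt)
{
	uint32_t idx = 0;

	for (size_t i = 0; i < num_installed; i++) {
		/* More rules than the caller's buffer can hold. */
		if (idx == *rule_cnt)
			return -EMSGSIZE;
		rule_locs[idx++] = installed[i];
	}

	*rule_cnt = idx;

	return 0;
}
```

Returning -EMSGSIZE lets the caller retry with a larger buffer instead of
having the handler write past the end of rule_locs.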
---
 drivers/net/virtio_net.c | 78 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 17e33927f434..5823ba12f1eb 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -306,6 +306,13 @@ static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
 				       struct ethtool_rx_flow_spec *fs,
 				       u16 curr_queue_pairs);
 static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location);
+static int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
+					  struct ethtool_rxnfc *info);
+static int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
+				    struct ethtool_rxnfc *info);
+static int
+virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
+			      struct ethtool_rxnfc *info, u32 *rule_locs);
 
 #define VIRTNET_Q_TYPE_RX 0
 #define VIRTNET_Q_TYPE_TX 1
@@ -5665,6 +5672,28 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
 	return vi->curr_queue_pairs;
 }
 
+static int virtnet_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info, u32 *rule_locs)
+{
+	struct virtnet_info *vi = netdev_priv(dev);
+	int rc = 0;
+
+	switch (info->cmd) {
+	case ETHTOOL_GRXCLSRLCNT:
+		rc = virtnet_ethtool_get_flow_count(&vi->ff, info);
+		break;
+	case ETHTOOL_GRXCLSRULE:
+		rc = virtnet_ethtool_get_flow(&vi->ff, info);
+		break;
+	case ETHTOOL_GRXCLSRLALL:
+		rc = virtnet_ethtool_get_all_flows(&vi->ff, info, rule_locs);
+		break;
+	default:
+		rc = -EOPNOTSUPP;
+	}
+
+	return rc;
+}
+
 static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
 {
 	struct virtnet_info *vi = netdev_priv(dev);
@@ -5706,6 +5735,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
 	.get_rxfh_fields = virtnet_get_hashflow,
 	.set_rxfh_fields = virtnet_set_hashflow,
 	.get_rx_ring_count = virtnet_get_rx_ring_count,
+	.get_rxnfc = virtnet_get_rxnfc,
 	.set_rxnfc = virtnet_set_rxnfc,
 };
 
@@ -7625,6 +7655,54 @@ static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
 	return err;
 }
 
+static int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
+					  struct ethtool_rxnfc *info)
+{
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	info->rule_cnt = ff->ethtool.num_rules;
+	info->data = le32_to_cpu(ff->ff_caps->rules_limit) | RX_CLS_LOC_SPECIAL;
+
+	return 0;
+}
+
+static int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
+				    struct ethtool_rxnfc *info)
+{
+	struct virtnet_ethtool_rule *eth_rule;
+
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	eth_rule = xa_load(&ff->ethtool.rules, info->fs.location);
+	if (!eth_rule)
+		return -ENOENT;
+
+	info->fs = eth_rule->flow_spec;
+
+	return 0;
+}
+
+static int
+virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
+			      struct ethtool_rxnfc *info, u32 *rule_locs)
+{
+	struct virtnet_ethtool_rule *eth_rule;
+	unsigned long i = 0;
+	int idx = 0;
+
+	if (!ff->ff_supported)
+		return -EOPNOTSUPP;
+
+	xa_for_each(&ff->ethtool.rules, i, eth_rule)
+		rule_locs[idx++] = i;
+
+	info->data = le32_to_cpu(ff->ff_caps->rules_limit);
+
+	return 0;
+}
+
 static size_t get_mask_size(u16 type)
 {
 	switch (type) {
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 12/12] virtio_net: Add get ethtool flow rules ops
  2025-11-18 14:39 ` [PATCH net-next v11 12/12] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
@ 2025-11-18 18:49   ` Michael S. Tsirkin
  2025-11-19 16:24     ` Dan Jurgens
  2025-11-18 22:39   ` Michael S. Tsirkin
  1 sibling, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 18:49 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:39:02AM -0600, Daniel Jurgens wrote:
> - Get total number of rules. There's no user interface for this. It is
>   used to allocate an appropriately sized buffer for getting all the
>   rules.
> 
> - Get specific rule
> $ ethtool -u ens9 rule 0
> 	Filter: 0
> 		Rule Type: UDP over IPv4
> 		Src IP addr: 0.0.0.0 mask: 255.255.255.255
> 		Dest IP addr: 192.168.5.2 mask: 0.0.0.0
> 		TOS: 0x0 mask: 0xff
> 		Src port: 0 mask: 0xffff
> 		Dest port: 4321 mask: 0x0
> 		Action: Direct to queue 16
> 
> - Get all rules:
> $ ethtool -u ens9
> 31 RX rings available
> Total 2 rules
> 
> Filter: 0
>         Rule Type: UDP over IPv4
>         Src IP addr: 0.0.0.0 mask: 255.255.255.255
>         Dest IP addr: 192.168.5.2 mask: 0.0.0.0
> ...
> 
> Filter: 1
>         Flow Type: Raw Ethernet
>         Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
>         Dest MAC addr: 08:11:22:33:44:54 mask: 00:00:00:00:00:00
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4: Answered questions about rules_limit overflow with no changes.
> ---
>  drivers/net/virtio_net.c | 78 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 78 insertions(+)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 17e33927f434..5823ba12f1eb 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -306,6 +306,13 @@ static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
>  				       struct ethtool_rx_flow_spec *fs,
>  				       u16 curr_queue_pairs);
>  static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location);
> +static int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
> +					  struct ethtool_rxnfc *info);
> +static int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
> +				    struct ethtool_rxnfc *info);
> +static int
> +virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
> +			      struct ethtool_rxnfc *info, u32 *rule_locs);
>  
>  #define VIRTNET_Q_TYPE_RX 0
>  #define VIRTNET_Q_TYPE_TX 1
> @@ -5665,6 +5672,28 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
>  	return vi->curr_queue_pairs;
>  }
>  
> +static int virtnet_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info, u32 *rule_locs)
> +{
> +	struct virtnet_info *vi = netdev_priv(dev);
> +	int rc = 0;
> +
> +	switch (info->cmd) {
> +	case ETHTOOL_GRXCLSRLCNT:
> +		rc = virtnet_ethtool_get_flow_count(&vi->ff, info);
> +		break;
> +	case ETHTOOL_GRXCLSRULE:
> +		rc = virtnet_ethtool_get_flow(&vi->ff, info);
> +		break;
> +	case ETHTOOL_GRXCLSRLALL:
> +		rc = virtnet_ethtool_get_all_flows(&vi->ff, info, rule_locs);
> +		break;
> +	default:
> +		rc = -EOPNOTSUPP;
> +	}
> +
> +	return rc;
> +}
> +
>  static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
>  {
>  	struct virtnet_info *vi = netdev_priv(dev);
> @@ -5706,6 +5735,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
>  	.get_rxfh_fields = virtnet_get_hashflow,
>  	.set_rxfh_fields = virtnet_set_hashflow,
>  	.get_rx_ring_count = virtnet_get_rx_ring_count,
> +	.get_rxnfc = virtnet_get_rxnfc,
>  	.set_rxnfc = virtnet_set_rxnfc,
>  };
>  
> @@ -7625,6 +7655,54 @@ static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
>  	return err;
>  }
>  
> +static int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
> +					  struct ethtool_rxnfc *info)
> +{
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	info->rule_cnt = ff->ethtool.num_rules;
> +	info->data = le32_to_cpu(ff->ff_caps->rules_limit) | RX_CLS_LOC_SPECIAL;
> +
> +	return 0;
> +}
> +
> +static int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
> +				    struct ethtool_rxnfc *info)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	eth_rule = xa_load(&ff->ethtool.rules, info->fs.location);
> +	if (!eth_rule)
> +		return -ENOENT;
> +
> +	info->fs = eth_rule->flow_spec;
> +
> +	return 0;
> +}
> +
> +static int
> +virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
> +			      struct ethtool_rxnfc *info, u32 *rule_locs)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +	unsigned long i = 0;
> +	int idx = 0;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	xa_for_each(&ff->ethtool.rules, i, eth_rule)
> +		rule_locs[idx++] = i;
> +
> +	info->data = le32_to_cpu(ff->ff_caps->rules_limit);
> +
> +	return 0;
> +}

So I see this in the kernel-doc for struct ethtool_rxnfc:

 * For %ETHTOOL_GRXCLSRLALL, @rule_cnt specifies the array size of the
 * user buffer for @rule_locs on entry.  On return, @data is the size
 * of the rule table, @rule_cnt is the number of defined rules, and
 * @rule_locs contains the locations of the defined rules.  Drivers
 * must use the second parameter to get_rxnfc() instead of @rule_locs.
 *


Should this set @rule_cnt? As posted, virtnet_ethtool_get_all_flows() fills
@rule_locs and @data but never writes @rule_cnt.
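
One possible reading of the kernel-doc quoted above, sketched in plain userspace C. A flat array of occupied locations stands in for the driver's xarray. `struct rxnfc_sketch`, `get_all_flows_sketch()` and the -EMSGSIZE return for an undersized buffer are assumptions made for illustration, not code from the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct ethtool_rxnfc fields used here. */
struct rxnfc_sketch {
	uint32_t rule_cnt;	/* in: capacity of rule_locs, out: rules written */
	uint64_t data;		/* out: size of the rule table */
};

static int get_all_flows_sketch(const uint32_t *used_locs, size_t num_used,
				uint32_t rules_limit,
				struct rxnfc_sketch *info, uint32_t *rule_locs)
{
	uint32_t idx = 0;
	size_t i;

	for (i = 0; i < num_used; i++) {
		/* Never write past the capacity the caller passed in. */
		if (idx == info->rule_cnt)
			return -EMSGSIZE;
		rule_locs[idx++] = used_locs[i];
	}

	info->data = rules_limit;
	info->rule_cnt = idx;	/* report how many locations were written */

	return 0;
}
```

The sketch treats the incoming @rule_cnt as the buffer capacity and overwrites it with the number of entries stored, which is how the quoted comment describes the entry and return values.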

> +
>  static size_t get_mask_size(u16 type)
>  {
>  	switch (type) {
> -- 
> 2.50.1



* Re: [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules
  2025-11-18 14:38 ` [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
@ 2025-11-18 19:01   ` Michael S. Tsirkin
  2025-11-19  6:07     ` Dan Jurgens
  2025-11-19  9:20     ` Michael S. Tsirkin
  2025-11-19  9:26   ` Michael S. Tsirkin
  1 sibling, 2 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 19:01 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:57AM -0600, Daniel Jurgens wrote:
> Filtering a flow requires a classifier to match the packets, and a rule
> to filter on the matches.
> 
> A classifier consists of one or more selectors. There is one selector
> per header type. A selector must only use fields set in the selector
> capability. If partial matching is supported, the classifier mask for a
> particular field can be a subset of the mask for that field in the
> capability.
> 
> The rule consists of a priority, an action and a key. The key is a byte
> array containing headers corresponding to the selectors in the
> classifier.
> 
> This patch implements ethtool rules for ethernet headers.
> 
> Example:
> $ ethtool -U ens9 flow-type ether dst 08:11:22:33:44:54 action 30
> Added rule with ID 1
> 
> The rule in the example directs received packets with the specified
> destination MAC address to rq 30.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4:
>     - Fixed double free bug in error flows
>     - Build bug on for classifier struct ordering.
>     - (u8 *) to (void *) casting.
>     - Documentation in UAPI
>     - Answered questions about overflow with no changes.
> v6:
>     - Fix sparse warning "array of flexible structures" Jakub K/Simon H
> v7:
>     - Move for (int i -> for (i hunk from next patch. Paolo Abeni
> ---
>  drivers/net/virtio_net.c           | 462 +++++++++++++++++++++++++++++
>  include/uapi/linux/virtio_net_ff.h |  50 ++++
>  2 files changed, 512 insertions(+)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 900d597726f7..de1a23c71449 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -284,6 +284,11 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
>  	VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
>  };
>  
> +struct virtnet_ethtool_ff {
> +	struct xarray rules;
> +	int    num_rules;
> +};
> +
>  #define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
>  #define VIRTNET_FF_MAX_GROUPS 1
>  
> @@ -293,8 +298,16 @@ struct virtnet_ff {
>  	struct virtio_net_ff_cap_data *ff_caps;
>  	struct virtio_net_ff_cap_mask_data *ff_mask;
>  	struct virtio_net_ff_actions *ff_actions;
> +	struct xarray classifiers;
> +	int num_classifiers;
> +	struct virtnet_ethtool_ff ethtool;
>  };
>  
> +static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
> +				       struct ethtool_rx_flow_spec *fs,
> +				       u16 curr_queue_pairs);
> +static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location);
> +
>  #define VIRTNET_Q_TYPE_RX 0
>  #define VIRTNET_Q_TYPE_TX 1
>  #define VIRTNET_Q_TYPE_CQ 2
> @@ -5653,6 +5666,21 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
>  	return vi->curr_queue_pairs;
>  }
>  
> +static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
> +{
> +	struct virtnet_info *vi = netdev_priv(dev);
> +
> +	switch (info->cmd) {
> +	case ETHTOOL_SRXCLSRLINS:
> +		return virtnet_ethtool_flow_insert(&vi->ff, &info->fs,
> +						   vi->curr_queue_pairs);
> +	case ETHTOOL_SRXCLSRLDEL:
> +		return virtnet_ethtool_flow_remove(&vi->ff, info->fs.location);
> +	}
> +
> +	return -EOPNOTSUPP;
> +}
> +
>  static const struct ethtool_ops virtnet_ethtool_ops = {
>  	.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
>  		ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
> @@ -5679,6 +5707,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
>  	.get_rxfh_fields = virtnet_get_hashflow,
>  	.set_rxfh_fields = virtnet_set_hashflow,
>  	.get_rx_ring_count = virtnet_get_rx_ring_count,
> +	.set_rxnfc = virtnet_set_rxnfc,
>  };
>  
>  static void virtnet_get_queue_stats_rx(struct net_device *dev, int i,
> @@ -6790,6 +6819,428 @@ static const struct xdp_metadata_ops virtnet_xdp_metadata_ops = {
>  	.xmo_rx_hash			= virtnet_xdp_rx_hash,
>  };
>  
> +struct virtnet_ethtool_rule {
> +	struct ethtool_rx_flow_spec flow_spec;
> +	u32 classifier_id;
> +};
> +
> +/* The classifier struct must be the last field in this struct */
> +struct virtnet_classifier {
> +	size_t size;
> +	u32 id;
> +	struct virtio_net_resource_obj_ff_classifier classifier;
> +};
> +
> +static_assert(sizeof(struct virtnet_classifier) ==
> +	      ALIGN(offsetofend(struct virtnet_classifier, classifier),
> +		    __alignof__(struct virtnet_classifier)),
> +	      "virtnet_classifier: classifier must be the last member");
> +
> +static bool check_mask_vs_cap(const void *m, const void *c,
> +			      u16 len, bool partial)
> +{
> +	const u8 *mask = m;
> +	const u8 *cap = c;
> +	int i;
> +
> +	for (i = 0; i < len; i++) {
> +		if (partial && ((mask[i] & cap[i]) != mask[i]))
> +			return false;
> +		if (!partial && mask[i] != cap[i])
> +			return false;
> +	}
> +
> +	return true;
> +}
> +
> +static
> +struct virtio_net_ff_selector *get_selector_cap(const struct virtnet_ff *ff,
> +						u8 selector_type)
> +{
> +	struct virtio_net_ff_selector *sel;
> +	void *buf;
> +	int i;
> +
> +	buf = &ff->ff_mask->selectors;
> +	sel = buf;
> +
> +	for (i = 0; i < ff->ff_mask->count; i++) {
> +		if (sel->type == selector_type)
> +			return sel;
> +
> +		buf += sizeof(struct virtio_net_ff_selector) + sel->length;
> +		sel = buf;
> +	}
> +
> +	return NULL;
> +}
> +
> +static bool validate_eth_mask(const struct virtnet_ff *ff,
> +			      const struct virtio_net_ff_selector *sel,
> +			      const struct virtio_net_ff_selector *sel_cap)
> +{
> +	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> +	struct ethhdr *cap, *mask;
> +	struct ethhdr zeros = {};
> +
> +	cap = (struct ethhdr *)&sel_cap->mask;
> +	mask = (struct ethhdr *)&sel->mask;
> +
> +	if (memcmp(&zeros.h_dest, mask->h_dest, sizeof(zeros.h_dest)) &&
> +	    !check_mask_vs_cap(mask->h_dest, cap->h_dest,
> +			       sizeof(mask->h_dest), partial_mask))
> +		return false;
> +
> +	if (memcmp(&zeros.h_source, mask->h_source, sizeof(zeros.h_source)) &&
> +	    !check_mask_vs_cap(mask->h_source, cap->h_source,
> +			       sizeof(mask->h_source), partial_mask))
> +		return false;
> +
> +	if (mask->h_proto &&
> +	    !check_mask_vs_cap(&mask->h_proto, &cap->h_proto,
> +			       sizeof(__be16), partial_mask))
> +		return false;
> +
> +	return true;
> +}
> +
> +static bool validate_mask(const struct virtnet_ff *ff,
> +			  const struct virtio_net_ff_selector *sel)
> +{
> +	struct virtio_net_ff_selector *sel_cap = get_selector_cap(ff, sel->type);
> +
> +	if (!sel_cap)
> +		return false;
> +
> +	switch (sel->type) {
> +	case VIRTIO_NET_FF_MASK_TYPE_ETH:
> +		return validate_eth_mask(ff, sel, sel_cap);
> +	}
> +
> +	return false;
> +}
> +
> +static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> +{
> +	int err;
> +
> +	err = xa_alloc(&ff->classifiers, &c->id, c,
> +		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
> +		       GFP_KERNEL);
> +	if (err)
> +		return err;
> +
> +	err = virtio_admin_obj_create(ff->vdev,
> +				      VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> +				      c->id,
> +				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
> +				      0,
> +				      &c->classifier,
> +				      c->size);
> +	if (err)
> +		goto err_xarray;
> +
> +	return 0;
> +
> +err_xarray:
> +	xa_erase(&ff->classifiers, c->id);
> +
> +	return err;
> +}
> +
> +static void destroy_classifier(struct virtnet_ff *ff,
> +			       u32 classifier_id)
> +{
> +	struct virtnet_classifier *c;
> +
> +	c = xa_load(&ff->classifiers, classifier_id);
> +	if (c) {
> +		virtio_admin_obj_destroy(ff->vdev,
> +					 VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> +					 c->id,
> +					 VIRTIO_ADMIN_GROUP_TYPE_SELF,
> +					 0);
> +
> +		xa_erase(&ff->classifiers, c->id);
> +		kfree(c);
> +	}
> +}
> +
> +static void destroy_ethtool_rule(struct virtnet_ff *ff,
> +				 struct virtnet_ethtool_rule *eth_rule)
> +{
> +	ff->ethtool.num_rules--;
> +
> +	virtio_admin_obj_destroy(ff->vdev,
> +				 VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
> +				 eth_rule->flow_spec.location,
> +				 VIRTIO_ADMIN_GROUP_TYPE_SELF,
> +				 0);
> +
> +	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> +	destroy_classifier(ff, eth_rule->classifier_id);
> +	kfree(eth_rule);
> +}
> +
> +static int insert_rule(struct virtnet_ff *ff,
> +		       struct virtnet_ethtool_rule *eth_rule,
> +		       u32 classifier_id,
> +		       const u8 *key,
> +		       size_t key_size)
> +{
> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
> +	struct virtio_net_resource_obj_ff_rule *ff_rule;
> +	int err;
> +
> +	ff_rule = kzalloc(sizeof(*ff_rule) + key_size, GFP_KERNEL);
> +	if (!ff_rule)
> +		return -ENOMEM;
> +
> +	/* Intentionally leave the priority as 0. All rules have the same
> +	 * priority.
> +	 */
> +	ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
> +	ff_rule->classifier_id = cpu_to_le32(classifier_id);
> +	ff_rule->key_length = (u8)key_size;

I don't think you need this cast.

BTW, why do you insist on doing all this math in size_t variables?

A plain u8 should do, and calculate_flow_sizes can use a BUG_ON to check
that it does not overflow.
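
A userspace sketch of that suggestion, using only standard C: the sum is accumulated in a wider type, checked once, and handed back as an 8-bit value. `sum_key_len()` is a hypothetical helper, and `assert()` stands in for the kernel's BUG_ON().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint8_t sum_key_len(const size_t *hdr_lens, int num_hdrs)
{
	size_t total = 0;
	int i;

	for (i = 0; i < num_hdrs; i++)
		total += hdr_lens[i];

	/* In the driver this would be BUG_ON(total > U8_MAX). */
	assert(total <= UINT8_MAX);

	return total;	/* implicit narrowing is safe after the check */
}
```

With the range checked in one place, the assignment to the __u8 key_length field needs no cast at the call site.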

> +	ff_rule->action = fs->ring_cookie == RX_CLS_FLOW_DISC ?
> +					     VIRTIO_NET_FF_ACTION_DROP :
> +					     VIRTIO_NET_FF_ACTION_RX_VQ;
> +	ff_rule->vq_index = fs->ring_cookie != RX_CLS_FLOW_DISC ?
> +					       cpu_to_le16(fs->ring_cookie) : 0;
> +	memcpy(&ff_rule->keys, key, key_size);
> +
> +	err = virtio_admin_obj_create(ff->vdev,
> +				      VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
> +				      fs->location,
> +				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
> +				      0,
> +				      ff_rule,
> +				      sizeof(*ff_rule) + key_size);
> +	if (err)
> +		goto err_ff_rule;
> +
> +	eth_rule->classifier_id = classifier_id;
> +	ff->ethtool.num_rules++;
> +	kfree(ff_rule);
> +
> +	return 0;
> +
> +err_ff_rule:
> +	kfree(ff_rule);
> +
> +	return err;
> +}
> +
> +static u32 flow_type_mask(u32 flow_type)
> +{
> +	return flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
> +}
> +
> +static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
> +{
> +	switch (fs->flow_type) {
> +	case ETHER_FLOW:
> +		return true;
> +	}
> +
> +	return false;
> +}
> +
> +static int validate_flow_input(struct virtnet_ff *ff,
> +			       const struct ethtool_rx_flow_spec *fs,
> +			       u16 curr_queue_pairs)
> +{
> +	/* Force users to use RX_CLS_LOC_ANY - don't allow specific locations */
> +	if (fs->location != RX_CLS_LOC_ANY)
> +		return -EOPNOTSUPP;
> +
> +	if (fs->ring_cookie != RX_CLS_FLOW_DISC &&
> +	    fs->ring_cookie >= curr_queue_pairs)
> +		return -EINVAL;
> +
> +	if (fs->flow_type != flow_type_mask(fs->flow_type))
> +		return -EOPNOTSUPP;
> +
> +	if (!supported_flow_type(fs))
> +		return -EOPNOTSUPP;
> +
> +	return 0;
> +}
> +
> +static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
> +				 size_t *key_size, size_t *classifier_size,
> +				 int *num_hdrs)
> +{
> +	*num_hdrs = 1;
> +	*key_size = sizeof(struct ethhdr);
> +	/*
> +	 * The classifier size is the size of the classifier header, a selector
> +	 * header for each type of header in the match criteria, and each header
> +	 * providing the mask for matching against.
> +	 */
> +	*classifier_size = *key_size +
> +			   sizeof(struct virtio_net_resource_obj_ff_classifier) +
> +			   sizeof(struct virtio_net_ff_selector) * (*num_hdrs);
> +}
> +
> +static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
> +				   u8 *key,
> +				   const struct ethtool_rx_flow_spec *fs)
> +{
> +	struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
> +	struct ethhdr *eth_k = (struct ethhdr *)key;
> +
> +	selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
> +	selector->length = sizeof(struct ethhdr);
> +
> +	memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
> +	memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
> +}
> +
> +static int
> +validate_classifier_selectors(struct virtnet_ff *ff,
> +			      struct virtio_net_resource_obj_ff_classifier *classifier,
> +			      int num_hdrs)
> +{
> +	struct virtio_net_ff_selector *selector = (void *)classifier->selectors;
> +	int i;
> +
> +	for (i = 0; i < num_hdrs; i++) {
> +		if (!validate_mask(ff, selector))
> +			return -EINVAL;
> +
> +		selector = (((void *)selector) + sizeof(*selector) +
> +					selector->length);
> +	}
> +
> +	return 0;
> +}
> +
> +static int build_and_insert(struct virtnet_ff *ff,
> +			    struct virtnet_ethtool_rule *eth_rule)
> +{
> +	struct virtio_net_resource_obj_ff_classifier *classifier;
> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
> +	struct virtio_net_ff_selector *selector;
> +	struct virtnet_classifier *c;
> +	size_t classifier_size;
> +	size_t key_size;
> +	int num_hdrs;
> +	u8 *key;
> +	int err;
> +
> +	calculate_flow_sizes(fs, &key_size, &classifier_size, &num_hdrs);
> +
> +	key = kzalloc(key_size, GFP_KERNEL);
> +	if (!key)
> +		return -ENOMEM;
> +
> +	/*
> +	 * virtio_net_ff_obj_ff_classifier is already included in the
> +	 * classifier_size.
> +	 */
> +	c = kzalloc(classifier_size +
> +		    sizeof(struct virtnet_classifier) -
> +		    sizeof(struct virtio_net_resource_obj_ff_classifier),
> +		    GFP_KERNEL);
> +	if (!c) {
> +		kfree(key);
> +		return -ENOMEM;
> +	}
> +
> +	c->size = classifier_size;
> +	classifier = &c->classifier;
> +	classifier->count = num_hdrs;
> +	selector = (void *)&classifier->selectors[0];
> +
> +	setup_eth_hdr_key_mask(selector, key, fs);
> +
> +	err = validate_classifier_selectors(ff, classifier, num_hdrs);
> +	if (err)
> +		goto err_key;
> +
> +	err = setup_classifier(ff, c);
> +	if (err)
> +		goto err_classifier;
> +
> +	err = insert_rule(ff, eth_rule, c->id, key, key_size);
> +	if (err) {
> +		/* destroy_classifier will free the classifier */
> +		destroy_classifier(ff, c->id);
> +		goto err_key;
> +	}
> +
> +	return 0;
> +
> +err_classifier:
> +	kfree(c);
> +err_key:
> +	kfree(key);
> +
> +	return err;
> +}
> +
> +static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
> +				       struct ethtool_rx_flow_spec *fs,
> +				       u16 curr_queue_pairs)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +	int err;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	err = validate_flow_input(ff, fs, curr_queue_pairs);
> +	if (err)
> +		return err;
> +
> +	eth_rule = kzalloc(sizeof(*eth_rule), GFP_KERNEL);
> +	if (!eth_rule)
> +		return -ENOMEM;
> +
> +	err = xa_alloc(&ff->ethtool.rules, &fs->location, eth_rule,
> +		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->rules_limit) - 1),
> +		       GFP_KERNEL);
> +	if (err)
> +		goto err_rule;
> +
> +	eth_rule->flow_spec = *fs;
> +
> +	err = build_and_insert(ff, eth_rule);
> +	if (err)
> +		goto err_xa;
> +
> +	return err;
> +
> +err_xa:
> +	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> +
> +err_rule:
> +	fs->location = RX_CLS_LOC_ANY;
> +	kfree(eth_rule);
> +
> +	return err;
> +}
> +
> +static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
> +{
> +	struct virtnet_ethtool_rule *eth_rule;
> +	int err = 0;
> +
> +	if (!ff->ff_supported)
> +		return -EOPNOTSUPP;
> +
> +	eth_rule = xa_load(&ff->ethtool.rules, location);
> +	if (!eth_rule) {
> +		err = -ENOENT;
> +		goto out;
> +	}
> +
> +	destroy_ethtool_rule(ff, eth_rule);
> +out:
> +	return err;
> +}
> +
>  static size_t get_mask_size(u16 type)
>  {
>  	switch (type) {
> @@ -6955,6 +7406,8 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  	if (err)
>  		goto err_ff_action;
>  
> +	xa_init_flags(&ff->classifiers, XA_FLAGS_ALLOC);
> +	xa_init_flags(&ff->ethtool.rules, XA_FLAGS_ALLOC);
>  	ff->vdev = vdev;
>  	ff->ff_supported = true;
>  
> @@ -6979,9 +7432,18 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  
>  static void virtnet_ff_cleanup(struct virtnet_ff *ff)
>  {
> +	struct virtnet_ethtool_rule *eth_rule;
> +	unsigned long i;
> +
>  	if (!ff->ff_supported)
>  		return;
>  
> +	xa_for_each(&ff->ethtool.rules, i, eth_rule)
> +		destroy_ethtool_rule(ff, eth_rule);
> +
> +	xa_destroy(&ff->ethtool.rules);
> +	xa_destroy(&ff->classifiers);
> +
>  	virtio_admin_obj_destroy(ff->vdev,
>  				 VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
>  				 VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> index 6d1f953c2b46..c98aa4942bee 100644
> --- a/include/uapi/linux/virtio_net_ff.h
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -13,6 +13,8 @@
>  #define VIRTIO_NET_FF_ACTION_CAP 0x802
>  
>  #define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
> +#define VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER 0x0201
> +#define VIRTIO_NET_RESOURCE_OBJ_FF_RULE 0x0202
>  
>  /**
>   * struct virtio_net_ff_cap_data - Flow filter resource capability limits
> @@ -103,4 +105,52 @@ struct virtio_net_resource_obj_ff_group {
>  	__le16 group_priority;
>  };
>  
> +/**
> + * struct virtio_net_resource_obj_ff_classifier - Flow filter classifier object
> + * @count: number of selector entries in @selectors
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @selectors: array of selector descriptors that define match masks
> + *
> + * Payload for the VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER administrative object.
> + * Each selector describes a header mask used to match packets
> + * (see struct virtio_net_ff_selector). Selectors appear in the order they are
> + * to be applied.
> + */
> +struct virtio_net_resource_obj_ff_classifier {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u8 selectors[];
> +};
> +
> +/**
> + * struct virtio_net_resource_obj_ff_rule - Flow filter rule object
> + * @group_id: identifier of the target flow filter group
> + * @classifier_id: identifier of the classifier referenced by this rule
> + * @rule_priority: relative priority of this rule within the group
> + * @key_length: number of bytes in @keys
> + * @action: action to perform, one of VIRTIO_NET_FF_ACTION_*
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @vq_index: RX virtqueue index for VIRTIO_NET_FF_ACTION_RX_VQ, 0 otherwise
> + * @reserved1: must be set to 0 by the driver and ignored by the device
> + * @keys: concatenated key bytes matching the classifier's selectors order
> + *
> + * Payload for the VIRTIO_NET_RESOURCE_OBJ_FF_RULE administrative object.
> + * @group_id and @classifier_id refer to previously created objects of types
> + * VIRTIO_NET_RESOURCE_OBJ_FF_GROUP and VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER
> + * respectively. The key bytes are compared against packet headers using the
> + * masks provided by the classifier's selectors. Multi-byte fields are
> + * little-endian.
> + */
> +struct virtio_net_resource_obj_ff_rule {
> +	__le32 group_id;
> +	__le32 classifier_id;
> +	__u8 rule_priority;
> +	__u8 key_length; /* length of key in bytes */
> +	__u8 action;
> +	__u8 reserved;
> +	__le16 vq_index;
> +	__u8 reserved1[2];
> +	__u8 keys[];
> +};
> +
>  #endif
> -- 
> 2.50.1



* Re: [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-18 14:38 ` [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
@ 2025-11-18 21:31   ` Michael S. Tsirkin
  2025-11-19  7:03     ` Dan Jurgens
  2025-11-19  9:18     ` Michael S. Tsirkin
  0 siblings, 2 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 21:31 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:59AM -0600, Daniel Jurgens wrote:
> Add support for IP_USER type rules from ethtool.
> 
> Example:
> $ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action -1
> Added rule with ID 1
> 
> The example rule will drop packets with the source IP specified.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4:
>     - Fixed bug in protocol check of parse_ip4
>     - (u8 *) to (void *) casting.
>     - Alignment issues.
> ---
>  drivers/net/virtio_net.c | 122 ++++++++++++++++++++++++++++++++++++---
>  1 file changed, 115 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index f392ea30f2c7..c1adba60b6a8 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -6904,6 +6904,34 @@ static bool validate_eth_mask(const struct virtnet_ff *ff,
>  	return true;
>  }
>  
> +static bool validate_ip4_mask(const struct virtnet_ff *ff,
> +			      const struct virtio_net_ff_selector *sel,
> +			      const struct virtio_net_ff_selector *sel_cap)
> +{
> +	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> +	struct iphdr *cap, *mask;
> +
> +	cap = (struct iphdr *)&sel_cap->mask;
> +	mask = (struct iphdr *)&sel->mask;
> +
> +	if (mask->saddr &&
> +	    !check_mask_vs_cap(&mask->saddr, &cap->saddr,
> +			       sizeof(__be32), partial_mask))
> +		return false;
> +
> +	if (mask->daddr &&
> +	    !check_mask_vs_cap(&mask->daddr, &cap->daddr,
> +			       sizeof(__be32), partial_mask))
> +		return false;
> +
> +	if (mask->protocol &&
> +	    !check_mask_vs_cap(&mask->protocol, &cap->protocol,
> +			       sizeof(u8), partial_mask))
> +		return false;
> +
> +	return true;
> +}
> +
>  static bool validate_mask(const struct virtnet_ff *ff,
>  			  const struct virtio_net_ff_selector *sel)
>  {
> @@ -6915,11 +6943,36 @@ static bool validate_mask(const struct virtnet_ff *ff,
>  	switch (sel->type) {
>  	case VIRTIO_NET_FF_MASK_TYPE_ETH:
>  		return validate_eth_mask(ff, sel, sel_cap);
> +
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
> +		return validate_ip4_mask(ff, sel, sel_cap);
>  	}
>  
>  	return false;
>  }
>  
> +static void parse_ip4(struct iphdr *mask, struct iphdr *key,
> +		      const struct ethtool_rx_flow_spec *fs)
> +{
> +	const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
> +	const struct ethtool_usrip4_spec *l3_val  = &fs->h_u.usr_ip4_spec;
> +
> +	mask->saddr = l3_mask->ip4src;
> +	mask->daddr = l3_mask->ip4dst;
> +	key->saddr = l3_val->ip4src;
> +	key->daddr = l3_val->ip4dst;
> +
> +	if (l3_mask->proto) {

You seem to check the proto mask here, but the ethtool_usrip4_spec doc
seems to say the mask for proto must be 0.

What gives?


> +		mask->protocol = l3_mask->proto;
> +		key->protocol = l3_val->proto;
> +	}
> +}
> +
> +static bool has_ipv4(u32 flow_type)
> +{
> +	return flow_type == IP_USER_FLOW;
> +}
> +
>  static int setup_classifier(struct virtnet_ff *ff,
>  			    struct virtnet_classifier **c)
>  {
> @@ -7054,6 +7107,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
>  {
>  	switch (fs->flow_type) {
>  	case ETHER_FLOW:
> +	case IP_USER_FLOW:
>  		return true;
>  	}
>  
> @@ -7082,11 +7136,23 @@ static int validate_flow_input(struct virtnet_ff *ff,
>  }
>  
>  static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
> -				 size_t *key_size, size_t *classifier_size,
> -				 int *num_hdrs)
> +				size_t *key_size, size_t *classifier_size,
> +				int *num_hdrs)
>  {
> +	size_t size = sizeof(struct ethhdr);
> +
>  	*num_hdrs = 1;
>  	*key_size = sizeof(struct ethhdr);

So *key_size is assigned here ...

> +
> +	if (fs->flow_type == ETHER_FLOW)
> +		goto done;
> +
> +	++(*num_hdrs);
> +	if (has_ipv4(fs->flow_type))
> +		size += sizeof(struct iphdr);
> +

... never used

> +done:
> +	*key_size = size;

and overwritten here.

What is going on here is spaghetti code: goto is misused in place of an
if statement, which obscures the flow.

It should be

	if (fs->flow_type != ETHER_FLOW) {
		... rest of code ...
	}

and then it will be clear that doing *key_size = size once is enough.
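
A minimal userspace sketch of the restructuring suggested above. `enum flow_sketch` and the *_LEN constants are hypothetical stand-ins for ETHER_FLOW, IP_USER_FLOW and the sizeof() expressions in the driver. SEL_HDR_LEN in particular is an assumed placeholder, since the selector struct is not shown in this part of the thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the flow types handled by this patch. */
enum flow_sketch { FLOW_SKETCH_ETHER, FLOW_SKETCH_IP4 };

#define ETH_HDR_LEN	14	/* Ethernet header, no VLAN tag */
#define IP4_HDR_LEN	20	/* IPv4 header, no options */
#define CLS_HDR_LEN	8	/* count + reserved[7] in the classifier */
#define SEL_HDR_LEN	8	/* assumed selector header size */

static void calculate_flow_sizes_sketch(enum flow_sketch type,
					size_t *key_size,
					size_t *classifier_size,
					int *num_hdrs)
{
	size_t size = ETH_HDR_LEN;

	*num_hdrs = 1;

	/* An ordinary if replaces the goto that skipped the L3 part. */
	if (type != FLOW_SKETCH_ETHER) {
		++(*num_hdrs);
		if (type == FLOW_SKETCH_IP4)
			size += IP4_HDR_LEN;
	}

	/* The key size is written exactly once. */
	*key_size = size;
	*classifier_size = *key_size + CLS_HDR_LEN +
			   SEL_HDR_LEN * (size_t)*num_hdrs;
}
```

Written this way there is no early assignment to *key_size to wonder about, and no label to follow.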


>  	/*
>  	 * The classifier size is the size of the classifier header, a selector
>  	 * header for each type of header in the match criteria, and each header
> @@ -7098,8 +7164,9 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
>  }
>  
>  static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
> -				   u8 *key,
> -				   const struct ethtool_rx_flow_spec *fs)
> +				  u8 *key,
> +				  const struct ethtool_rx_flow_spec *fs,
> +				  int num_hdrs)
>  {
>  	struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
>  	struct ethhdr *eth_k = (struct ethhdr *)key;
> @@ -7107,8 +7174,33 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
>  	selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
>  	selector->length = sizeof(struct ethhdr);
>  
> -	memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
> -	memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
> +	if (num_hdrs > 1) {
> +		eth_m->h_proto = cpu_to_be16(0xffff);
> +		eth_k->h_proto = cpu_to_be16(ETH_P_IP);
> +	} else {
> +		memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
> +		memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
> +	}
> +}
> +
> +static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
> +			     u8 *key,
> +			     const struct ethtool_rx_flow_spec *fs)
> +{
> +	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
> +	struct iphdr *v4_k = (struct iphdr *)key;
> +
> +	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> +	selector->length = sizeof(struct iphdr);
> +
> +	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> +	    fs->h_u.usr_ip4_spec.tos ||
> +	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> +		return -EOPNOTSUPP;

So include/uapi/linux/ethtool.h says:

 * struct ethtool_usrip4_spec - general flow specification for IPv4
 * @ip4src: Source host
 * @ip4dst: Destination host
 * @l4_4_bytes: First 4 bytes of transport (layer 4) header
 * @tos: Type-of-service
 * @ip_ver: Value must be %ETH_RX_NFC_IP4; mask must be 0
 * @proto: Transport protocol number; mask must be 0

I guess this ETH_RX_NFC_IP4 check validates that userspace follows this
documentation? But then shouldn't you check the mask for ip_ver as
well? And the mask for proto?
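
A rough sketch of what checking both the value and the masks could look
like, written as a standalone userspace function. The struct below only
mirrors the fields listed in the ethtool.h comment above, and the
sketch names and the boolean return are my assumptions; the driver
would presumably keep returning -EOPNOTSUPP or similar.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_RX_NFC_IP4 1	/* stands in for ETH_RX_NFC_IP4 */

/* Hypothetical mirror of the fields documented for ethtool_usrip4_spec. */
struct usrip4_spec_sketch {
	uint32_t ip4src;
	uint32_t ip4dst;
	uint32_t l4_4_bytes;
	uint8_t tos;
	uint8_t ip_ver;
	uint8_t proto;
};

bool usrip4_spec_ok(const struct usrip4_spec_sketch *val,
		    const struct usrip4_spec_sketch *mask)
{
	assert(val && mask);

	/* Fields the patch does not support matching on. */
	if (val->l4_4_bytes || val->tos)
		return false;

	/* Documented: value must be ETH_RX_NFC_IP4 ... */
	if (val->ip_ver != SKETCH_RX_NFC_IP4)
		return false;

	/* ... and the masks for ip_ver and proto must be 0. */
	if (mask->ip_ver || mask->proto)
		return false;

	return true;
}
```

Whether the proto mask really has to be rejected depends on how the
later TCP/UDP patches use it, so treat that last check as a question
rather than a requirement.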





> +
> +	parse_ip4(v4_m, v4_k, fs);
> +
> +	return 0;
>  }
>  
>  static int
> @@ -7130,6 +7222,13 @@ validate_classifier_selectors(struct virtnet_ff *ff,
>  	return 0;
>  }
>  
> +static
> +struct virtio_net_ff_selector *next_selector(struct virtio_net_ff_selector *sel)
> +{
> +	return (void *)sel + sizeof(struct virtio_net_ff_selector) +
> +		sel->length;
> +}
> +
>  static int build_and_insert(struct virtnet_ff *ff,
>  			    struct virtnet_ethtool_rule *eth_rule)
>  {
> @@ -7167,8 +7266,17 @@ static int build_and_insert(struct virtnet_ff *ff,
>  	classifier->count = num_hdrs;
>  	selector = (void *)&classifier->selectors[0];
>  
> -	setup_eth_hdr_key_mask(selector, key, fs);
> +	setup_eth_hdr_key_mask(selector, key, fs, num_hdrs);
> +	if (num_hdrs == 1)
> +		goto validate;


Please stop abusing gotos in place of if.
This is not error handling and not breaking out of a loop.

Please do not.


> +
> +	selector = next_selector(selector);
> +
> +	err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
> +	if (err)
> +		goto err_classifier;
>  
> +validate:
>  	err = validate_classifier_selectors(ff, classifier, num_hdrs);
>  	if (err)
>  		goto err_key;
> -- 
> 2.50.1



* Re: [PATCH net-next v11 03/12] virtio: Expose generic device capability operations
  2025-11-18 14:38 ` [PATCH net-next v11 03/12] virtio: Expose generic device capability operations Daniel Jurgens
@ 2025-11-18 21:42   ` Michael S. Tsirkin
  2025-11-19  3:27     ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 21:42 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:53AM -0600, Daniel Jurgens wrote:
> Currently querying and setting capabilities is restricted to a single
> capability and contained within the virtio PCI driver. However, each
> device type has generic and device specific capabilities, that may be
> queried and set. In subsequent patches virtio_net will query and set
> flow filter capabilities.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> 
> ---
> v4: Moved this logic from virtio_pci_modern to new file
>     virtio_admin_commands.
> ---
>  drivers/virtio/Makefile                |  2 +-
>  drivers/virtio/virtio_admin_commands.c | 90 ++++++++++++++++++++++++++
>  include/linux/virtio_admin.h           | 80 +++++++++++++++++++++++
>  include/uapi/linux/virtio_pci.h        |  7 +-
>  4 files changed, 176 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/virtio/virtio_admin_commands.c
>  create mode 100644 include/linux/virtio_admin.h
> 
> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index eefcfe90d6b8..2b4a204dde33 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -1,5 +1,5 @@
>  # SPDX-License-Identifier: GPL-2.0
> -obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
> +obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o virtio_admin_commands.o
>  obj-$(CONFIG_VIRTIO_ANCHOR) += virtio_anchor.o
>  obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
>  obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
> diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
> new file mode 100644
> index 000000000000..94751d16b3c4
> --- /dev/null
> +++ b/drivers/virtio/virtio_admin_commands.c
> @@ -0,0 +1,90 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/virtio.h>
> +#include <linux/virtio_config.h>
> +#include <linux/virtio_admin.h>
> +
> +int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
> +				   struct virtio_admin_cmd_query_cap_id_result *data)
> +{
> +	struct virtio_admin_cmd cmd = {};
> +	struct scatterlist result_sg;
> +
> +	if (!vdev->config->admin_cmd_exec)
> +		return -EOPNOTSUPP;
> +
> +	sg_init_one(&result_sg, data, sizeof(*data));
> +	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_CAP_ID_LIST_QUERY);
> +	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
> +	cmd.result_sg = &result_sg;
> +
> +	return vdev->config->admin_cmd_exec(vdev, &cmd);
> +}
> +EXPORT_SYMBOL_GPL(virtio_admin_cap_id_list_query);
> +
> +int virtio_admin_cap_get(struct virtio_device *vdev,
> +			 u16 id,
> +			 void *caps,
> +			 size_t cap_size)
> +{
> +	struct virtio_admin_cmd_cap_get_data *data;
> +	struct virtio_admin_cmd cmd = {};
> +	struct scatterlist result_sg;
> +	struct scatterlist data_sg;
> +	int err;
> +
> +	if (!vdev->config->admin_cmd_exec)
> +		return -EOPNOTSUPP;
> +
> +	data = kzalloc(sizeof(*data), GFP_KERNEL);
> +	if (!data)
> +		return -ENOMEM;
> +
> +	data->id = cpu_to_le16(id);
> +	sg_init_one(&data_sg, data, sizeof(*data));
> +	sg_init_one(&result_sg, caps, cap_size);
> +	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DEVICE_CAP_GET);
> +	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
> +	cmd.data_sg = &data_sg;
> +	cmd.result_sg = &result_sg;
> +
> +	err = vdev->config->admin_cmd_exec(vdev, &cmd);
> +	kfree(data);
> +
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(virtio_admin_cap_get);
> +
> +int virtio_admin_cap_set(struct virtio_device *vdev,
> +			 u16 id,
> +			 const void *caps,
> +			 size_t cap_size)
> +{
> +	struct virtio_admin_cmd_cap_set_data *data;
> +	struct virtio_admin_cmd cmd = {};
> +	struct scatterlist data_sg;
> +	size_t data_size;
> +	int err;
> +
> +	if (!vdev->config->admin_cmd_exec)
> +		return -EOPNOTSUPP;
> +
> +	data_size = sizeof(*data) + cap_size;
> +	data = kzalloc(data_size, GFP_KERNEL);
> +	if (!data)
> +		return -ENOMEM;
> +
> +	data->id = cpu_to_le16(id);
> +	memcpy(data->cap_specific_data, caps, cap_size);
> +	sg_init_one(&data_sg, data, data_size);
> +	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DRIVER_CAP_SET);
> +	cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
> +	cmd.data_sg = &data_sg;
> +	cmd.result_sg = NULL;
> +
> +	err = vdev->config->admin_cmd_exec(vdev, &cmd);
> +	kfree(data);
> +
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(virtio_admin_cap_set);
> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> new file mode 100644
> index 000000000000..36df97b6487a
> --- /dev/null
> +++ b/include/linux/virtio_admin.h
> @@ -0,0 +1,80 @@
> +/* SPDX-License-Identifier: GPL-2.0-only
> + *
> + * Header file for virtio admin operations
> + */
> +#include <uapi/linux/virtio_pci.h>
> +
> +#ifndef _LINUX_VIRTIO_ADMIN_H
> +#define _LINUX_VIRTIO_ADMIN_H


Guards normally come before the #include. Otherwise
uapi/linux/virtio_pci.h is pulled in again every time this header
is included, which is just extra work for the compiler.



> +
> +struct virtio_device;
> +
> +/**
> + * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
> + * @cap_list: Pointer to capability list structure containing supported_caps array
> + * @cap: Capability ID to check
> + *
> + * The cap_list contains a supported_caps array of little-endian 64-bit integers
> + * where each bit represents a capability. Bit 0 of the first element represents
> + * capability ID 0, bit 1 represents capability ID 1, and so on.
> + *
> + * Return: 1 if capability is supported, 0 otherwise
> + */
> +#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
> +	(!!(1 & (le64_to_cpu(cap_list->supported_caps[cap / 64]) >> cap % 64)))

While this works if cap is a variable, it will behave
unexpectedly if cap or even cap_list is an expression.

Standard practice is to put all macro arguments in parentheses:
(!!(1 & (le64_to_cpu((cap_list)->supported_caps[(cap) / 64]) >> (cap) % 64)))
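
For illustration, a small userspace sketch of the two macro variants.
The struct, the SKETCH_WORDS size and the macro names are made up for
the example, and the words are treated as host-endian so that
le64_to_cpu can be left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_WORDS 4

struct cap_list_sketch {
	uint64_t supported_caps[SKETCH_WORDS];	/* host-endian for simplicity */
};

/* Arguments used bare, as in the patch. */
#define CAP_IN_LIST_BARE(cap_list, cap) \
	(!!(1 & (cap_list->supported_caps[cap / 64] >> cap % 64)))

/* Every argument parenthesized. */
#define CAP_IN_LIST_PAREN(cap_list, cap) \
	(!!(1 & ((cap_list)->supported_caps[(cap) / 64] >> ((cap) % 64))))

/*
 * With cap written as "1 + 64", the bare variant expands to
 *   supported_caps[1 + 64 / 64] >> 1 + 64 % 64
 * which tests word 2, bit 1 instead of word 1, bit 1 (capability 65).
 */
int cap_supported_sketch(const struct cap_list_sketch *list, unsigned int cap)
{
	assert(list && cap < 64u * SKETCH_WORDS);
	return CAP_IN_LIST_PAREN(list, cap);
}
```

The bare variant also fails to compile for an argument such as &list,
since -> binds tighter than the unary &.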





> +
> +/**
> + * virtio_admin_cap_id_list_query - Query the list of available capability IDs
> + * @vdev: The virtio device to query
> + * @data: Pointer to result structure (must be heap allocated)
> + *
> + * This function queries the virtio device for the list of available capability
> + * IDs that can be used with virtio_admin_cap_get() and virtio_admin_cap_set().
> + * The result is stored in the provided data structure.
> + *
> + * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
> + * operations or capability queries, or a negative error code on other failures.
> + */
> +int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
> +				   struct virtio_admin_cmd_query_cap_id_result *data);
> +
> +/**
> + * virtio_admin_cap_get - Get capability data for a specific capability ID
> + * @vdev: The virtio device
> + * @id: Capability ID to retrieve
> + * @caps: Pointer to capability data structure (must be heap allocated)
> + * @cap_size: Size of the capability data structure
> + *
> + * This function retrieves a specific capability from the virtio device.
> + * The capability data is stored in the provided buffer. The caller must
> + * ensure the buffer is large enough to hold the capability data.
> + *
> + * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
> + * operations or capability retrieval, or a negative error code on other failures.
> + */
> +int virtio_admin_cap_get(struct virtio_device *vdev,
> +			 u16 id,
> +			 void *caps,
> +			 size_t cap_size);
> +
> +/**
> + * virtio_admin_cap_set - Set capability data for a specific capability ID
> + * @vdev: The virtio device
> + * @id: Capability ID to set
> + * @caps: Pointer to capability data structure (must be heap allocated)
> + * @cap_size: Size of the capability data structure
> + *
> + * This function sets a specific capability on the virtio device.
> + * The capability data is read from the provided buffer and applied
> + * to the device. The device may validate the capability data before
> + * applying it.
> + *
> + * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
> + * operations or capability setting, or a negative error code on other failures.
> + */
> +int virtio_admin_cap_set(struct virtio_device *vdev,
> +			 u16 id,
> +			 const void *caps,
> +			 size_t cap_size);
> +
> +#endif /* _LINUX_VIRTIO_ADMIN_H */
> diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
> index c691ac210ce2..0d5ca0cff629 100644
> --- a/include/uapi/linux/virtio_pci.h
> +++ b/include/uapi/linux/virtio_pci.h
> @@ -315,15 +315,18 @@ struct virtio_admin_cmd_notify_info_result {
>  
>  #define VIRTIO_DEV_PARTS_CAP 0x0000
>  
> +/* Update this value to largest implemented cap number. */

implemented by what?

> +#define VIRTIO_ADMIN_MAX_CAP 0x0fff
> +
>  struct virtio_dev_parts_cap {
>  	__u8 get_parts_resource_objects_limit;
>  	__u8 set_parts_resource_objects_limit;
>  };
>  
> -#define MAX_CAP_ID __KERNEL_DIV_ROUND_UP(VIRTIO_DEV_PARTS_CAP + 1, 64)
> +#define VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE __KERNEL_DIV_ROUND_UP(VIRTIO_ADMIN_MAX_CAP, 64)

Don't you mean VIRTIO_ADMIN_MAX_CAP + 1 here?
E.g. if VIRTIO_ADMIN_MAX_CAP was 0 we would need space for 1 capability,
right?
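
The arithmetic can be checked with a short userspace sketch. The macro
below has the usual round-up-division shape; the SKETCH_ name and the
two helper functions are mine, not kernel API.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Number of 64-bit words as computed in the posted patch. */
unsigned int words_as_posted(unsigned int max_cap)
{
	return SKETCH_DIV_ROUND_UP(max_cap, 64);
}

/* Number of 64-bit words needed to hold bits 0..max_cap inclusive. */
unsigned int words_with_plus_one(unsigned int max_cap)
{
	assert(max_cap < UINT_MAX);	/* max_cap + 1 must not wrap */
	return SKETCH_DIV_ROUND_UP(max_cap + 1, 64);
}
```

For max_cap = 0 the posted form yields 0 words and the +1 form yields 1.
For max_cap = 64 they yield 1 and 2. For the current value 0x0fff both
happen to yield 64, so the array size does not change today, but the
formula is off by one word whenever the maximum is a multiple of 64.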

>  
>  struct virtio_admin_cmd_query_cap_id_result {
> -	__le64 supported_caps[MAX_CAP_ID];
> +	__le64 supported_caps[VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE];
>  };
>  

I feel it's worth explaining in the commit log that you are changing a
uapi structure, and why that is safe.


>  struct virtio_admin_cmd_cap_get_data {
> -- 
> 2.50.1



* Re: [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering
  2025-11-18 14:39 ` [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
@ 2025-11-18 21:45   ` Michael S. Tsirkin
  2025-11-19  7:35     ` Dan Jurgens
  2025-11-18 21:48   ` Michael S. Tsirkin
  1 sibling, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 21:45 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:39:00AM -0600, Daniel Jurgens wrote:
> Implement support for IPV6_USER_FLOW type rules.
> 
> Example:
> $ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
> Added rule with ID 0
> 
> The example rule will forward packets with the specified source and
> destination IP addresses to RX ring 3.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>


I find it odd that this does not modify setup_eth_hdr_key_mask.

So it still hardcodes ETH_P_IP for all IP flows?
For IPv6, should it not use ETH_P_IPV6 instead?

How does it work?
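
What I would expect is something along these lines, shown here as a
standalone sketch. The enum and the SKETCH_ constants are stand-ins
(0x0800 and 0x86DD are the IPv4 and IPv6 ethertypes); in the driver the
result would still need cpu_to_be16 before being stored in h_proto.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ETH_P_IP   0x0800	/* stands in for ETH_P_IP */
#define SKETCH_ETH_P_IPV6 0x86DD	/* stands in for ETH_P_IPV6 */

/* Hypothetical stand-in for the two user IP flow types. */
enum sketch_l3_flow { SKETCH_L3_IPV4, SKETCH_L3_IPV6 };

/* Pick the ethertype for the Ethernet key from the flow type. */
uint16_t l3_ethertype_sketch(enum sketch_l3_flow flow_type)
{
	assert(flow_type == SKETCH_L3_IPV4 || flow_type == SKETCH_L3_IPV6);

	return flow_type == SKETCH_L3_IPV6 ? SKETCH_ETH_P_IPV6
					   : SKETCH_ETH_P_IP;
}
```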

> ---
> v4: commit message typo
> ---
>  drivers/net/virtio_net.c | 89 ++++++++++++++++++++++++++++++++++++----
>  1 file changed, 81 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index c1adba60b6a8..78fc8f01b6c4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -6932,6 +6932,34 @@ static bool validate_ip4_mask(const struct virtnet_ff *ff,
>  	return true;
>  }
>  
> +static bool validate_ip6_mask(const struct virtnet_ff *ff,
> +			      const struct virtio_net_ff_selector *sel,
> +			      const struct virtio_net_ff_selector *sel_cap)
> +{
> +	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> +	struct ipv6hdr *cap, *mask;
> +
> +	cap = (struct ipv6hdr *)&sel_cap->mask;
> +	mask = (struct ipv6hdr *)&sel->mask;
> +
> +	if (!ipv6_addr_any(&mask->saddr) &&
> +	    !check_mask_vs_cap(&mask->saddr, &cap->saddr,
> +			       sizeof(cap->saddr), partial_mask))
> +		return false;
> +
> +	if (!ipv6_addr_any(&mask->daddr) &&
> +	    !check_mask_vs_cap(&mask->daddr, &cap->daddr,
> +			       sizeof(cap->daddr), partial_mask))
> +		return false;
> +
> +	if (mask->nexthdr &&
> +	    !check_mask_vs_cap(&mask->nexthdr, &cap->nexthdr,
> +	    sizeof(cap->nexthdr), partial_mask))
> +		return false;
> +
> +	return true;
> +}
> +
>  static bool validate_mask(const struct virtnet_ff *ff,
>  			  const struct virtio_net_ff_selector *sel)
>  {
> @@ -6946,6 +6974,9 @@ static bool validate_mask(const struct virtnet_ff *ff,
>  
>  	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
>  		return validate_ip4_mask(ff, sel, sel_cap);
> +
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
> +		return validate_ip6_mask(ff, sel, sel_cap);
>  	}
>  
>  	return false;
> @@ -6968,11 +6999,38 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>  	}
>  }
>  
> +static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
> +		      const struct ethtool_rx_flow_spec *fs)
> +{
> +	const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
> +	const struct ethtool_usrip6_spec *l3_val  = &fs->h_u.usr_ip6_spec;
> +
> +	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
> +		memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
> +		memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
> +	}
> +
> +	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
> +		memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
> +		memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
> +	}
> +
> +	if (l3_mask->l4_proto) {
> +		mask->nexthdr = l3_mask->l4_proto;
> +		key->nexthdr = l3_val->l4_proto;
> +	}
> +}
> +
>  static bool has_ipv4(u32 flow_type)
>  {
>  	return flow_type == IP_USER_FLOW;
>  }
>  
> +static bool has_ipv6(u32 flow_type)
> +{
> +	return flow_type == IPV6_USER_FLOW;
> +}
> +
>  static int setup_classifier(struct virtnet_ff *ff,
>  			    struct virtnet_classifier **c)
>  {
> @@ -7108,6 +7166,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
>  	switch (fs->flow_type) {
>  	case ETHER_FLOW:
>  	case IP_USER_FLOW:
> +	case IPV6_USER_FLOW:
>  		return true;
>  	}
>  
> @@ -7150,7 +7209,8 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
>  	++(*num_hdrs);
>  	if (has_ipv4(fs->flow_type))
>  		size += sizeof(struct iphdr);
> -
> +	else if (has_ipv6(fs->flow_type))
> +		size += sizeof(struct ipv6hdr);
>  done:
>  	*key_size = size;
>  	/*
> @@ -7187,18 +7247,31 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
>  			     u8 *key,
>  			     const struct ethtool_rx_flow_spec *fs)
>  {
> +	struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
>  	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
> +	struct ipv6hdr *v6_k = (struct ipv6hdr *)key;
>  	struct iphdr *v4_k = (struct iphdr *)key;
>  
> -	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> -	selector->length = sizeof(struct iphdr);
> +	if (has_ipv6(fs->flow_type)) {
> +		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
> +		selector->length = sizeof(struct ipv6hdr);
>  
> -	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> -	    fs->h_u.usr_ip4_spec.tos ||
> -	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> -		return -EOPNOTSUPP;
> +		if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
> +		    fs->h_u.usr_ip6_spec.tclass)
> +			return -EOPNOTSUPP;
>  
> -	parse_ip4(v4_m, v4_k, fs);
> +		parse_ip6(v6_m, v6_k, fs);
> +	} else {
> +		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> +		selector->length = sizeof(struct iphdr);
> +
> +		if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> +		    fs->h_u.usr_ip4_spec.tos ||
> +		    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> +			return -EOPNOTSUPP;
> +
> +		parse_ip4(v4_m, v4_k, fs);
> +	}
>  
>  	return 0;
>  }
> -- 
> 2.50.1



* Re: [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering
  2025-11-18 14:39 ` [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
  2025-11-18 21:45   ` Michael S. Tsirkin
@ 2025-11-18 21:48   ` Michael S. Tsirkin
  2025-11-19 16:04     ` Dan Jurgens
  1 sibling, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 21:48 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:39:00AM -0600, Daniel Jurgens wrote:
> Implement support for IPV6_USER_FLOW type rules.
> 
> Example:
> $ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
> Added rule with ID 0
> 
> The example rule will forward packets with the specified source and
> destination IP addresses to RX ring 3.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4: commit message typo
> ---
>  drivers/net/virtio_net.c | 89 ++++++++++++++++++++++++++++++++++++----
>  1 file changed, 81 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index c1adba60b6a8..78fc8f01b6c4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -6932,6 +6932,34 @@ static bool validate_ip4_mask(const struct virtnet_ff *ff,
>  	return true;
>  }
>  
> +static bool validate_ip6_mask(const struct virtnet_ff *ff,
> +			      const struct virtio_net_ff_selector *sel,
> +			      const struct virtio_net_ff_selector *sel_cap)
> +{
> +	bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> +	struct ipv6hdr *cap, *mask;
> +
> +	cap = (struct ipv6hdr *)&sel_cap->mask;
> +	mask = (struct ipv6hdr *)&sel->mask;
> +
> +	if (!ipv6_addr_any(&mask->saddr) &&
> +	    !check_mask_vs_cap(&mask->saddr, &cap->saddr,
> +			       sizeof(cap->saddr), partial_mask))
> +		return false;
> +
> +	if (!ipv6_addr_any(&mask->daddr) &&
> +	    !check_mask_vs_cap(&mask->daddr, &cap->daddr,
> +			       sizeof(cap->daddr), partial_mask))
> +		return false;
> +
> +	if (mask->nexthdr &&
> +	    !check_mask_vs_cap(&mask->nexthdr, &cap->nexthdr,
> +	    sizeof(cap->nexthdr), partial_mask))

indent sizeof here please - align on ( not on !.

> +		return false;
> +
> +	return true;
> +}
> +
>  static bool validate_mask(const struct virtnet_ff *ff,
>  			  const struct virtio_net_ff_selector *sel)
>  {
> @@ -6946,6 +6974,9 @@ static bool validate_mask(const struct virtnet_ff *ff,
>  
>  	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
>  		return validate_ip4_mask(ff, sel, sel_cap);
> +
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
> +		return validate_ip6_mask(ff, sel, sel_cap);
>  	}
>  
>  	return false;
> @@ -6968,11 +6999,38 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>  	}
>  }
>  
> +static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
> +		      const struct ethtool_rx_flow_spec *fs)
> +{
> +	const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
> +	const struct ethtool_usrip6_spec *l3_val  = &fs->h_u.usr_ip6_spec;
> +
> +	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
> +		memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
> +		memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
> +	}

So this checks the mask and then copies, but parse_ip4 copies
unconditionally? Why?

> +
> +	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
> +		memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
> +		memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
> +	}
> +
> +	if (l3_mask->l4_proto) {
> +		mask->nexthdr = l3_mask->l4_proto;
> +		key->nexthdr = l3_val->l4_proto;
> +	}
> +}
> +
>  static bool has_ipv4(u32 flow_type)
>  {
>  	return flow_type == IP_USER_FLOW;
>  }
>  
> +static bool has_ipv6(u32 flow_type)
> +{
> +	return flow_type == IPV6_USER_FLOW;
> +}
> +
>  static int setup_classifier(struct virtnet_ff *ff,
>  			    struct virtnet_classifier **c)
>  {
> @@ -7108,6 +7166,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
>  	switch (fs->flow_type) {
>  	case ETHER_FLOW:
>  	case IP_USER_FLOW:
> +	case IPV6_USER_FLOW:
>  		return true;
>  	}
>  
> @@ -7150,7 +7209,8 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
>  	++(*num_hdrs);
>  	if (has_ipv4(fs->flow_type))
>  		size += sizeof(struct iphdr);
> -
> +	else if (has_ipv6(fs->flow_type))
> +		size += sizeof(struct ipv6hdr);
>  done:
>  	*key_size = size;
>  	/*
> @@ -7187,18 +7247,31 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
>  			     u8 *key,
>  			     const struct ethtool_rx_flow_spec *fs)
>  {
> +	struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
>  	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
> +	struct ipv6hdr *v6_k = (struct ipv6hdr *)key;
>  	struct iphdr *v4_k = (struct iphdr *)key;
>  
> -	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> -	selector->length = sizeof(struct iphdr);
> +	if (has_ipv6(fs->flow_type)) {
> +		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
> +		selector->length = sizeof(struct ipv6hdr);
>  
> -	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> -	    fs->h_u.usr_ip4_spec.tos ||
> -	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> -		return -EOPNOTSUPP;
> +		if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
> +		    fs->h_u.usr_ip6_spec.tclass)
> +			return -EOPNOTSUPP;
>  
> -	parse_ip4(v4_m, v4_k, fs);
> +		parse_ip6(v6_m, v6_k, fs);
> +	} else {
> +		selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> +		selector->length = sizeof(struct iphdr);
> +
> +		if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> +		    fs->h_u.usr_ip4_spec.tos ||
> +		    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> +			return -EOPNOTSUPP;
> +
> +		parse_ip4(v4_m, v4_k, fs);
> +	}
>  
>  	return 0;
>  }
> -- 
> 2.50.1



* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-18 14:38 ` [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
@ 2025-11-18 21:55   ` Michael S. Tsirkin
  2025-11-19  6:26     ` Dan Jurgens
  2025-11-19  7:42   ` Michael S. Tsirkin
  2025-11-19  9:01   ` Michael S. Tsirkin
  2 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 21:55 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
> Classifiers can be used by more than one rule. If there is an existing
> classifier, use it instead of creating a new one.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4:
>     - Fixed typo in commit message
>     - for (int -> for (
> 
> v8:
>     - Removed unused num_classifiers. Jason Wang
> ---
>  drivers/net/virtio_net.c | 40 +++++++++++++++++++++++++++-------------
>  1 file changed, 27 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index de1a23c71449..f392ea30f2c7 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -299,7 +299,6 @@ struct virtnet_ff {
>  	struct virtio_net_ff_cap_mask_data *ff_mask;
>  	struct virtio_net_ff_actions *ff_actions;
>  	struct xarray classifiers;
> -	int num_classifiers;
>  	struct virtnet_ethtool_ff ethtool;
>  };
>  
> @@ -6827,6 +6826,7 @@ struct virtnet_ethtool_rule {
>  /* The classifier struct must be the last field in this struct */
>  struct virtnet_classifier {
>  	size_t size;
> +	refcount_t refcount;
>  	u32 id;
>  	struct virtio_net_resource_obj_ff_classifier classifier;
>  };
> @@ -6920,11 +6920,24 @@ static bool validate_mask(const struct virtnet_ff *ff,
>  	return false;
>  }
>  
> -static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> +static int setup_classifier(struct virtnet_ff *ff,
> +			    struct virtnet_classifier **c)
>  {
> +	struct virtnet_classifier *tmp;
> +	unsigned long i;
>  	int err;
>  
> -	err = xa_alloc(&ff->classifiers, &c->id, c,
> +	xa_for_each(&ff->classifiers, i, tmp) {
> +		if ((*c)->size == tmp->size &&
> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {

Note that classifier has padding bytes.
Comparing these with memcmp is not safe, is it?
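
To spell out the concern with a userspace sketch: memcmp only gives a
meaningful answer if the reserved bytes are known to be zero in both
objects, otherwise a field-wise comparison is needed. The struct layout
and function names below are assumptions made for the example, not the
actual uapi definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical selector header with explicit reserved (padding) bytes. */
struct sel_sketch {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;
	uint8_t reserved1[3];
};

/* Zero the whole object first so the reserved bytes are deterministic. */
void sel_init_sketch(struct sel_sketch *s, uint8_t type, uint8_t length)
{
	assert(s);
	memset(s, 0, sizeof(*s));
	s->type = type;
	s->length = length;
}

/* Compare only the meaningful fields, ignoring the reserved bytes. */
bool sel_fields_equal_sketch(const struct sel_sketch *a,
			     const struct sel_sketch *b)
{
	return a->type == b->type && a->flags == b->flags &&
	       a->length == b->length;
}
```

If every classifier is allocated zeroed and the reserved bytes are never
written, memcmp is fine, but that invariant should then be stated
somewhere.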


> +			refcount_inc(&tmp->refcount);
> +			kfree(*c);
> +			*c = tmp;
> +			goto out;
> +		}
> +	}
> +
> +	err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
>  		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
>  		       GFP_KERNEL);
>  	if (err)

What kind of locking prevents two threads racing in this code?


> @@ -6932,29 +6945,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
>  
>  	err = virtio_admin_obj_create(ff->vdev,
>  				      VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> -				      c->id,
> +				      (*c)->id,
>  				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
>  				      0,
> -				      &c->classifier,
> -				      c->size);
> +				      &(*c)->classifier,
> +				      (*c)->size);
>  	if (err)
>  		goto err_xarray;
>  
> +	refcount_set(&(*c)->refcount, 1);


So you insert an uninitialized refcount? Can't another thread find it
meanwhile?
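
The ordering I have in mind, as a deliberately simplified userspace
sketch: initialize the count before the object becomes reachable through
the lookup table. The table, struct and function names are invented for
the example, nothing here is thread-safe, and the device-side object
creation is not modeled.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_SLOTS 4

struct classifier_sketch {
	int refcount;
	int id;
};

/* Hypothetical lookup table standing in for the xarray. */
static struct classifier_sketch *slots[SKETCH_SLOTS];

/* Returns 0 on success, -1 if the table is full. */
int publish_classifier_sketch(struct classifier_sketch *c)
{
	int i;

	assert(c);

	/* Initialize the count before the object is visible to lookups. */
	c->refcount = 1;

	for (i = 0; i < SKETCH_SLOTS; i++) {
		if (!slots[i]) {
			c->id = i;
			slots[i] = c;
			return 0;
		}
	}

	return -1;
}

/* Look up by id and take a reference; NULL if the slot is empty. */
struct classifier_sketch *get_classifier_sketch(int id)
{
	struct classifier_sketch *c;

	if (id < 0 || id >= SKETCH_SLOTS)
		return NULL;

	c = slots[id];
	if (c)
		c->refcount++;

	return c;
}
```

If all callers are serialized by some lock then the order does not
matter, but in that case the commit log or a comment should say which
lock.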

> +out:
>  	return 0;
>  
>  err_xarray:
> -	xa_erase(&ff->classifiers, c->id);
> +	xa_erase(&ff->classifiers, (*c)->id);
>  
>  	return err;
>  }
>  
> -static void destroy_classifier(struct virtnet_ff *ff,
> -			       u32 classifier_id)
> +static void try_destroy_classifier(struct virtnet_ff *ff, u32 classifier_id)
>  {
>  	struct virtnet_classifier *c;
>  
>  	c = xa_load(&ff->classifiers, classifier_id);
> -	if (c) {
> +	if (c && refcount_dec_and_test(&c->refcount)) {
>  		virtio_admin_obj_destroy(ff->vdev,
>  					 VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
>  					 c->id,


Same questions about locking.


> @@ -6978,7 +6992,7 @@ static void destroy_ethtool_rule(struct virtnet_ff *ff,
>  				 0);
>  
>  	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> -	destroy_classifier(ff, eth_rule->classifier_id);
> +	try_destroy_classifier(ff, eth_rule->classifier_id);
>  	kfree(eth_rule);
>  }
>  
> @@ -7159,14 +7173,14 @@ static int build_and_insert(struct virtnet_ff *ff,
>  	if (err)
>  		goto err_key;
>  
> -	err = setup_classifier(ff, c);
> +	err = setup_classifier(ff, &c);
>  	if (err)
>  		goto err_classifier;
>  
>  	err = insert_rule(ff, eth_rule, c->id, key, key_size);
>  	if (err) {
>  		/* destroy_classifier will free the classifier */

"will free" is no longer correct, is it?

> -		destroy_classifier(ff, c->id);
> +		try_destroy_classifier(ff, c->id);
>  		goto err_key;
>  	}
>  
> -- 
> 2.50.1



* Re: [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps
  2025-11-18 14:38 ` [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
@ 2025-11-18 22:06   ` Michael S. Tsirkin
  2025-11-19  5:57     ` Dan Jurgens
  2025-11-18 23:03   ` Michael S. Tsirkin
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 22:06 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:55AM -0600, Daniel Jurgens wrote:
> When probing a virtnet device, attempt to read the flow filter
> capabilities. In order to use the feature the caps must also
> be set. For now setting what was read is sufficient.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> 
> ---
> v4:
>     - Validate the length in the selector caps
>     - Removed __free usage.
>     - Removed for(int.
> v5:
>     - Remove unneed () after MAX_SEL_LEN macro (test bot)
> v6:
>     - Fix sparse warning "array of flexible structures" Jakub K/Simon H
>     - Use new variable and validate ff_mask_size before set_cap. MST
> v7:
>     - Set ff->ff_{caps, mask, actions} NULL in error path. Paolo Abeni
>     - Return errors from virtnet_ff_init, -ENOTSUPP is not fatal. Xuan
> 
> v8:
>     - Use real_ff_mask_size when setting the selector caps. Jason Wang
> 
> v9:
>     - Set err after failed memory allocations. Simon Horman
> 
> v10:
>     - Return -EOPNOTSUPP in virnet_ff_init before allocing any memory.
>       Jason/Paolo.
> 
> v11:
>     - Return -EINVAL if any resource limit is 0. Simon Horman
>     - Ensure we don't overrun alloced space of ff->ff_mask by moving the
>       real_ff_mask_size > ff_mask_size check into the loop. Simon Horman
> ---
>  drivers/net/virtio_net.c           | 201 +++++++++++++++++++++++++++++
>  include/linux/virtio_admin.h       |   1 +
>  include/uapi/linux/virtio_net_ff.h |  91 +++++++++++++
>  3 files changed, 293 insertions(+)
>  create mode 100644 include/uapi/linux/virtio_net_ff.h
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index cfa006b88688..3615f45ac358 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -26,6 +26,9 @@
>  #include <net/netdev_rx_queue.h>
>  #include <net/netdev_queues.h>
>  #include <net/xdp_sock_drv.h>
> +#include <linux/virtio_admin.h>
> +#include <net/ipv6.h>
> +#include <net/ip.h>
>  
>  static int napi_weight = NAPI_POLL_WEIGHT;
>  module_param(napi_weight, int, 0444);
> @@ -281,6 +284,14 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
>  	VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
>  };
>  
> +struct virtnet_ff {
> +	struct virtio_device *vdev;
> +	bool ff_supported;
> +	struct virtio_net_ff_cap_data *ff_caps;
> +	struct virtio_net_ff_cap_mask_data *ff_mask;
> +	struct virtio_net_ff_actions *ff_actions;
> +};
> +
>  #define VIRTNET_Q_TYPE_RX 0
>  #define VIRTNET_Q_TYPE_TX 1
>  #define VIRTNET_Q_TYPE_CQ 2
> @@ -493,6 +504,8 @@ struct virtnet_info {
>  	struct failover *failover;
>  
>  	u64 device_stats_cap;
> +
> +	struct virtnet_ff ff;
>  };
>  
>  struct padded_vnet_hdr {
> @@ -6774,6 +6787,183 @@ static const struct xdp_metadata_ops virtnet_xdp_metadata_ops = {
>  	.xmo_rx_hash			= virtnet_xdp_rx_hash,
>  };
>  
> +static size_t get_mask_size(u16 type)
> +{
> +	switch (type) {
> +	case VIRTIO_NET_FF_MASK_TYPE_ETH:
> +		return sizeof(struct ethhdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV4:
> +		return sizeof(struct iphdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_IPV6:
> +		return sizeof(struct ipv6hdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_TCP:
> +		return sizeof(struct tcphdr);
> +	case VIRTIO_NET_FF_MASK_TYPE_UDP:
> +		return sizeof(struct udphdr);
> +	}
> +
> +	return 0;
> +}
> +
> +#define MAX_SEL_LEN (sizeof(struct ipv6hdr))
> +
> +static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
> +{
> +	size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
> +			      sizeof(struct virtio_net_ff_selector) *
> +			      VIRTIO_NET_FF_MASK_TYPE_MAX;
> +	struct virtio_admin_cmd_query_cap_id_result *cap_id_list;
> +	struct virtio_net_ff_selector *sel;
> +	size_t real_ff_mask_size;
> +	int err;
> +	int i;
> +
> +	if (!vdev->config->admin_cmd_exec)
> +		return -EOPNOTSUPP;
> +
> +	cap_id_list = kzalloc(sizeof(*cap_id_list), GFP_KERNEL);
> +	if (!cap_id_list)
> +		return -ENOMEM;
> +
> +	err = virtio_admin_cap_id_list_query(vdev, cap_id_list);
> +	if (err)
> +		goto err_cap_list;
> +
> +	if (!(VIRTIO_CAP_IN_LIST(cap_id_list,
> +				 VIRTIO_NET_FF_RESOURCE_CAP) &&
> +	      VIRTIO_CAP_IN_LIST(cap_id_list,
> +				 VIRTIO_NET_FF_SELECTOR_CAP) &&
> +	      VIRTIO_CAP_IN_LIST(cap_id_list,
> +				 VIRTIO_NET_FF_ACTION_CAP))) {
> +		err = -EOPNOTSUPP;
> +		goto err_cap_list;
> +	}
> +
> +	ff->ff_caps = kzalloc(sizeof(*ff->ff_caps), GFP_KERNEL);
> +	if (!ff->ff_caps) {
> +		err = -ENOMEM;
> +		goto err_cap_list;
> +	}
> +
> +	err = virtio_admin_cap_get(vdev,
> +				   VIRTIO_NET_FF_RESOURCE_CAP,
> +				   ff->ff_caps,
> +				   sizeof(*ff->ff_caps));
> +
> +	if (err)
> +		goto err_ff;
> +
> +	if (!ff->ff_caps->groups_limit ||
> +	    !ff->ff_caps->classifiers_limit ||
> +	    !ff->ff_caps->rules_limit ||
> +	    !ff->ff_caps->rules_per_group_limit) {
> +		err = -EINVAL;
> +		goto err_ff;
> +	}
> +
> +	/* VIRTIO_NET_FF_MASK_TYPE start at 1 */
> +	for (i = 1; i <= VIRTIO_NET_FF_MASK_TYPE_MAX; i++)
> +		ff_mask_size += get_mask_size(i);
> +
> +	ff->ff_mask = kzalloc(ff_mask_size, GFP_KERNEL);
> +	if (!ff->ff_mask) {
> +		err = -ENOMEM;
> +		goto err_ff;
> +	}
> +
> +	err = virtio_admin_cap_get(vdev,
> +				   VIRTIO_NET_FF_SELECTOR_CAP,
> +				   ff->ff_mask,
> +				   ff_mask_size);
> +
> +	if (err)
> +		goto err_ff_mask;
> +
> +	ff->ff_actions = kzalloc(sizeof(*ff->ff_actions) +
> +					VIRTIO_NET_FF_ACTION_MAX,
> +					GFP_KERNEL);
> +	if (!ff->ff_actions) {
> +		err = -ENOMEM;
> +		goto err_ff_mask;
> +	}
> +
> +	err = virtio_admin_cap_get(vdev,
> +				   VIRTIO_NET_FF_ACTION_CAP,
> +				   ff->ff_actions,
> +				   sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
> +
> +	if (err)
> +		goto err_ff_action;
> +
> +	err = virtio_admin_cap_set(vdev,
> +				   VIRTIO_NET_FF_RESOURCE_CAP,
> +				   ff->ff_caps,
> +				   sizeof(*ff->ff_caps));
> +	if (err)
> +		goto err_ff_action;
> +
> +	real_ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
> +	sel = (void *)&ff->ff_mask->selectors[0];

why the cast? discards type checks for no good reason I can see.

And &...[0] is unnecessarily funky. Just plain ff->ff_mask->selectors
will do.


> +
> +	for (i = 0; i < ff->ff_mask->count; i++) {
> +		if (sel->length > MAX_SEL_LEN) {
> +			err = -EINVAL;
> +			goto err_ff_action;
> +		}
> +		real_ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
> +		if (real_ff_mask_size > ff_mask_size) {
> +			err = -EINVAL;
> +			goto err_ff_action;
> +		}
> +		sel = (void *)sel + sizeof(*sel) + sel->length;

I guess the MAX_SEL_LEN check guarantees the allocation
is big enough? Let's add a BUG_ON just in case.
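
For reference, a conservative userspace sketch of such a walk, assuming
an 8-byte capability header followed by packed selectors with an 8-byte
header each. This is not the patch's code: walk_selectors() and the
constants are hypothetical, and unlike the loop above it also checks that
a selector header fits before reading its length byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CAP_HDR_LEN_SKETCH	8	/* count + 7 reserved bytes */
#define SEL_HDR_LEN_SKETCH	8	/* bytes before the mask in each selector */
#define MAX_SEL_LEN_SKETCH	40	/* size of an IPv6 header, the largest mask */

/*
 * Walk the selectors packed in 'buf', whose allocation is 'alloc_size'
 * bytes. Returns the number of bytes actually used, or -1 if a selector
 * is too long or the walk would step past the allocation.
 */
static long walk_selectors(const uint8_t *buf, size_t alloc_size)
{
	size_t used = CAP_HDR_LEN_SKETCH;
	uint8_t count, len;
	int i;

	assert(buf != NULL);
	if (alloc_size < used)
		return -1;
	count = buf[0];

	for (i = 0; i < count; i++) {
		/* The selector header must fit before its length is read. */
		if (alloc_size - used < SEL_HDR_LEN_SKETCH)
			return -1;
		len = buf[used + 4];	/* offset of 'length' in the header */
		if (len > MAX_SEL_LEN_SKETCH)
			return -1;
		/* The mask itself must fit as well. */
		if (alloc_size - used - SEL_HDR_LEN_SKETCH < len)
			return -1;
		used += SEL_HDR_LEN_SKETCH + len;
	}

	return (long)used;
}
```

With this ordering no read ever lands outside the allocation, so a
BUG_ON on the same conditions would be unreachable by construction.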

> +	}
> +
> +	err = virtio_admin_cap_set(vdev,
> +				   VIRTIO_NET_FF_SELECTOR_CAP,
> +				   ff->ff_mask,
> +				   real_ff_mask_size);
> +	if (err)
> +		goto err_ff_action;
> +
> +	err = virtio_admin_cap_set(vdev,
> +				   VIRTIO_NET_FF_ACTION_CAP,
> +				   ff->ff_actions,
> +				   sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
> +	if (err)
> +		goto err_ff_action;
> +
> +	ff->vdev = vdev;
> +	ff->ff_supported = true;
> +
> +	kfree(cap_id_list);
> +
> +	return 0;
> +
> +err_ff_action:
> +	kfree(ff->ff_actions);
> +	ff->ff_actions = NULL;
> +err_ff_mask:
> +	kfree(ff->ff_mask);
> +	ff->ff_mask = NULL;
> +err_ff:
> +	kfree(ff->ff_caps);
> +	ff->ff_caps = NULL;
> +err_cap_list:
> +	kfree(cap_id_list);
> +
> +	return err;
> +}
> +
> +static void virtnet_ff_cleanup(struct virtnet_ff *ff)
> +{
> +	if (!ff->ff_supported)
> +		return;
> +
> +	kfree(ff->ff_actions);
> +	kfree(ff->ff_mask);
> +	kfree(ff->ff_caps);
> +}
> +
>  static int virtnet_probe(struct virtio_device *vdev)
>  {
>  	int i, err = -ENOMEM;
> @@ -7137,6 +7327,15 @@ static int virtnet_probe(struct virtio_device *vdev)
>  	}
>  	vi->guest_offloads_capable = vi->guest_offloads;
>  
> +	/* Initialize flow filters. Not supported is an acceptable and common
> +	 * return code
> +	 */
> +	err = virtnet_ff_init(&vi->ff, vi->vdev);
> +	if (err && err != -EOPNOTSUPP) {
> +		rtnl_unlock();
> +		goto free_unregister_netdev;
> +	}
> +
>  	rtnl_unlock();
>  
>  	err = virtnet_cpu_notif_add(vi);
> @@ -7152,6 +7351,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>  
>  free_unregister_netdev:
>  	unregister_netdev(dev);
> +	virtnet_ff_cleanup(&vi->ff);
>  free_failover:
>  	net_failover_destroy(vi->failover);
>  free_vqs:
> @@ -7201,6 +7401,7 @@ static void virtnet_remove(struct virtio_device *vdev)
>  	virtnet_free_irq_moder(vi);
>  
>  	unregister_netdev(vi->dev);
> +	virtnet_ff_cleanup(&vi->ff);
>  
>  	net_failover_destroy(vi->failover);
>  
> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> index 039b996f73ec..db0f42346ca9 100644
> --- a/include/linux/virtio_admin.h
> +++ b/include/linux/virtio_admin.h
> @@ -3,6 +3,7 @@
>   * Header file for virtio admin operations
>   */
>  #include <uapi/linux/virtio_pci.h>
> +#include <uapi/linux/virtio_net_ff.h>
>  
>  #ifndef _LINUX_VIRTIO_ADMIN_H
>  #define _LINUX_VIRTIO_ADMIN_H


Why do it? Let net pull this header itself.


> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> new file mode 100644
> index 000000000000..bd7a194a9959
> --- /dev/null
> +++ b/include/uapi/linux/virtio_net_ff.h


you should document in commit log that you are adding
these types to UAPI, and which spec they are from.

> @@ -0,0 +1,91 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
> + *
> + * Header file for virtio_net flow filters
> + */
> +#ifndef _LINUX_VIRTIO_NET_FF_H
> +#define _LINUX_VIRTIO_NET_FF_H
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +
> +#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
> +#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
> +#define VIRTIO_NET_FF_ACTION_CAP 0x802
> +
> +/**
> + * struct virtio_net_ff_cap_data - Flow filter resource capability limits
> + * @groups_limit: maximum number of flow filter groups supported by the device
> + * @classifiers_limit: maximum number of classifiers supported by the device
> + * @rules_limit: maximum number of rules supported device-wide across all groups
> + * @rules_per_group_limit: maximum number of rules allowed in a single group
> + * @last_rule_priority: priority value associated with the lowest-priority rule
> + * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
> + *
> + * The limits are reported by the device and describe resource capacities for
> + * flow filters. Multi-byte fields are little-endian.
> + */
> +struct virtio_net_ff_cap_data {
> +	__le32 groups_limit;
> +	__le32 classifiers_limit;
> +	__le32 rules_limit;
> +	__le32 rules_per_group_limit;
> +	__u8 last_rule_priority;
> +	__u8 selectors_per_classifier_limit;


Ouch, this is a problem. There are 2 bytes of padding here.

This is a spec bug but I don't know if it is too late to fix.

Parav what do you think?
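
To make the padding concrete, a small userspace sketch that mirrors the
field order with plain fixed-width types. The struct and function names
are hypothetical, and cap_data_explicit is only one possible way to name
the trailing bytes, not something the spec or the patch defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field order as the patch, with plain fixed-width types. */
struct cap_data_sketch {
	uint32_t groups_limit;
	uint32_t classifiers_limit;
	uint32_t rules_limit;
	uint32_t rules_per_group_limit;
	uint8_t last_rule_priority;
	uint8_t selectors_per_classifier_limit;
};

/* Hypothetical variant that spells out the trailing bytes. */
struct cap_data_explicit {
	uint32_t groups_limit;
	uint32_t classifiers_limit;
	uint32_t rules_limit;
	uint32_t rules_per_group_limit;
	uint8_t last_rule_priority;
	uint8_t selectors_per_classifier_limit;
	uint8_t reserved[2];
};

/* Bytes the compiler appends after the last declared field. */
static size_t tail_padding(void)
{
	return sizeof(struct cap_data_sketch) -
	       (offsetof(struct cap_data_sketch,
			 selectors_per_classifier_limit) + sizeof(uint8_t));
}
```

On ABIs where uint32_t is 4-byte aligned, tail_padding() should come out
as 2 and both structs should have the same size. That matters because the
driver passes sizeof(*ff->ff_caps) as the capability length, so the
implicit bytes end up counted in what is exchanged with the device.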




> +};
> +
> +/**
> + * struct virtio_net_ff_selector - Selector mask descriptor
> + * @type: selector type, one of VIRTIO_NET_FF_MASK_TYPE_* constants
> + * @flags: selector flags, see VIRTIO_NET_FF_MASK_F_* constants
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @length: size in bytes of @mask
> + * @reserved1: must be set to 0 by the driver and ignored by the device
> + * @mask: variable-length mask payload for @type, length given by @length
> + *
> + * A selector describes a header mask that a classifier can apply. The format
> + * of @mask depends on @type.
> + */
> +struct virtio_net_ff_selector {
> +	__u8 type;
> +	__u8 flags;
> +	__u8 reserved[2];
> +	__u8 length;
> +	__u8 reserved1[3];
> +	__u8 mask[];
> +};
> +
> +#define VIRTIO_NET_FF_MASK_TYPE_ETH  1
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
> +#define VIRTIO_NET_FF_MASK_TYPE_TCP  4
> +#define VIRTIO_NET_FF_MASK_TYPE_UDP  5
> +#define VIRTIO_NET_FF_MASK_TYPE_MAX  VIRTIO_NET_FF_MASK_TYPE_UDP
> +
> +/**
> + * struct virtio_net_ff_cap_mask_data - Supported selector mask formats
> + * @count: number of entries in @selectors
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @selectors: array of supported selector descriptors
> + */
> +struct virtio_net_ff_cap_mask_data {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u8 selectors[];
> +};
> +#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
> +
> +#define VIRTIO_NET_FF_ACTION_DROP 1
> +#define VIRTIO_NET_FF_ACTION_RX_VQ 2
> +#define VIRTIO_NET_FF_ACTION_MAX  VIRTIO_NET_FF_ACTION_RX_VQ
> +/**
> + * struct virtio_net_ff_actions - Supported flow actions
> + * @count: number of supported actions in @actions
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @actions: array of action identifiers (VIRTIO_NET_FF_ACTION_*)
> + */
> +struct virtio_net_ff_actions {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u8 actions[];
> +};
> +#endif
> -- 
> 2.50.1



* Re: [PATCH net-next v11 04/12] virtio: Expose object create and destroy API
  2025-11-18 14:38 ` [PATCH net-next v11 04/12] virtio: Expose object create and destroy API Daniel Jurgens
@ 2025-11-18 22:14   ` Michael S. Tsirkin
  2025-11-19  3:29     ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 22:14 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:54AM -0600, Daniel Jurgens wrote:
> Object create and destroy were implemented specifically for dev parts
> device objects. Create general purpose APIs for use by upper layer
> drivers.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> 
> ---
> v4: Moved this logic from virtio_pci_modern to new file
>     virtio_admin_commands.
> v5: Added missing params, and synced names in comments (Alok Tiwari)
> ---
>  drivers/virtio/virtio_admin_commands.c | 75 ++++++++++++++++++++++++++
>  include/linux/virtio_admin.h           | 44 +++++++++++++++
>  2 files changed, 119 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
> index 94751d16b3c4..2b80548ba3bc 100644
> --- a/drivers/virtio/virtio_admin_commands.c
> +++ b/drivers/virtio/virtio_admin_commands.c
> @@ -88,3 +88,78 @@ int virtio_admin_cap_set(struct virtio_device *vdev,
>  	return err;
>  }
>  EXPORT_SYMBOL_GPL(virtio_admin_cap_set);
> +
> +int virtio_admin_obj_create(struct virtio_device *vdev,
> +			    u16 obj_type,
> +			    u32 obj_id,
> +			    u16 group_type,
> +			    u64 group_member_id,
> +			    const void *obj_specific_data,
> +			    size_t obj_specific_data_size)
> +{
> +	size_t data_size = sizeof(struct virtio_admin_cmd_resource_obj_create_data);
> +	struct virtio_admin_cmd_resource_obj_create_data *obj_create_data;
> +	struct virtio_admin_cmd cmd = {};
> +	struct scatterlist data_sg;
> +	void *data;
> +	int err;
> +
> +	if (!vdev->config->admin_cmd_exec)
> +		return -EOPNOTSUPP;
> +
> +	data_size += obj_specific_data_size;
> +	data = kzalloc(data_size, GFP_KERNEL);
> +	if (!data)
> +		return -ENOMEM;
> +
> +	obj_create_data = data;
> +	obj_create_data->hdr.type = cpu_to_le16(obj_type);
> +	obj_create_data->hdr.id = cpu_to_le32(obj_id);
> +	memcpy(obj_create_data->resource_obj_specific_data, obj_specific_data,
> +	       obj_specific_data_size);
> +	sg_init_one(&data_sg, data, data_size);
> +
> +	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_CREATE);
> +	cmd.group_type = cpu_to_le16(group_type);
> +	cmd.group_member_id = cpu_to_le64(group_member_id);
> +	cmd.data_sg = &data_sg;
> +
> +	err = vdev->config->admin_cmd_exec(vdev, &cmd);
> +	kfree(data);
> +
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(virtio_admin_obj_create);
> +
> +int virtio_admin_obj_destroy(struct virtio_device *vdev,
> +			     u16 obj_type,
> +			     u32 obj_id,
> +			     u16 group_type,
> +			     u64 group_member_id)

what's the point of making it int when none of the callers
check the return value?

> +{
> +	struct virtio_admin_cmd_resource_obj_cmd_hdr *data;
> +	struct virtio_admin_cmd cmd = {};
> +	struct scatterlist data_sg;
> +	int err;
> +
> +	if (!vdev->config->admin_cmd_exec)
> +		return -EOPNOTSUPP;
> +
> +	data = kzalloc(sizeof(*data), GFP_KERNEL);
> +	if (!data)
> +		return -ENOMEM;
> +
> +	data->type = cpu_to_le16(obj_type);
> +	data->id = cpu_to_le32(obj_id);
> +	sg_init_one(&data_sg, data, sizeof(*data));
> +	cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_DESTROY);
> +	cmd.group_type = cpu_to_le16(group_type);
> +	cmd.group_member_id = cpu_to_le64(group_member_id);
> +	cmd.data_sg = &data_sg;
> +
> +	err = vdev->config->admin_cmd_exec(vdev, &cmd);
> +	kfree(data);
> +
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(virtio_admin_obj_destroy);
> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> index 36df97b6487a..039b996f73ec 100644
> --- a/include/linux/virtio_admin.h
> +++ b/include/linux/virtio_admin.h
> @@ -77,4 +77,48 @@ int virtio_admin_cap_set(struct virtio_device *vdev,
>  			 const void *caps,
>  			 size_t cap_size);
>  
> +/**
> + * virtio_admin_obj_create - Create an object on a virtio device
> + * @vdev: the virtio device
> + * @obj_type: type of object to create
> + * @obj_id: ID for the new object
> + * @group_type: administrative group type for the operation
> + * @group_member_id: member identifier within the administrative group
> + * @obj_specific_data: object-specific data for creation
> + * @obj_specific_data_size: size of the object-specific data in bytes
> + *
> + * Creates a new object on the virtio device with the specified type and ID.
> + * The object may require object-specific data for proper initialization.
> + *
> + * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
> + * operations or object creation, or a negative error code on other failures.
> + */
> +int virtio_admin_obj_create(struct virtio_device *vdev,
> +			    u16 obj_type,
> +			    u32 obj_id,
> +			    u16 group_type,
> +			    u64 group_member_id,
> +			    const void *obj_specific_data,
> +			    size_t obj_specific_data_size);
> +
> +/**
> + * virtio_admin_obj_destroy - Destroy an object on a virtio device
> + * @vdev: the virtio device
> + * @obj_type: type of object to destroy
> + * @obj_id: ID of the object to destroy
> + * @group_type: administrative group type for the operation
> + * @group_member_id: member identifier within the administrative group
> + *
> + * Destroys an existing object on the virtio device with the specified type
> + * and ID.
> + *
> + * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
> + * operations or object destruction, or a negative error code on other failures.
> + */
> +int virtio_admin_obj_destroy(struct virtio_device *vdev,
> +			     u16 obj_type,
> +			     u32 obj_id,
> +			     u16 group_type,
> +			     u64 group_member_id);
> +
>  #endif /* _LINUX_VIRTIO_ADMIN_H */
> -- 
> 2.50.1



* Re: [PATCH net-next v11 12/12] virtio_net: Add get ethtool flow rules ops
  2025-11-18 14:39 ` [PATCH net-next v11 12/12] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
  2025-11-18 18:49   ` Michael S. Tsirkin
@ 2025-11-18 22:39   ` Michael S. Tsirkin
  1 sibling, 0 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 22:39 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:39:02AM -0600, Daniel Jurgens wrote:
> @@ -5665,6 +5672,28 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
>  	return vi->curr_queue_pairs;
>  }
>  
> +static int virtnet_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info, u32 *rule_locs)
> +{
> +	struct virtnet_info *vi = netdev_priv(dev);
> +	int rc = 0;

not sure you should initialize rc here, btw: it is set on all paths,
and if you leave it uninitialized the compiler will warn if
we add a clause and forget to set rc.
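
A minimal sketch of that pattern, with hypothetical command values and
dummy results standing in for the real ethtool handlers. Whether a
warning actually fires for a forgotten assignment depends on the compiler
and flags (for example GCC's -Wmaybe-uninitialized or clang's
-Wsometimes-uninitialized), so treat that part as an assumption.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command values; the real ones are ETHTOOL_GRXCLS*. */
enum cmd_sketch { CMD_COUNT, CMD_RULE, CMD_ALL };

static int dispatch_sketch(int cmd)
{
	int rc;	/* deliberately left uninitialized */

	switch (cmd) {
	case CMD_COUNT:
		rc = 0;		/* stand-in for the flow count handler */
		break;
	case CMD_RULE:
		rc = 0;		/* stand-in for the single rule handler */
		break;
	case CMD_ALL:
		rc = 0;		/* stand-in for the all-rules handler */
		break;
	default:
		rc = -EOPNOTSUPP;
	}

	/* Every path above assigns rc, so this never reads garbage. */
	return rc;
}
```

If a new case is later added without an assignment, the compiler has a
chance to flag the read of rc, which an initial "= 0" would hide.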

> +
> +	switch (info->cmd) {
> +	case ETHTOOL_GRXCLSRLCNT:
> +		rc = virtnet_ethtool_get_flow_count(&vi->ff, info);
> +		break;
> +	case ETHTOOL_GRXCLSRULE:
> +		rc = virtnet_ethtool_get_flow(&vi->ff, info);
> +		break;
> +	case ETHTOOL_GRXCLSRLALL:
> +		rc = virtnet_ethtool_get_all_flows(&vi->ff, info, rule_locs);
> +		break;
> +	default:
> +		rc = -EOPNOTSUPP;
> +	}
> +
> +	return rc;
> +}
> +
>  static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
>  {
>  	struct virtnet_info *vi = netdev_priv(dev);



* Re: [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps
  2025-11-18 14:38 ` [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
  2025-11-18 22:06   ` Michael S. Tsirkin
@ 2025-11-18 23:03   ` Michael S. Tsirkin
  2025-11-19  4:27     ` Dan Jurgens
  2025-11-19  7:53   ` Michael S. Tsirkin
  2025-11-19  7:55   ` Michael S. Tsirkin
  3 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-18 23:03 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:55AM -0600, Daniel Jurgens wrote:
> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> new file mode 100644
> index 000000000000..bd7a194a9959
> --- /dev/null
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -0,0 +1,91 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
> + *
> + * Header file for virtio_net flow filters
> + */
> +#ifndef _LINUX_VIRTIO_NET_FF_H
> +#define _LINUX_VIRTIO_NET_FF_H
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>


I do not get why you are pulling linux/kernel.h here.

include/uapi/linux/virtio_pci.h does it too,
and I think it's also a bug.

No other uapi header does this. It happens not to break userspace
because userspace puts a completely unrelated header at
the same path: uapi/linux/kernel.h.



> +
> +#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
> +#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
> +#define VIRTIO_NET_FF_ACTION_CAP 0x802
> +
> +/**
> + * struct virtio_net_ff_cap_data - Flow filter resource capability limits
> + * @groups_limit: maximum number of flow filter groups supported by the device
> + * @classifiers_limit: maximum number of classifiers supported by the device
> + * @rules_limit: maximum number of rules supported device-wide across all groups
> + * @rules_per_group_limit: maximum number of rules allowed in a single group
> + * @last_rule_priority: priority value associated with the lowest-priority rule
> + * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
> + *
> + * The limits are reported by the device and describe resource capacities for
> + * flow filters. Multi-byte fields are little-endian.
> + */
> +struct virtio_net_ff_cap_data {
> +	__le32 groups_limit;
> +	__le32 classifiers_limit;
> +	__le32 rules_limit;
> +	__le32 rules_per_group_limit;
> +	__u8 last_rule_priority;
> +	__u8 selectors_per_classifier_limit;
> +};
> +
> +/**
> + * struct virtio_net_ff_selector - Selector mask descriptor
> + * @type: selector type, one of VIRTIO_NET_FF_MASK_TYPE_* constants
> + * @flags: selector flags, see VIRTIO_NET_FF_MASK_F_* constants
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @length: size in bytes of @mask
> + * @reserved1: must be set to 0 by the driver and ignored by the device
> + * @mask: variable-length mask payload for @type, length given by @length
> + *
> + * A selector describes a header mask that a classifier can apply. The format
> + * of @mask depends on @type.
> + */
> +struct virtio_net_ff_selector {
> +	__u8 type;
> +	__u8 flags;
> +	__u8 reserved[2];
> +	__u8 length;
> +	__u8 reserved1[3];
> +	__u8 mask[];
> +};
> +
> +#define VIRTIO_NET_FF_MASK_TYPE_ETH  1
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
> +#define VIRTIO_NET_FF_MASK_TYPE_TCP  4
> +#define VIRTIO_NET_FF_MASK_TYPE_UDP  5
> +#define VIRTIO_NET_FF_MASK_TYPE_MAX  VIRTIO_NET_FF_MASK_TYPE_UDP
> +
> +/**
> + * struct virtio_net_ff_cap_mask_data - Supported selector mask formats
> + * @count: number of entries in @selectors
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @selectors: array of supported selector descriptors
> + */
> +struct virtio_net_ff_cap_mask_data {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u8 selectors[];
> +};
> +#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
> +
> +#define VIRTIO_NET_FF_ACTION_DROP 1
> +#define VIRTIO_NET_FF_ACTION_RX_VQ 2
> +#define VIRTIO_NET_FF_ACTION_MAX  VIRTIO_NET_FF_ACTION_RX_VQ
> +/**
> + * struct virtio_net_ff_actions - Supported flow actions
> + * @count: number of supported actions in @actions
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @actions: array of action identifiers (VIRTIO_NET_FF_ACTION_*)
> + */
> +struct virtio_net_ff_actions {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u8 actions[];
> +};
> +#endif
> -- 
> 2.50.1



* Re: [PATCH net-next v11 03/12] virtio: Expose generic device capability operations
  2025-11-18 21:42   ` Michael S. Tsirkin
@ 2025-11-19  3:27     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  3:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 3:42 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:53AM -0600, Daniel Jurgens wrote:

>> +#ifndef _LINUX_VIRTIO_ADMIN_H
>> +#define _LINUX_VIRTIO_ADMIN_H
> 
> 
> Guards normally come before #include - there is no
> point in pulling in uapi/linux/virtio_pci.h - just
> extra work for the compiler.
> 
> 

Removed the include.

> 
>> +
>> +struct virtio_device;

>> + */
>> +#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
>> +	(!!(1 & (le64_to_cpu(cap_list->supported_caps[cap / 64]) >> cap % 64)))
> 
> while this works if cap is a variable, it will behave
> unexpectedly if cap or even cap_list is an expression.
> 
> A standard practice is to put all macro arguments in brackets:
> (!!(1 & (le64_to_cpu((cap_list)->supported_caps[(cap) / 64]) >> (cap) % 64)))
> 
> 

done
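
For the record, a small userspace sketch of how a bare macro argument
goes wrong once the caller passes an expression. All names here are
hypothetical, the list uses host-endian uint64_t words, and le64_to_cpu()
is left out to keep the example standalone.

```c
#include <assert.h>
#include <stdint.h>

#define CAP_WORDS_SKETCH 64

struct cap_list_sketch {
	uint64_t supported_caps[CAP_WORDS_SKETCH];
};

/* Word index with the argument used bare, as in the posted macro. */
#define CAP_WORD_BARE(cap) (cap / 64)
/* Word index with the argument parenthesized. */
#define CAP_WORD_SAFE(cap) ((cap) / 64)

/* Fully parenthesized membership test. */
#define CAP_IN_LIST_SAFE(cap_list, cap) \
	(!!(1 & ((cap_list)->supported_caps[(cap) / 64] >> ((cap) % 64))))

/* Expands to base + (64 / 64), which is not the intended word index. */
static unsigned int word_bare(unsigned int base)
{
	return CAP_WORD_BARE(base + 64);
}

/* Expands to (base + 64) / 64, the intended word index. */
static unsigned int word_safe(unsigned int base)
{
	return CAP_WORD_SAFE(base + 64);
}

/* Tests capability number base + 64 with the parenthesized macro. */
static int cap_present(const struct cap_list_sketch *l, unsigned int base)
{
	return CAP_IN_LIST_SAFE(l, base + 64);
}
```

With base = 2 the bare form indexes word 3 while the parenthesized form
indexes word 1, which is where capability 66 actually lives.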

> 
> 
> 
>> +

>>  #define VIRTIO_DEV_PARTS_CAP 0x0000
>>  
>> +/* Update this value to largest implemented cap number. */
> 
> implemented by what?

Removed the comment.

> 
>> +#define VIRTIO_ADMIN_MAX_CAP 0x0fff
>> +

>> -#define MAX_CAP_ID __KERNEL_DIV_ROUND_UP(VIRTIO_DEV_PARTS_CAP + 1, 64)
>> +#define VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE __KERNEL_DIV_ROUND_UP(VIRTIO_ADMIN_MAX_CAP, 64)
> 
> Don't you mean VIRTIO_ADMIN_MAX_CAP + 1 here?
> E.g. if VIRTIO_ADMIN_MAX_CAP was 0 we would need space for 1 capability,
> right?
> 

Added the +1; it's the same result either way here.
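
A quick sketch of the arithmetic, assuming the usual round-up division
((n) + (d) - 1) / (d). The macro and function names are hypothetical.

```c
#include <assert.h>

#define DIV_ROUND_UP_SKETCH(n, d) (((n) + (d) - 1) / (d))

/* Words needed to hold bits 0..max_cap inclusive, i.e. max_cap + 1 bits. */
static unsigned int words_with_plus_one(unsigned int max_cap)
{
	return DIV_ROUND_UP_SKETCH(max_cap + 1, 64);
}

/* The sizing as posted in v11, which rounds up max_cap itself. */
static unsigned int words_as_posted(unsigned int max_cap)
{
	return DIV_ROUND_UP_SKETCH(max_cap, 64);
}
```

For max_cap = 0x0fff both give 64 words. They differ whenever max_cap is
a multiple of 64: for 0 the posted form yields 0 words instead of 1, and
for 64 it yields 1 instead of 2.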

>>  
>>  struct virtio_admin_cmd_query_cap_id_result {
>> -	__le64 supported_caps[MAX_CAP_ID];
>> +	__le64 supported_caps[VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE];
>>  };
>>  
> 
> I feel it's worth explaining in commit log you are changing a
> uapi structure, and explaining that it is safe.
> 

Done

> 
>>  struct virtio_admin_cmd_cap_get_data {
>> -- 
>> 2.50.1
> 



* Re: [PATCH net-next v11 04/12] virtio: Expose object create and destroy API
  2025-11-18 22:14   ` Michael S. Tsirkin
@ 2025-11-19  3:29     ` Dan Jurgens
  2025-11-19  6:39       ` Michael S. Tsirkin
  0 siblings, 1 reply; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  3:29 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 4:14 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:54AM -0600, Daniel Jurgens wrote:

>> +int virtio_admin_obj_destroy(struct virtio_device *vdev,
>> +			     u16 obj_type,
>> +			     u32 obj_id,
>> +			     u16 group_type,
>> +			     u64 group_member_id)
> 
> what's the point of making it int when none of the callers
> check the return value?
> 

It's an API, and return codes are available. I don't have a use for them
in this series but perhaps a future user will.



* Re: [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps
  2025-11-18 23:03   ` Michael S. Tsirkin
@ 2025-11-19  4:27     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  4:27 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 5:03 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:55AM -0600, Daniel Jurgens wrote:
>> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
>> new file mode 100644
>> index 000000000000..bd7a194a9959
>> --- /dev/null
>> +++ b/include/uapi/linux/virtio_net_ff.h
>> @@ -0,0 +1,91 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
>> + *
>> + * Header file for virtio_net flow filters
>> + */
>> +#ifndef _LINUX_VIRTIO_NET_FF_H
>> +#define _LINUX_VIRTIO_NET_FF_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/kernel.h>
> 
> 
> I do not get why you are pulling linux/kernel.h here.
> 
> include/uapi/linux/virtio_pci.h does it too,
> and I think it's also a bug.
> 
> No other uapi header does this, it happens not to break userspace
> because userspace puts a completely unrelated header at
> the same path - uapi/linux/kernel.h .
> 

Removed it. Previously I had added the FF definitions to virtio_pci.h; I
guess I kept that include when I split this out.


* Re: [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps
  2025-11-18 22:06   ` Michael S. Tsirkin
@ 2025-11-19  5:57     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  5:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 4:06 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:55AM -0600, Daniel Jurgens wrote:
>> When probing a virtnet device, attempt to read the flow filter

>> +	real_ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
>> +	sel = (void *)&ff->ff_mask->selectors[0];
> 
> why the cast? discards type checks for no good reason I can see.
> 

Jakub and Simon both requested I change the type of ff_mask->selectors
to u8 [], to eliminate sparse warnings about array of flexible
structures. So a cast is needed here.

> And &...[0] is unnecessarily funky. Just plain ff->ff_mask->selectors
> will do.
> 

Done

> 
>> +
>> +	for (i = 0; i < ff->ff_mask->count; i++) {
>> +		if (sel->length > MAX_SEL_LEN) {
>> +			err = -EINVAL;
>> +			goto err_ff_action;
>> +		}
>> +		real_ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
>> +		if (real_ff_mask_size > ff_mask_size) {
>> +			err = -EINVAL;
>> +			goto err_ff_action;
>> +		}
>> +		sel = (void *)sel + sizeof(*sel) + sel->length;
> 
> I guess the MAX_SEL_LEN check guarantees the allocation
> is big enough? Let's add a BUG_ON just in case.

Added BUG_ON in both error cases.

> 


>>  
>>  #ifndef _LINUX_VIRTIO_ADMIN_H
>>  #define _LINUX_VIRTIO_ADMIN_H
> 
> 
> Why do it? Let net pull this header itself.

Done

> 
> 
>> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
>> new file mode 100644
>> index 000000000000..bd7a194a9959
>> --- /dev/null
>> +++ b/include/uapi/linux/virtio_net_ff.h
> 
> 
> you should document in commit log that you are adding
> these types to UAPI, and which spec they are from.

Done

> 
>> @@ -0,0 +1,91 @@
>> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
>> + *
>> + * Header file for virtio_net flow filters
>> + */
>> +#ifndef _LINUX_VIRTIO_NET_FF_H
>> +#define _LINUX_VIRTIO_NET_FF_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/kernel.h>
>> +
>> +#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
>> +#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
>> +#define VIRTIO_NET_FF_ACTION_CAP 0x802
>> +
>> +/**
>> + * struct virtio_net_ff_cap_data - Flow filter resource capability limits
>> + * @groups_limit: maximum number of flow filter groups supported by the device
>> + * @classifiers_limit: maximum number of classifiers supported by the device
>> + * @rules_limit: maximum number of rules supported device-wide across all groups
>> + * @rules_per_group_limit: maximum number of rules allowed in a single group
>> + * @last_rule_priority: priority value associated with the lowest-priority rule
>> + * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
>> + *
>> + * The limits are reported by the device and describe resource capacities for
>> + * flow filters. Multi-byte fields are little-endian.
>> + */
>> +struct virtio_net_ff_cap_data {
>> +	__le32 groups_limit;
>> +	__le32 classifiers_limit;
>> +	__le32 rules_limit;
>> +	__le32 rules_per_group_limit;
>> +	__u8 last_rule_priority;
>> +	__u8 selectors_per_classifier_limit;
> 
> 
> Ouch this is a problem. There is a 2 byte padding here.
> 
> This is a spec bug but I don't know if it is too late to fix.
> 
> Parav what do you think?
> 
> 
> 
> 
>> +};
>> +
>> +/**
>> + * struct virtio_net_ff_selector - Selector mask descriptor
>> + * @type: selector type, one of VIRTIO_NET_FF_MASK_TYPE_* constants
>> + * @flags: selector flags, see VIRTIO_NET_FF_MASK_F_* constants
>> + * @reserved: must be set to 0 by the driver and ignored by the device
>> + * @length: size in bytes of @mask
>> + * @reserved1: must be set to 0 by the driver and ignored by the device
>> + * @mask: variable-length mask payload for @type, length given by @length
>> + *
>> + * A selector describes a header mask that a classifier can apply. The format
>> + * of @mask depends on @type.
>> + */
>> +struct virtio_net_ff_selector {
>> +	__u8 type;
>> +	__u8 flags;
>> +	__u8 reserved[2];
>> +	__u8 length;
>> +	__u8 reserved1[3];
>> +	__u8 mask[];
>> +};
>> +
>> +#define VIRTIO_NET_FF_MASK_TYPE_ETH  1
>> +#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
>> +#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
>> +#define VIRTIO_NET_FF_MASK_TYPE_TCP  4
>> +#define VIRTIO_NET_FF_MASK_TYPE_UDP  5
>> +#define VIRTIO_NET_FF_MASK_TYPE_MAX  VIRTIO_NET_FF_MASK_TYPE_UDP
>> +
>> +/**
>> + * struct virtio_net_ff_cap_mask_data - Supported selector mask formats
>> + * @count: number of entries in @selectors
>> + * @reserved: must be set to 0 by the driver and ignored by the device
>> + * @selectors: array of supported selector descriptors
>> + */
>> +struct virtio_net_ff_cap_mask_data {
>> +	__u8 count;
>> +	__u8 reserved[7];
>> +	__u8 selectors[];
>> +};
>> +#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
>> +
>> +#define VIRTIO_NET_FF_ACTION_DROP 1
>> +#define VIRTIO_NET_FF_ACTION_RX_VQ 2
>> +#define VIRTIO_NET_FF_ACTION_MAX  VIRTIO_NET_FF_ACTION_RX_VQ
>> +/**
>> + * struct virtio_net_ff_actions - Supported flow actions
>> + * @count: number of supported actions in @actions
>> + * @reserved: must be set to 0 by the driver and ignored by the device
>> + * @actions: array of action identifiers (VIRTIO_NET_FF_ACTION_*)
>> + */
>> +struct virtio_net_ff_actions {
>> +	__u8 count;
>> +	__u8 reserved[7];
>> +	__u8 actions[];
>> +};
>> +#endif
>> -- 
>> 2.50.1
> 
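
For illustration, the tail padding raised in the review of struct virtio_net_ff_cap_data can be measured with a small userspace sketch. The struct below copies the field list from the quoted patch but uses uint32_t/uint8_t in place of __le32/__u8; the struct and helper names are assumptions made for this example, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for struct virtio_net_ff_cap_data from the quoted patch;
 * uint32_t/uint8_t replace __le32/__u8 (an assumption made for portability). */
struct ff_cap_data_sketch {
	uint32_t groups_limit;
	uint32_t classifiers_limit;
	uint32_t rules_limit;
	uint32_t rules_per_group_limit;
	uint8_t last_rule_priority;
	uint8_t selectors_per_classifier_limit;
};

/* Bytes occupied by named fields: four 32-bit words plus two single bytes. */
static size_t ff_cap_data_field_bytes(void)
{
	return 4 * sizeof(uint32_t) + 2 * sizeof(uint8_t);
}

/* Bytes the compiler appends after the last field so that the struct size
 * stays a multiple of its alignment (4 on ABIs that align uint32_t to 4). */
static size_t ff_cap_data_tail_padding(void)
{
	return sizeof(struct ff_cap_data_sketch) - ff_cap_data_field_bytes();
}
```

On common ABIs the named fields cover 18 bytes while sizeof reports 20, which is the 2-byte gap the review points at. An explicit reserved[2] member would make those bytes part of the documented layout.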


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules
  2025-11-18 19:01   ` Michael S. Tsirkin
@ 2025-11-19  6:07     ` Dan Jurgens
  2025-11-19  9:20     ` Michael S. Tsirkin
  1 sibling, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  6:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 1:01 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:57AM -0600, Daniel Jurgens wrote:
>> Filtering a flow requires a classifier to match the packets, and a rule

>> +		       size_t key_size)
>> +	ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
>> +	ff_rule->classifier_id = cpu_to_le32(classifier_id);
>> +	ff_rule->key_length = (u8)key_size;
> 
> I don't think you need this cast. 
> 
> BTW why do you insist on making all this math in size_t variables?
> 
> Just u8 should do, and in calculate_flow_sizes you can do a BUG_ON to check
> it does not overflow.
> 

Ok
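
A minimal sketch of the suggestion above, assuming a userspace setting: the header sizes are summed with an explicit range check, so the result can be stored in a u8 without a cast at the point of use. key_length_u8() is a hypothetical helper; where it returns false, kernel code might use BUG_ON as proposed in the review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: sum header sizes and narrow to 8 bits only if the
 * total fits. Returns false when the sum would exceed UINT8_MAX. */
static bool key_length_u8(const size_t *hdr_sizes, size_t num_hdrs,
			  uint8_t *out)
{
	size_t total = 0;

	for (size_t i = 0; i < num_hdrs; i++) {
		/* total never exceeds UINT8_MAX here, so the subtraction is safe */
		if (hdr_sizes[i] > UINT8_MAX - total)
			return false;
		total += hdr_sizes[i];
	}
	*out = (uint8_t)total;
	return true;
}
```

With Ethernet (14 bytes), IPv4 (20 bytes) and TCP (20 bytes) headers the key is 54 bytes, well inside the 8-bit range.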

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-18 21:55   ` Michael S. Tsirkin
@ 2025-11-19  6:26     ` Dan Jurgens
  2025-11-19  6:35       ` Michael S. Tsirkin
  0 siblings, 1 reply; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  6:26 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 3:55 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
>> Classifiers can be used by more than one rule. If there is an existing
>> classifier, use it instead of creating a new one.

>> +	struct virtnet_classifier *tmp;
>> +	unsigned long i;
>>  	int err;
>>  
>> -	err = xa_alloc(&ff->classifiers, &c->id, c,
>> +	xa_for_each(&ff->classifiers, i, tmp) {
>> +		if ((*c)->size == tmp->size &&
>> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
> 
> note that classifier has padding bytes.
> comparing these with memcmp is not safe, is it?

The reserved bytes are set to 0, this is fine.

> 
> 
>> +			refcount_inc(&tmp->refcount);
>> +			kfree(*c);
>> +			*c = tmp;
>> +			goto out;
>> +		}
>> +	}
>> +
>> +	err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
>>  		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
>>  		       GFP_KERNEL);
>>  	if (err)
> 
> what kind of locking prevents two threads racing in this code?

The ethtool calls happen under rtnl_lock.

> 
> 
>> @@ -6932,29 +6945,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
>>  		      (*c)->size);
>>  	if (err)
>>  		goto err_xarray;
>>  
>> +	refcount_set(&(*c)->refcount, 1);
> 
> 
> so you insert uninitialized refcount? can't another thread find it
> meanwhile?

Again, rtnl_lock.


>>  
>>  	err = insert_rule(ff, eth_rule, c->id, key, key_size);
>>  	if (err) {
>>  		/* destroy_classifier will free the classifier */
> 
> will free is no longer correct, is it?

Clarified the comment.

> 
>> -		destroy_classifier(ff, c->id);
>> +		try_destroy_classifier(ff, c->id);
>>  		goto err_key;
>>  	}
>>  
>> -- 
>> 2.50.1
> 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-19  6:26     ` Dan Jurgens
@ 2025-11-19  6:35       ` Michael S. Tsirkin
  2025-11-19  7:18         ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  6:35 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Wed, Nov 19, 2025 at 12:26:23AM -0600, Dan Jurgens wrote:
> On 11/18/25 3:55 PM, Michael S. Tsirkin wrote:
> > On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
> >> Classifiers can be used by more than one rule. If there is an existing
> >> classifier, use it instead of creating a new one.
> 
> >> +	struct virtnet_classifier *tmp;
> >> +	unsigned long i;
> >>  	int err;
> >>  
> >> -	err = xa_alloc(&ff->classifiers, &c->id, c,
> >> +	xa_for_each(&ff->classifiers, i, tmp) {
> >> +		if ((*c)->size == tmp->size &&
> >> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
> > 
> > note that classifier has padding bytes.
> > comparing these with memcmp is not safe, is it?
> 
> The reserved bytes are set to 0, this is fine.

I mean the compiler padding.  set to 0 where?

> > 
> > 
> >> +			refcount_inc(&tmp->refcount);
> >> +			kfree(*c);
> >> +			*c = tmp;
> >> +			goto out;
> >> +		}
> >> +	}
> >> +
> >> +	err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
> >>  		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
> >>  		       GFP_KERNEL);
> >>  	if (err)
> > 
> > what kind of locking prevents two threads racing in this code?
> 
> The ethtool calls happen under rtnl_lock.
> 
> > 
> > 
> >> @@ -6932,29 +6945,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> >>  		      (*c)->size);
> >>  	if (err)
> >>  		goto err_xarray;
> >>  
> >> +	refcount_set(&(*c)->refcount, 1);
> > 
> > 
> > so you insert uninitialized refcount? can't another thread find it
> > meanwhile?
> 
> Again, rtnl_lock.
> 
> 
> >>  
> >>  	err = insert_rule(ff, eth_rule, c->id, key, key_size);
> >>  	if (err) {
> >>  		/* destroy_classifier will free the classifier */
> > 
> > will free is no longer correct, is it?
> 
> Clarified the comment.
> 
> > 
> >> -		destroy_classifier(ff, c->id);
> >> +		try_destroy_classifier(ff, c->id);
> >>  		goto err_key;
> >>  	}
> >>  
> >> -- 
> >> 2.50.1
> > 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 04/12] virtio: Expose object create and destroy API
  2025-11-19  3:29     ` Dan Jurgens
@ 2025-11-19  6:39       ` Michael S. Tsirkin
  2025-11-19  7:21         ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  6:39 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 09:29:05PM -0600, Dan Jurgens wrote:
> On 11/18/25 4:14 PM, Michael S. Tsirkin wrote:
> > On Tue, Nov 18, 2025 at 08:38:54AM -0600, Daniel Jurgens wrote:
> 
> >> +int virtio_admin_obj_destroy(struct virtio_device *vdev,
> >> +			     u16 obj_type,
> >> +			     u32 obj_id,
> >> +			     u16 group_type,
> >> +			     u64 group_member_id)
> > 
> > what's the point of making it int when none of the callers
> > check the return type?
> > 
> 
> It's an API, and return codes are available. I don't have a use for them
> in this series but perhaps a future user will.

For starters let's address the existing use which wants it to never fail.
I would say do something with an error inside the function.
Maybe just WARN_ON_ONCE.



-- 
MST


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-18 21:31   ` Michael S. Tsirkin
@ 2025-11-19  7:03     ` Dan Jurgens
  2025-11-19  7:06       ` Michael S. Tsirkin
  2025-11-19  9:18     ` Michael S. Tsirkin
  1 sibling, 1 reply; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  7:03 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 3:31 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:59AM -0600, Daniel Jurgens wrote:
>> Add support for IP_USER type rules from ethtool.
>>
>> +static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>> +		      const struct ethtool_rx_flow_spec *fs)
>> +{
>> +	const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
>> +	const struct ethtool_usrip4_spec *l3_val  = &fs->h_u.usr_ip4_spec;
>> +
>> +	mask->saddr = l3_mask->ip4src;
>> +	mask->daddr = l3_mask->ip4dst;
>> +	key->saddr = l3_val->ip4src;
>> +	key->daddr = l3_val->ip4dst;
>> +
>> +	if (l3_mask->proto) {
> 
> you seem to check mask for proto here but the ethtool_usrip4_spec doc
> seems to say the mask for proto must be 0. 
> 
> 
> what gives?
> 

Then for user_ip flows ethtool should provide 0 as the mask, and based
on your comment below I'm verifying that.

I can move this hunk to the TCP/UDP patch if you prefer.

> 
>> +		mask->protocol = l3_mask->proto;
>> +		key->protocol = l3_val->proto;
>> +	}
>> +}

>> +	size_t size = sizeof(struct ethhdr);
>> +
>>  	*num_hdrs = 1;
>>  	*key_size = sizeof(struct ethhdr);
> 
> So *key_size  is assigned here ...
> 
>> +
>> +	if (fs->flow_type == ETHER_FLOW)
>> +		goto done;
>> +
>> +	++(*num_hdrs);
>> +	if (has_ipv4(fs->flow_type))
>> +		size += sizeof(struct iphdr);
>> +
> 
> ... never used
> 
>> +done:
>> +	*key_size = size;
> 
> and over-written here.
>
> 
> what is going on here, is that this is spaghetti code
> misusing goto for if instructions which obscures the flow.
>

> It should be if (fs->flow_type != ETHER_FLOW) {
> 
> 	... rest of code ...
> }
> 
> and then it will be clear doing *key_size = size once is enough.
>

Done
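
The restructuring requested above could look roughly like the sketch below. It is a userspace illustration only: enum flow_kind and the two header-length macros are stand-ins invented for this example, not the driver's types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the flow types and header sizes in the thread. */
enum flow_kind { FLOW_ETHER, FLOW_IPV4 };
#define ETH_HDR_LEN  14u
#define IPV4_HDR_LEN 20u

static void calculate_flow_sizes_sketch(enum flow_kind kind,
					unsigned int *num_hdrs,
					size_t *key_size)
{
	size_t size = ETH_HDR_LEN;

	*num_hdrs = 1;
	if (kind != FLOW_ETHER) {
		/* every non-Ethernet flow carries one more header */
		++(*num_hdrs);
		if (kind == FLOW_IPV4)
			size += IPV4_HDR_LEN;
	}
	*key_size = size;	/* assigned exactly once, no label needed */
}
```

The nested if keeps the early-exit behaviour of the goto version while making it visible that *key_size has a single writer.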

> 
>>  	/*

>> +
>> +	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
>> +	    fs->h_u.usr_ip4_spec.tos ||
>> +	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
>> +		return -EOPNOTSUPP;
> 
> So include/uapi/linux/ethtool.h says:
> 
>  * struct ethtool_usrip4_spec - general flow specification for IPv4
>  * @ip4src: Source host
>  * @ip4dst: Destination host
>  * @l4_4_bytes: First 4 bytes of transport (layer 4) header
>  * @tos: Type-of-service
>  * @ip_ver: Value must be %ETH_RX_NFC_IP4; mask must be 0
>  * @proto: Transport protocol number; mask must be 0
> 
> I guess this ETH_RX_NFC_IP4 check validates that userspace follows this
> documentation? But then shouldn't you check the mask
> as well? and mask for proto?
> 
Done
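
As a sketch of what checking both value and mask might look like for IP_USER flows: struct usrip4_sketch below mirrors the fields listed in the ethtool_usrip4_spec documentation quoted above, SKETCH_NFC_IP4 stands in for ETH_RX_NFC_IP4, and the choice of -EINVAL for a non-zero mask is an assumption, not something the thread settles.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_NFC_IP4 1	/* stand-in for ETH_RX_NFC_IP4 */

/* Userspace stand-in mirroring the documented ethtool_usrip4_spec fields. */
struct usrip4_sketch {
	uint32_t ip4src;
	uint32_t ip4dst;
	uint32_t l4_4_bytes;
	uint8_t tos;
	uint8_t ip_ver;
	uint8_t proto;
};

/* Reject unsupported match fields in the value, then reject masks that the
 * documentation says must be zero (ip_ver and proto). */
static int validate_usrip4_sketch(const struct usrip4_sketch *val,
				  const struct usrip4_sketch *mask)
{
	if (val->l4_4_bytes || val->tos || val->ip_ver != SKETCH_NFC_IP4)
		return -EOPNOTSUPP;
	if (mask->ip_ver || mask->proto)
		return -EINVAL;	/* error code chosen for illustration */
	return 0;
}
```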


> 
> 
> 

>> -	setup_eth_hdr_key_mask(selector, key, fs);
>> +	setup_eth_hdr_key_mask(selector, key, fs, num_hdrs);
>> +	if (num_hdrs == 1)
>> +		goto validate;
> 
> 
> Please stop abusing goto's for if.
> this is not error handling, not breaking out of loops ...
> 
> 
> please do not.
> 

Done

> 
>> +
>> +	selector = next_selector(selector);
>> +
>> +	err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
>> +	if (err)
>> +		goto err_classifier;
>>  
>> +validate:
>>  	err = validate_classifier_selectors(ff, classifier, num_hdrs);
>>  	if (err)
>>  		goto err_key;
>> -- 
>> 2.50.1
> 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-19  7:03     ` Dan Jurgens
@ 2025-11-19  7:06       ` Michael S. Tsirkin
  2025-11-19  7:17         ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:06 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Wed, Nov 19, 2025 at 01:03:36AM -0600, Dan Jurgens wrote:
> On 11/18/25 3:31 PM, Michael S. Tsirkin wrote:
> > On Tue, Nov 18, 2025 at 08:38:59AM -0600, Daniel Jurgens wrote:
> >> Add support for IP_USER type rules from ethtool.
> >>
> >> +static void parse_ip4(struct iphdr *mask, struct iphdr *key,
> >> +		      const struct ethtool_rx_flow_spec *fs)
> >> +{
> >> +	const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
> >> +	const struct ethtool_usrip4_spec *l3_val  = &fs->h_u.usr_ip4_spec;
> >> +
> >> +	mask->saddr = l3_mask->ip4src;
> >> +	mask->daddr = l3_mask->ip4dst;
> >> +	key->saddr = l3_val->ip4src;
> >> +	key->daddr = l3_val->ip4dst;
> >> +
> >> +	if (l3_mask->proto) {
> > 
> > you seem to check mask for proto here but the ethtool_usrip4_spec doc
> > seems to say the mask for proto must be 0. 
> > 
> > 
> > what gives?
> > 
> 
> Then for user_ip flows ethtool should provide 0 as the mask, and based
> on your comment below I'm verifying that.

but if it does then how did this patch work in your testing?

> I can move this hunk to the TCP/UDP patch if you prefer.


not sure what you mean so I can't comment on that.
generally it's best to add code in the same patch where
it's used - easier to review.

-- 
MST


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-19  7:06       ` Michael S. Tsirkin
@ 2025-11-19  7:17         ` Dan Jurgens
  2025-11-19  7:20           ` Michael S. Tsirkin
  0 siblings, 1 reply; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  7:17 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 1:06 AM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:03:36AM -0600, Dan Jurgens wrote:
>> On 11/18/25 3:31 PM, Michael S. Tsirkin wrote:
>>> On Tue, Nov 18, 2025 at 08:38:59AM -0600, Daniel Jurgens wrote:
>>>> Add support for IP_USER type rules from ethtool.
>>>>
>>>> +static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>>>> +		      const struct ethtool_rx_flow_spec *fs)
>>>> +{
>>>> +	const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
>>>> +	const struct ethtool_usrip4_spec *l3_val  = &fs->h_u.usr_ip4_spec;
>>>> +
>>>> +	mask->saddr = l3_mask->ip4src;
>>>> +	mask->daddr = l3_mask->ip4dst;
>>>> +	key->saddr = l3_val->ip4src;
>>>> +	key->daddr = l3_val->ip4dst;
>>>> +
>>>> +	if (l3_mask->proto) {
>>>
>>> you seem to check mask for proto here but the ethtool_usrip4_spec doc
>>> seems to say the mask for proto must be 0. 
>>>
>>>
>>> what gives?
>>>
>>
>> Then for user_ip flows ethtool should provide 0 as the mask, and based
>> on your comment below I'm verifying that.
> 
> but if it does then how did this patch work in your testing?

Why wouldn't it work? For IP only flows the proto field is not relevant.
It only filters on IP address, not port.

> 
>> I can move this hunk to the TCP/UDP patch if you prefer.
> 
> 
> not sure what you mean so I can't comment on that.
> generally it's best to add code in the same patch where
> it's used - easier to review.
> 

the l3_mask->proto will only be set for TCP/UDP flows.


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-19  6:35       ` Michael S. Tsirkin
@ 2025-11-19  7:18         ` Dan Jurgens
  2025-11-19  7:23           ` Michael S. Tsirkin
  0 siblings, 1 reply; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  7:18 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 12:35 AM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 12:26:23AM -0600, Dan Jurgens wrote:
>> On 11/18/25 3:55 PM, Michael S. Tsirkin wrote:
>>> On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
>>>> Classifiers can be used by more than one rule. If there is an existing
>>>> classifier, use it instead of creating a new one.
>>
>>>> +	struct virtnet_classifier *tmp;
>>>> +	unsigned long i;
>>>>  	int err;
>>>>  
>>>> -	err = xa_alloc(&ff->classifiers, &c->id, c,
>>>> +	xa_for_each(&ff->classifiers, i, tmp) {
>>>> +		if ((*c)->size == tmp->size &&
>>>> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
>>>
>>> note that classifier has padding bytes.
>>> comparing these with memcmp is not safe, is it?
>>
>> The reserved bytes are set to 0, this is fine.
> 
> I mean the compiler padding.  set to 0 where?

There's no compiler padding in virtio_net_ff_selector. There are
reserved fields between the count and selector array.

> 
>>>
>>>
>>>> +			refcount_inc(&tmp->refcount);
>>>> +			kfree(*c);
>>>> +			*c = tmp;
>>>> +			goto out;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
>>>>  		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
>>>>  		       GFP_KERNEL);
>>>>  	if (err)
>>>
>>> what kind of locking prevents two threads racing in this code?
>>
>> The ethtool calls happen under rtnl_lock.
>>
>>>
>>>
>>>> @@ -6932,29 +6945,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
>>>>  		      (*c)->size);
>>>>  	if (err)
>>>>  		goto err_xarray;
>>>>  
>>>> +	refcount_set(&(*c)->refcount, 1);
>>>
>>>
>>> so you insert uninitialized refcount? can't another thread find it
>>> meanwhile?
>>
>> Again, rtnl_lock.
>>
>>
>>>>  
>>>>  	err = insert_rule(ff, eth_rule, c->id, key, key_size);
>>>>  	if (err) {
>>>>  		/* destroy_classifier will free the classifier */
>>>
>>> will free is no longer correct, is it?
>>
>> Clarified the comment.
>>
>>>
>>>> -		destroy_classifier(ff, c->id);
>>>> +		try_destroy_classifier(ff, c->id);
>>>>  		goto err_key;
>>>>  	}
>>>>  
>>>> -- 
>>>> 2.50.1
>>>
> 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-19  7:17         ` Dan Jurgens
@ 2025-11-19  7:20           ` Michael S. Tsirkin
  0 siblings, 0 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:20 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Wed, Nov 19, 2025 at 01:17:28AM -0600, Dan Jurgens wrote:
> On 11/19/25 1:06 AM, Michael S. Tsirkin wrote:
> > On Wed, Nov 19, 2025 at 01:03:36AM -0600, Dan Jurgens wrote:
> >> On 11/18/25 3:31 PM, Michael S. Tsirkin wrote:
> >>> On Tue, Nov 18, 2025 at 08:38:59AM -0600, Daniel Jurgens wrote:
> >>>> Add support for IP_USER type rules from ethtool.
> >>>>
> >>>> +static void parse_ip4(struct iphdr *mask, struct iphdr *key,
> >>>> +		      const struct ethtool_rx_flow_spec *fs)
> >>>> +{
> >>>> +	const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
> >>>> +	const struct ethtool_usrip4_spec *l3_val  = &fs->h_u.usr_ip4_spec;
> >>>> +
> >>>> +	mask->saddr = l3_mask->ip4src;
> >>>> +	mask->daddr = l3_mask->ip4dst;
> >>>> +	key->saddr = l3_val->ip4src;
> >>>> +	key->daddr = l3_val->ip4dst;
> >>>> +
> >>>> +	if (l3_mask->proto) {
> >>>
> >>> you seem to check mask for proto here but the ethtool_usrip4_spec doc
> >>> seems to say the mask for proto must be 0. 
> >>>
> >>>
> >>> what gives?
> >>>
> >>
> >> Then for user_ip flows ethtool should provide 0 as the mask, and based
> >> on your comment below I'm verifying that.
> > 
> > but if it does then how did this patch work in your testing?
> 
> Why wouldn't it work? For IP only flows the proto field is not relevant.
> It only filters on IP address, not port.

I mean it's dead code with mask 0.

> > 
> >> I can move this hunk to the TCP/UDP patch if you prefer.
> > 
> > 
> > not sure what you mean so I can't comment on that.
> > generally it's best to add code in the same patch where
> > it's used - easier to review.
> > 
> 
> the l3_mask->proto will only be set for TCP/UDP flows.

I'd move it there in that case then.

-- 
MST


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 04/12] virtio: Expose object create and destroy API
  2025-11-19  6:39       ` Michael S. Tsirkin
@ 2025-11-19  7:21         ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  7:21 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 12:39 AM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 09:29:05PM -0600, Dan Jurgens wrote:
>> On 11/18/25 4:14 PM, Michael S. Tsirkin wrote:
>>> On Tue, Nov 18, 2025 at 08:38:54AM -0600, Daniel Jurgens wrote:
>>
>>>> +int virtio_admin_obj_destroy(struct virtio_device *vdev,
>>>> +			     u16 obj_type,
>>>> +			     u32 obj_id,
>>>> +			     u16 group_type,
>>>> +			     u64 group_member_id)
>>>
>>> what's the point of making it int when none of the callers
>>> check the return type?
>>>
>>
>> It's an API, and return codes are available. I don't have a use for them
>> in this series but perhaps a future user will.
> 
> For starters let's address the existing use which wants it to never fail.
> I would say do something with an error inside the function.
> Maybe just WARN_ON_ONCE.
> 

I can add that; in my case there's no recourse if it fails anyway.
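
A rough userspace approximation of handling the error inside the destroy path, assuming the caller cannot act on it. warn_on_once_sketch() only imitates the report-once behaviour of the kernel's WARN_ON_ONCE, and backend_destroy() is a hypothetical stand-in for the admin command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned int warnings_emitted;	/* counter kept for demonstration */

/* Report the first failure seen, stay silent afterwards, return the
 * condition so callers may still branch on it. */
static bool warn_on_once_sketch(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warnings_emitted++;
		fprintf(stderr, "warning: %s\n", what);
	}
	return cond;
}

/* Hypothetical backend: fails for negative object ids. */
static int backend_destroy(int obj_id)
{
	return obj_id < 0 ? -1 : 0;
}

/* Destroy that never reports failure to the caller; it warns instead. */
static void obj_destroy_sketch(int obj_id)
{
	warn_on_once_sketch(backend_destroy(obj_id) != 0,
			    "object destroy failed");
}
```

The void return type documents that callers have no recovery path, while the single warning still leaves a trace when the device misbehaves.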


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-19  7:18         ` Dan Jurgens
@ 2025-11-19  7:23           ` Michael S. Tsirkin
  2025-11-19  7:33             ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:23 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Wed, Nov 19, 2025 at 01:18:56AM -0600, Dan Jurgens wrote:
> On 11/19/25 12:35 AM, Michael S. Tsirkin wrote:
> > On Wed, Nov 19, 2025 at 12:26:23AM -0600, Dan Jurgens wrote:
> >> On 11/18/25 3:55 PM, Michael S. Tsirkin wrote:
> >>> On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
> >>>> Classifiers can be used by more than one rule. If there is an existing
> >>>> classifier, use it instead of creating a new one.
> >>
> >>>> +	struct virtnet_classifier *tmp;
> >>>> +	unsigned long i;
> >>>>  	int err;
> >>>>  
> >>>> -	err = xa_alloc(&ff->classifiers, &c->id, c,
> >>>> +	xa_for_each(&ff->classifiers, i, tmp) {
> >>>> +		if ((*c)->size == tmp->size &&
> >>>> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
> >>>
> >>> note that classifier has padding bytes.
> >>> comparing these with memcmp is not safe, is it?
> >>
> >> The reserved bytes are set to 0, this is fine.
> > 
> > I mean the compiler padding.  set to 0 where?
> 
> There's no compiler padding in virtio_net_ff_selector. There are
> reserved fields between the count and selector array.

I might be missing something here, but aren't the structures this code
compares of type struct virtnet_classifier, not
virtio_net_ff_selector?

and that one is:

 struct virtnet_classifier {
        size_t size;
+       refcount_t refcount;
        u32 id;
        struct virtio_net_resource_obj_ff_classifier classifier;
 };


which seems to have some padding depending on the architecture.


> > 
> >>>
> >>>
> >>>> +			refcount_inc(&tmp->refcount);
> >>>> +			kfree(*c);
> >>>> +			*c = tmp;
> >>>> +			goto out;
> >>>> +		}
> >>>> +	}
> >>>> +
> >>>> +	err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
> >>>>  		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
> >>>>  		       GFP_KERNEL);
> >>>>  	if (err)
> >>>
> >>> what kind of locking prevents two threads racing in this code?
> >>
> >> The ethtool calls happen under rtnl_lock.
> >>
> >>>
> >>>
> >>>> @@ -6932,29 +6945,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> >>>>  		      (*c)->size);
> >>>>  	if (err)
> >>>>  		goto err_xarray;
> >>>>  
> >>>> +	refcount_set(&(*c)->refcount, 1);
> >>>
> >>>
> >>> so you insert uninitialized refcount? can't another thread find it
> >>> meanwhile?
> >>
> >> Again, rtnl_lock.
> >>
> >>
> >>>>  
> >>>>  	err = insert_rule(ff, eth_rule, c->id, key, key_size);
> >>>>  	if (err) {
> >>>>  		/* destroy_classifier will free the classifier */
> >>>
> >>> will free is no longer correct, is it?
> >>
> >> Clarified the comment.
> >>
> >>>
> >>>> -		destroy_classifier(ff, c->id);
> >>>> +		try_destroy_classifier(ff, c->id);
> >>>>  		goto err_key;
> >>>>  	}
> >>>>  
> >>>> -- 
> >>>> 2.50.1
> >>>
> > 


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-19  7:23           ` Michael S. Tsirkin
@ 2025-11-19  7:33             ` Dan Jurgens
  2025-11-19  7:41               ` Michael S. Tsirkin
  0 siblings, 1 reply; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  7:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 1:23 AM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:18:56AM -0600, Dan Jurgens wrote:
>> On 11/19/25 12:35 AM, Michael S. Tsirkin wrote:
>>> On Wed, Nov 19, 2025 at 12:26:23AM -0600, Dan Jurgens wrote:
>>>> On 11/18/25 3:55 PM, Michael S. Tsirkin wrote:
>>>>> On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
>>>>>> Classifiers can be used by more than one rule. If there is an existing
>>>>>> classifier, use it instead of creating a new one.
>>>>
>>>>>> +	struct virtnet_classifier *tmp;
>>>>>> +	unsigned long i;
>>>>>>  	int err;
>>>>>>  
>>>>>> -	err = xa_alloc(&ff->classifiers, &c->id, c,
>>>>>> +	xa_for_each(&ff->classifiers, i, tmp) {
>>>>>> +		if ((*c)->size == tmp->size &&
>>>>>> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
>>>>>
>>>>> note that classifier has padding bytes.
>>>>> comparing these with memcmp is not safe, is it?
>>>>
>>>> The reserved bytes are set to 0, this is fine.
>>>
>>> I mean the compiler padding.  set to 0 where?
>>
>> There's no compiler padding in virtio_net_ff_selector. There are
>> reserved fields between the count and selector array.
> 
> I might be missing something here, but are not the
> structures this code compares of the type struct virtnet_classifier
> not virtio_net_ff_selector ?
> 
> and that one is:
> 
>  struct virtnet_classifier {
>         size_t size;
> +       refcount_t refcount;
>         u32 id;
>         struct virtio_net_resource_obj_ff_classifier classifier;
>  };
> 
> 
> which seems to have some padding depending on the architecture.

We're only comparing the ->classifier part of that, which is pad free.
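
To illustrate the distinction being argued here, the sketch below compares only a pad-free wire-format member of a wrapper struct. The types are simplified stand-ins: selector_sketch copies the fixed part of struct virtio_net_ff_selector from the quoted header, while classifier_sketch and same_wire_format() are assumptions for this example and ignore the variable-length tail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the fixed part of struct virtio_net_ff_selector. */
struct selector_sketch {
	uint8_t type;
	uint8_t flags;
	uint8_t reserved[2];
	uint8_t length;
	uint8_t reserved1[3];
};

/* Every member is a byte, so the compiler has nothing to align or pad. */
_Static_assert(sizeof(struct selector_sketch) == 8, "unexpected padding");

/* Hypothetical bookkeeping wrapper; its own layout may contain padding,
 * which is why it is never passed to memcmp as a whole. */
struct classifier_sketch {
	size_t size;
	uint32_t refcount;
	uint32_t id;
	struct selector_sketch wire;
};

/* Compare only the wire-format member, whose bytes are all named fields. */
static bool same_wire_format(const struct classifier_sketch *a,
			     const struct classifier_sketch *b)
{
	return a->size == b->size &&
	       !memcmp(&a->wire, &b->wire, sizeof(a->wire));
}
```

The comparison stays reliable only as long as the reserved fields are explicitly zeroed when the classifier is built, which matches the earlier answer in this thread.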

> 
> 
>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering
  2025-11-18 21:45   ` Michael S. Tsirkin
@ 2025-11-19  7:35     ` Dan Jurgens
  2025-11-19  7:44       ` Michael S. Tsirkin
  0 siblings, 1 reply; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19  7:35 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 3:45 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:39:00AM -0600, Daniel Jurgens wrote:
>> Implement support for IPV6_USER_FLOW type rules.
>>
>> Example:
>> $ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
>> Added rule with ID 0
>>
>> The example rule will forward packets with the specified source and
>> destination IP addresses to RX ring 3.
>>
>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> 
> 
> I find it weird that this does not modify setup_eth_hdr_key_mask
> 
> So it still hardcodes ETH_P_IP for all IP flows?
> For IPv6, should it not use ETH_P_IPV6 instead?
> 
> how does it work?
> 

You're right; it works only because of how our controller handles that field. Will fix it.
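
One possible shape of the fix, sketched in userspace: pick the EtherType from the flow's address family instead of hardcoding the IPv4 value. The enum and function names are hypothetical; 0x0800 and 0x86DD are the standard EtherType values for IPv4 and IPv6.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ETH_P_IP   0x0800u	/* IPv4 EtherType */
#define SKETCH_ETH_P_IPV6 0x86DDu	/* IPv6 EtherType */

/* Hypothetical classification of the ethtool flow type by address family. */
enum flow_family { FAMILY_IPV4, FAMILY_IPV6 };

static uint16_t ethertype_for(enum flow_family family)
{
	return family == FAMILY_IPV6 ? SKETCH_ETH_P_IPV6 : SKETCH_ETH_P_IP;
}

/* Store the EtherType in network byte order, as it appears on the wire. */
static void put_ethertype(uint8_t h_proto[2], enum flow_family family)
{
	uint16_t type = ethertype_for(family);

	h_proto[0] = (uint8_t)(type >> 8);
	h_proto[1] = (uint8_t)(type & 0xff);
}
```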

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH net-next v11 02/12] virtio: Add config_op for admin commands
  2025-11-18 14:38 ` [PATCH net-next v11 02/12] virtio: Add config_op for admin commands Daniel Jurgens
@ 2025-11-19  7:36   ` Michael S. Tsirkin
  0 siblings, 0 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:36 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:52AM -0600, Daniel Jurgens wrote:
> This will allow device drivers to issue administration commands.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> 
> ---
> v4: New patch for v4
> ---
>  drivers/virtio/virtio_pci_modern.c | 2 ++
>  include/linux/virtio_config.h      | 6 ++++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> index ff11de5b3d69..acc3f958f96a 100644
> --- a/drivers/virtio/virtio_pci_modern.c
> +++ b/drivers/virtio/virtio_pci_modern.c
> @@ -1236,6 +1236,7 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = {
>  	.get_shm_region  = vp_get_shm_region,
>  	.disable_vq_and_reset = vp_modern_disable_vq_and_reset,
>  	.enable_vq_after_reset = vp_modern_enable_vq_after_reset,
> +	.admin_cmd_exec = vp_modern_admin_cmd_exec,
>  };
>  
>  static const struct virtio_config_ops virtio_pci_config_ops = {
> @@ -1256,6 +1257,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
>  	.get_shm_region  = vp_get_shm_region,
>  	.disable_vq_and_reset = vp_modern_disable_vq_and_reset,
>  	.enable_vq_after_reset = vp_modern_enable_vq_after_reset,
> +	.admin_cmd_exec = vp_modern_admin_cmd_exec,
>  };
>  
>  /* the PCI probing function */
> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> index 16001e9f9b39..19606609254e 100644
> --- a/include/linux/virtio_config.h
> +++ b/include/linux/virtio_config.h
> @@ -108,6 +108,10 @@ struct virtqueue_info {
>   *	Returns 0 on success or error status
>   *	If disable_vq_and_reset is set, then enable_vq_after_reset must also be
>   *	set.
> + * @admin_cmd_exec: Execute an admin VQ command.

This should say (optional), since only PCI implements this so far
and callers check it.
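
What "optional" means for callers can be sketched in userspace as below. This
is only an illustration: the struct and function names (cfg_ops_sketch,
admin_cmd_exec_checked, fake_exec) are hypothetical stand-ins, not the kernel
definitions, and the choice of -EOPNOTSUPP is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for virtio_device / virtio_admin_cmd. */
struct dev_sketch;
struct cmd_sketch { int opcode; };

/* Hypothetical stand-in for virtio_config_ops with one optional op. */
struct cfg_ops_sketch {
	int (*admin_cmd_exec)(struct dev_sketch *dev, struct cmd_sketch *cmd);
};

struct dev_sketch {
	const struct cfg_ops_sketch *ops;
};

/* Call the op only if the transport provides it. */
int admin_cmd_exec_checked(struct dev_sketch *dev, struct cmd_sketch *cmd)
{
	assert(dev && dev->ops);
	if (!dev->ops->admin_cmd_exec)
		return -EOPNOTSUPP;
	return dev->ops->admin_cmd_exec(dev, cmd);
}

/* A transport that implements the op; opcode 0 is treated as invalid. */
int fake_exec(struct dev_sketch *dev, struct cmd_sketch *cmd)
{
	(void)dev;
	return cmd->opcode == 0 ? -EINVAL : 0;
}
```

A transport that leaves the pointer NULL gets a clean error from the wrapper
instead of a NULL dereference.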


> + *	vdev: the virtio_device
> + *	cmd: the command to execute
> + *	Returns 0 on success or error status
>   */
>  struct virtio_config_ops {
>  	void (*get)(struct virtio_device *vdev, unsigned offset,
> @@ -137,6 +141,8 @@ struct virtio_config_ops {
>  			       struct virtio_shm_region *region, u8 id);
>  	int (*disable_vq_and_reset)(struct virtqueue *vq);
>  	int (*enable_vq_after_reset)(struct virtqueue *vq);
> +	int (*admin_cmd_exec)(struct virtio_device *vdev,
> +			      struct virtio_admin_cmd *cmd);
>  };
>  
>  /**
> -- 
> 2.50.1



* Re: [PATCH net-next v11 01/12] virtio_pci: Remove supported_cap size build assert
  2025-11-18 14:38 ` [PATCH net-next v11 01/12] virtio_pci: Remove supported_cap size build assert Daniel Jurgens
@ 2025-11-19  7:38   ` Michael S. Tsirkin
  2025-11-19 14:23     ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:38 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:51AM -0600, Daniel Jurgens wrote:
> The cap ID list can be more than 64 bits. Remove the build assert. Also
> remove caching of the supported caps, it wasn't used.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> 
> ---
> v4: New patch for V4
> v5:
>    - support_caps -> supported_caps (Alok Tiwari)
>    - removed unused variable (test robot)
> ---
>  drivers/virtio/virtio_pci_common.h | 1 -
>  drivers/virtio/virtio_pci_modern.c | 8 +-------
>  2 files changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
> index 8cd01de27baf..fc26e035e7a6 100644
> --- a/drivers/virtio/virtio_pci_common.h
> +++ b/drivers/virtio/virtio_pci_common.h
> @@ -48,7 +48,6 @@ struct virtio_pci_admin_vq {
>  	/* Protects virtqueue access. */
>  	spinlock_t lock;
>  	u64 supported_cmds;
> -	u64 supported_caps;
>  	u8 max_dev_parts_objects;
>  	struct ida dev_parts_ida;
>  	/* Name of the admin queue: avq.$vq_index. */
> diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
> index dd0e65f71d41..ff11de5b3d69 100644
> --- a/drivers/virtio/virtio_pci_modern.c
> +++ b/drivers/virtio/virtio_pci_modern.c
> @@ -304,7 +304,6 @@ virtio_pci_admin_cmd_dev_parts_objects_enable(struct virtio_device *virtio_dev)
>  
>  static void virtio_pci_admin_cmd_cap_init(struct virtio_device *virtio_dev)
>  {
> -	struct virtio_pci_device *vp_dev = to_vp_device(virtio_dev);
>  	struct virtio_admin_cmd_query_cap_id_result *data;
>  	struct virtio_admin_cmd cmd = {};
>  	struct scatterlist result_sg;
> @@ -323,12 +322,7 @@ static void virtio_pci_admin_cmd_cap_init(struct virtio_device *virtio_dev)
>  	if (ret)
>  		goto end;
>  
> -	/* Max number of caps fits into a single u64 */
> -	BUILD_BUG_ON(sizeof(data->supported_caps) > sizeof(u64));
> -
> -	vp_dev->admin_vq.supported_caps = le64_to_cpu(data->supported_caps[0]);
> -
> -	if (!(vp_dev->admin_vq.supported_caps & (1 << VIRTIO_DEV_PARTS_CAP)))
> +	if (!(le64_to_cpu(data->supported_caps[0]) & (1 << VIRTIO_DEV_PARTS_CAP)))
>  		goto end;

It's ok but a better way is

data->supported_caps[0] & cpu_to_le64(1 << VIRTIO_DEV_PARTS_CAP)

giving the compiler a chance to do the byte swap at compile time
on BE.



>  	virtio_pci_admin_cmd_dev_parts_objects_enable(virtio_dev);
> -- 
> 2.50.1



* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-19  7:33             ` Dan Jurgens
@ 2025-11-19  7:41               ` Michael S. Tsirkin
  2025-11-19 15:45                 ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:41 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Wed, Nov 19, 2025 at 01:33:31AM -0600, Dan Jurgens wrote:
> On 11/19/25 1:23 AM, Michael S. Tsirkin wrote:
> > On Wed, Nov 19, 2025 at 01:18:56AM -0600, Dan Jurgens wrote:
> >> On 11/19/25 12:35 AM, Michael S. Tsirkin wrote:
> >>> On Wed, Nov 19, 2025 at 12:26:23AM -0600, Dan Jurgens wrote:
> >>>> On 11/18/25 3:55 PM, Michael S. Tsirkin wrote:
> >>>>> On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
> >>>>>> Classifiers can be used by more than one rule. If there is an existing
> >>>>>> classifier, use it instead of creating a new one.
> >>>>
> >>>>>> +	struct virtnet_classifier *tmp;
> >>>>>> +	unsigned long i;
> >>>>>>  	int err;
> >>>>>>  
> >>>>>> -	err = xa_alloc(&ff->classifiers, &c->id, c,
> >>>>>> +	xa_for_each(&ff->classifiers, i, tmp) {
> >>>>>> +		if ((*c)->size == tmp->size &&
> >>>>>> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
> >>>>>
> >>>>> note that classifier has padding bytes.
> >>>>> comparing these with memcmp is not safe, is it?
> >>>>
> >>>> The reserved bytes are set to 0, this is fine.
> >>>
> >>> I mean the compiler padding.  set to 0 where?
> >>
> >> There's no compiler padding in virtio_net_ff_selector. There are
> >> reserved fields between the count and selector array.
> > 
> > I might be missing something here, but are not the
> > structures this code compares of the type struct virtnet_classifier
> > not virtio_net_ff_selector ?
> > 
> > and that one is:
> > 
> >  struct virtnet_classifier {
> >         size_t size;
> > +       refcount_t refcount;
> >         u32 id;
> >         struct virtio_net_resource_obj_ff_classifier classifier;
> >  };
> > 
> > 
> > which seems to have some padding depending on the architecture.
> 
> We're only comparing the ->classifier part of that, which is pad free.

Oh I see, a classifier has a classifier inside :(

Should be something else, e.g. ff_classifier to avoid confusion I think.

Or resource_obj since it's the resource object. Or even obj.

But


> > 
> > 
> >



* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-18 14:38 ` [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
  2025-11-18 21:55   ` Michael S. Tsirkin
@ 2025-11-19  7:42   ` Michael S. Tsirkin
  2025-11-19  9:01   ` Michael S. Tsirkin
  2 siblings, 0 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:42 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
> Classifiers can be used by more than one rule. If there is an existing
> classifier, use it instead of creating a new one.

Explaining the point of this would be good. Is it to save
device memory?
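
The reuse scheme the commit message describes can be sketched in userspace as
a lookup-or-create with a use count. This is only an illustration under stated
assumptions: the fixed-size table, the plain unsigned counter, and all names
(cls_sketch, cls_get, cls_put) are hypothetical, whereas the quoted patch uses
an xarray and refcount_t.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CLS_SKETCH 4
#define MAX_BODY_SKETCH 16

/* Hypothetical classifier record: payload plus a use count. */
struct cls_sketch {
	size_t size;
	unsigned int refcount;   /* 0 means the slot is free */
	unsigned char body[MAX_BODY_SKETCH];
};

struct cls_table_sketch {
	struct cls_sketch slots[MAX_CLS_SKETCH];
};

/* Return the slot of a matching classifier and take a reference, or
 * create one in a free slot. Returns -1 if the table is full.
 */
int cls_get(struct cls_table_sketch *t, const void *body, size_t size)
{
	int free_slot = -1;

	assert(size <= MAX_BODY_SKETCH);
	for (int i = 0; i < MAX_CLS_SKETCH; i++) {
		struct cls_sketch *c = &t->slots[i];

		if (!c->refcount) {
			if (free_slot < 0)
				free_slot = i;
			continue;
		}
		if (c->size == size && !memcmp(c->body, body, size)) {
			c->refcount++;
			return i;
		}
	}
	if (free_slot < 0)
		return -1;
	t->slots[free_slot].size = size;
	memcpy(t->slots[free_slot].body, body, size);
	t->slots[free_slot].refcount = 1;
	return free_slot;
}

/* Drop one reference; returns 1 when the last user is gone. */
int cls_put(struct cls_table_sketch *t, int id)
{
	assert(id >= 0 && id < MAX_CLS_SKETCH && t->slots[id].refcount);
	return --t->slots[id].refcount == 0;
}
```

Two rules with identical classifier bytes end up sharing one slot, and the
slot is released only when the second rule drops its reference.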

> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4:
>     - Fixed typo in commit message
>     - for (int -> for (
> 
> v8:
>     - Removed unused num_classifiers. Jason Wang
> ---
>  drivers/net/virtio_net.c | 40 +++++++++++++++++++++++++++-------------
>  1 file changed, 27 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index de1a23c71449..f392ea30f2c7 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -299,7 +299,6 @@ struct virtnet_ff {
>  	struct virtio_net_ff_cap_mask_data *ff_mask;
>  	struct virtio_net_ff_actions *ff_actions;
>  	struct xarray classifiers;
> -	int num_classifiers;
>  	struct virtnet_ethtool_ff ethtool;
>  };
>  
> @@ -6827,6 +6826,7 @@ struct virtnet_ethtool_rule {
>  /* The classifier struct must be the last field in this struct */
>  struct virtnet_classifier {
>  	size_t size;
> +	refcount_t refcount;
>  	u32 id;
>  	struct virtio_net_resource_obj_ff_classifier classifier;
>  };
> @@ -6920,11 +6920,24 @@ static bool validate_mask(const struct virtnet_ff *ff,
>  	return false;
>  }
>  
> -static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> +static int setup_classifier(struct virtnet_ff *ff,
> +			    struct virtnet_classifier **c)
>  {
> +	struct virtnet_classifier *tmp;
> +	unsigned long i;
>  	int err;
>  
> -	err = xa_alloc(&ff->classifiers, &c->id, c,
> +	xa_for_each(&ff->classifiers, i, tmp) {
> +		if ((*c)->size == tmp->size &&
> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
> +			refcount_inc(&tmp->refcount);
> +			kfree(*c);
> +			*c = tmp;
> +			goto out;
> +		}
> +	}
> +
> +	err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
>  		       XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
>  		       GFP_KERNEL);
>  	if (err)
> @@ -6932,29 +6945,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
>  
>  	err = virtio_admin_obj_create(ff->vdev,
>  				      VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> -				      c->id,
> +				      (*c)->id,
>  				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
>  				      0,
> -				      &c->classifier,
> -				      c->size);
> +				      &(*c)->classifier,
> +				      (*c)->size);
>  	if (err)
>  		goto err_xarray;
>  
> +	refcount_set(&(*c)->refcount, 1);
> +out:
>  	return 0;
>  
>  err_xarray:
> -	xa_erase(&ff->classifiers, c->id);
> +	xa_erase(&ff->classifiers, (*c)->id);
>  
>  	return err;
>  }
>  
> -static void destroy_classifier(struct virtnet_ff *ff,
> -			       u32 classifier_id)
> +static void try_destroy_classifier(struct virtnet_ff *ff, u32 classifier_id)
>  {
>  	struct virtnet_classifier *c;
>  
>  	c = xa_load(&ff->classifiers, classifier_id);
> -	if (c) {
> +	if (c && refcount_dec_and_test(&c->refcount)) {
>  		virtio_admin_obj_destroy(ff->vdev,
>  					 VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
>  					 c->id,
> @@ -6978,7 +6992,7 @@ static void destroy_ethtool_rule(struct virtnet_ff *ff,
>  				 0);
>  
>  	xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> -	destroy_classifier(ff, eth_rule->classifier_id);
> +	try_destroy_classifier(ff, eth_rule->classifier_id);
>  	kfree(eth_rule);
>  }
>  
> @@ -7159,14 +7173,14 @@ static int build_and_insert(struct virtnet_ff *ff,
>  	if (err)
>  		goto err_key;
>  
> -	err = setup_classifier(ff, c);
> +	err = setup_classifier(ff, &c);
>  	if (err)
>  		goto err_classifier;
>  
>  	err = insert_rule(ff, eth_rule, c->id, key, key_size);
>  	if (err) {
>  		/* destroy_classifier will free the classifier */
> -		destroy_classifier(ff, c->id);
> +		try_destroy_classifier(ff, c->id);
>  		goto err_key;
>  	}
>  
> -- 
> 2.50.1



* Re: [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering
  2025-11-19  7:35     ` Dan Jurgens
@ 2025-11-19  7:44       ` Michael S. Tsirkin
  0 siblings, 0 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:44 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Wed, Nov 19, 2025 at 01:35:37AM -0600, Dan Jurgens wrote:
> On 11/18/25 3:45 PM, Michael S. Tsirkin wrote:
> > On Tue, Nov 18, 2025 at 08:39:00AM -0600, Daniel Jurgens wrote:
> >> Implement support for IPV6_USER_FLOW type rules.
> >>
> >> Example:
> >> $ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
> >> Added rule with ID 0
> >>
> >> The example rule will forward packets with the specified source and
> >> destination IP addresses to RX ring 3.
> >>
> >> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> >> Reviewed-by: Parav Pandit <parav@nvidia.com>
> >> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> >> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> > 
> > 
> > I find it weird that this does not modify setup_eth_hdr_key_mask
> > 
> > So it still hardcodes ETH_P_IP for all IP flows?
> > For IPv6, should it not use ETH_P_IPV6 instead?
> > 
> > how does it work?
> > 
> 
> You're right, it works because our controller uses that field. Will fix it.

you mean does not use.



* Re: [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps
  2025-11-18 14:38 ` [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
  2025-11-18 22:06   ` Michael S. Tsirkin
  2025-11-18 23:03   ` Michael S. Tsirkin
@ 2025-11-19  7:53   ` Michael S. Tsirkin
  2025-11-19 14:47     ` Dan Jurgens
  2025-11-19  7:55   ` Michael S. Tsirkin
  3 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:53 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:55AM -0600, Daniel Jurgens wrote:
> +/**
> + * struct virtio_net_ff_cap_data - Flow filter resource capability limits
> + * @groups_limit: maximum number of flow filter groups supported by the device
> + * @classifiers_limit: maximum number of classifiers supported by the device
> + * @rules_limit: maximum number of rules supported device-wide across all groups
> + * @rules_per_group_limit: maximum number of rules allowed in a single group
> + * @last_rule_priority: priority value associated with the lowest-priority rule
> + * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
> + *
> + * The limits are reported by the device and describe resource capacities for
> + * flow filters.

This sentence adds nothing of substance.
Pls don't add fluff like this in comments.

> Multi-byte fields are little-endian.


You do not really need to say "Multi-byte fields are little-endian."
do you? It says __le explicitly. Same applies to all structures.

> + */
> +struct virtio_net_ff_cap_data {
> +	__le32 groups_limit;
> +	__le32 classifiers_limit;
> +	__le32 rules_limit;
> +	__le32 rules_per_group_limit;
> +	__u8 last_rule_priority;
> +	__u8 selectors_per_classifier_limit;
> +};

so the compiler adds 2 bytes of padding here. The bug is
in the spec.

I think this happens to work for people because controllers
either also added 2 bytes of padding here at the end,
or they report a shorter structure and
the spec says commands can be truncated.
So I think we can just add 2 bytes of padding at the end
and it will be harmless.

It is a spec extension, but a minor one.
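
The padding can be made visible with a small userspace sketch. It assumes a
common ABI where a 32-bit integer is 4-byte aligned (some architectures align
it to 2 bytes, in which case the numbers differ). The struct names and the
reserved[2] field are hypothetical, and plain fixed-width types replace
__le32/__u8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same field layout as the quoted struct, with plain fixed-width types. */
struct ff_cap_data_implicit {
	uint32_t groups_limit;
	uint32_t classifiers_limit;
	uint32_t rules_limit;
	uint32_t rules_per_group_limit;
	uint8_t last_rule_priority;
	uint8_t selectors_per_classifier_limit;
};

/* Hypothetical variant with the tail padding spelled out. */
struct ff_cap_data_explicit {
	uint32_t groups_limit;
	uint32_t classifiers_limit;
	uint32_t rules_limit;
	uint32_t rules_per_group_limit;
	uint8_t last_rule_priority;
	uint8_t selectors_per_classifier_limit;
	uint8_t reserved[2];
};

/* The named fields end at byte 18 regardless of tail padding. */
#define FIELDS_END_SKETCH \
	(offsetof(struct ff_cap_data_implicit, selectors_per_classifier_limit) + 1)

static_assert(FIELDS_END_SKETCH == 18, "named fields end at byte 18");
static_assert(sizeof(struct ff_cap_data_explicit) == 20,
	      "explicit layout is 20 bytes with no hidden padding");

/* Bytes the compiler appended after the last named field. */
size_t tail_padding_bytes(void)
{
	return sizeof(struct ff_cap_data_implicit) - FIELDS_END_SKETCH;
}
```

With the padding written out as a reserved field, the size no longer depends
on what the compiler decides to append.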


-- 
MST



* Re: [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps
  2025-11-18 14:38 ` [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
                     ` (2 preceding siblings ...)
  2025-11-19  7:53   ` Michael S. Tsirkin
@ 2025-11-19  7:55   ` Michael S. Tsirkin
  3 siblings, 0 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  7:55 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:55AM -0600, Daniel Jurgens wrote:
> + * @selectors: array of supported selector descriptors

I would document their actual structure here.

> + */
> +struct virtio_net_ff_cap_mask_data {
> +	__u8 count;
> +	__u8 reserved[7];
> +	__u8 selectors[];
> +};



* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-18 14:38 ` [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
  2025-11-18 21:55   ` Michael S. Tsirkin
  2025-11-19  7:42   ` Michael S. Tsirkin
@ 2025-11-19  9:01   ` Michael S. Tsirkin
  2 siblings, 0 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  9:01 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
> Classifiers can be used by more than one rule. If there is an existing
> classifier, use it instead of creating a new one.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4:
>     - Fixed typo in commit message
>     - for (int -> for (
> 
> v8:
>     - Removed unused num_classifiers. Jason Wang
> ---
>  drivers/net/virtio_net.c | 40 +++++++++++++++++++++++++++-------------
>  1 file changed, 27 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index de1a23c71449..f392ea30f2c7 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -299,7 +299,6 @@ struct virtnet_ff {
>  	struct virtio_net_ff_cap_mask_data *ff_mask;
>  	struct virtio_net_ff_actions *ff_actions;
>  	struct xarray classifiers;
> -	int num_classifiers;
>  	struct virtnet_ethtool_ff ethtool;
>  };
>  
> @@ -6827,6 +6826,7 @@ struct virtnet_ethtool_rule {
>  /* The classifier struct must be the last field in this struct */
>  struct virtnet_classifier {
>  	size_t size;
> +	refcount_t refcount;
>  	u32 id;
>  	struct virtio_net_resource_obj_ff_classifier classifier;
>  };


BTW if you are going to use refcount_t you should include
refcount.h not rely on some other header to pull it in for you.

-- 
MST



* Re: [PATCH net-next v11 11/12] virtio_net: Add support for TCP and UDP ethtool rules
  2025-11-18 14:39 ` [PATCH net-next v11 11/12] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
@ 2025-11-19  9:14   ` Michael S. Tsirkin
  2025-11-19 16:07     ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  9:14 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:39:01AM -0600, Daniel Jurgens wrote:
> @@ -7167,6 +7261,10 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
>  	case ETHER_FLOW:
>  	case IP_USER_FLOW:
>  	case IPV6_USER_FLOW:
> +	case TCP_V4_FLOW:
> +	case TCP_V6_FLOW:
> +	case UDP_V4_FLOW:
> +	case UDP_V6_FLOW:
>  		return true;
>  	}
>  

it kinda looks like you are sending flow filter rules to
the device ignoring what it reported as supported through
VIRTIO_NET_FF_SELECTOR_CAP

Is that right?

The spec does not say what happens in such a case.

Parav what is your take? is the implication that driver
must only send supported rules?
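
If the answer is that the driver must only send supported rules, the check
could look roughly like this userspace sketch. Everything here is an
assumption made for illustration: the selector type numbers, the
sel_caps_sketch view of the reported capability, and the function names are
hypothetical and do not come from the spec or the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical selector type numbers, for illustration only. */
enum sel_type_sketch { SEL_ETH = 1, SEL_IPV4, SEL_IPV6, SEL_TCP, SEL_UDP };

/* Hypothetical view of what the device reported as supported. */
struct sel_caps_sketch {
	size_t count;
	const uint8_t *types;
};

/* True if the device reported this selector type. */
bool sel_supported(const struct sel_caps_sketch *caps, uint8_t type)
{
	for (size_t i = 0; i < caps->count; i++)
		if (caps->types[i] == type)
			return true;
	return false;
}

/* Accept a rule only if every selector it needs was reported. */
bool rule_supported(const struct sel_caps_sketch *caps,
		    const uint8_t *needed, size_t n)
{
	assert(caps && needed);
	for (size_t i = 0; i < n; i++)
		if (!sel_supported(caps, needed[i]))
			return false;
	return true;
}
```

A flow type such as tcp6 would then be rejected up front when the device did
not report the IPv6 or TCP selector, rather than being sent and left to
unspecified device behavior.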

-- 
MST



* Re: [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-18 21:31   ` Michael S. Tsirkin
  2025-11-19  7:03     ` Dan Jurgens
@ 2025-11-19  9:18     ` Michael S. Tsirkin
  2025-11-19 16:33       ` Dan Jurgens
  1 sibling, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  9:18 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 04:31:09PM -0500, Michael S. Tsirkin wrote:
> > +static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
> > +			     u8 *key,
> > +			     const struct ethtool_rx_flow_spec *fs)
> > +{
> > +	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
> > +	struct iphdr *v4_k = (struct iphdr *)key;
> > +
> > +	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> > +	selector->length = sizeof(struct iphdr);
> > +
> > +	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> > +	    fs->h_u.usr_ip4_spec.tos ||
> > +	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> > +		return -EOPNOTSUPP;
> 
> So include/uapi/linux/ethtool.h says:
> 
>  * struct ethtool_usrip4_spec - general flow specification for IPv4
>  * @ip4src: Source host
>  * @ip4dst: Destination host
>  * @l4_4_bytes: First 4 bytes of transport (layer 4) header
>  * @tos: Type-of-service
>  * @ip_ver: Value must be %ETH_RX_NFC_IP4; mask must be 0
>  * @proto: Transport protocol number; mask must be 0
> 
> I guess this ETH_RX_NFC_IP4 check validates that userspace follows this
> documentation? But then shouldn't you check the mask
> as well? and mask for proto?
> 
> 
> 

in fact, what if e.g. tos is 0 but mask is non-zero? should not
this be rejected, too?
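
Checking the mask alongside the value could be sketched as follows. This is a
userspace illustration under assumptions: usrip4_sketch is a hypothetical
mirror of the fields in the quoted kernel-doc, IP4_VER_SKETCH stands in for
ETH_RX_NFC_IP4, and the split between -EOPNOTSUPP and -EINVAL is a guess, not
something the thread settles.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define IP4_VER_SKETCH 1 /* stand-in for ETH_RX_NFC_IP4 */

/* Hypothetical mirror of the fields listed in the quoted kernel-doc. */
struct usrip4_sketch {
	uint32_t ip4src;
	uint32_t ip4dst;
	uint32_t l4_4_bytes;
	uint8_t tos;
	uint8_t ip_ver;
	uint8_t proto;
};

/* Check both the value (h) and the mask (m) of each restricted field. */
int check_usrip4(const struct usrip4_sketch *h, const struct usrip4_sketch *m)
{
	assert(h && m);
	/* Fields this sketch cannot match on: reject a value or a mask. */
	if (h->l4_4_bytes || m->l4_4_bytes || h->tos || m->tos)
		return -EOPNOTSUPP;
	/* Per the quoted documentation: fixed value, mask must be 0. */
	if (h->ip_ver != IP4_VER_SKETCH || m->ip_ver)
		return -EINVAL;
	/* Per the quoted documentation: proto mask must be 0. */
	if (m->proto)
		return -EINVAL;
	return 0;
}
```

With this shape, a spec whose tos value is 0 but whose tos mask is non-zero is
rejected the same way as a non-zero tos value.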



* Re: [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules
  2025-11-18 19:01   ` Michael S. Tsirkin
  2025-11-19  6:07     ` Dan Jurgens
@ 2025-11-19  9:20     ` Michael S. Tsirkin
  1 sibling, 0 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  9:20 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 02:01:27PM -0500, Michael S. Tsirkin wrote:
> > +struct virtnet_ethtool_ff {
> > +	struct xarray rules;
> > +	int    num_rules;
> > +};

btw if you are using xarray you should include linux/xarray.h



* Re: [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules
  2025-11-18 14:38 ` [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
  2025-11-18 19:01   ` Michael S. Tsirkin
@ 2025-11-19  9:26   ` Michael S. Tsirkin
  2025-11-19 16:51     ` Dan Jurgens
  1 sibling, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  9:26 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:57AM -0600, Daniel Jurgens wrote:

...

> +static int insert_rule(struct virtnet_ff *ff,
> +		       struct virtnet_ethtool_rule *eth_rule,
> +		       u32 classifier_id,
> +		       const u8 *key,
> +		       size_t key_size)
> +{
> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
> +	struct virtio_net_resource_obj_ff_rule *ff_rule;
> +	int err;
> +
> +	ff_rule = kzalloc(sizeof(*ff_rule) + key_size, GFP_KERNEL);
> +	if (!ff_rule)
> +		return -ENOMEM;
> +
> +	/* Intentionally leave the priority as 0. All rules have the same
> +	 * priority.
> +	 */
> +	ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
> +	ff_rule->classifier_id = cpu_to_le32(classifier_id);
> +	ff_rule->key_length = (u8)key_size;
> +	ff_rule->action = fs->ring_cookie == RX_CLS_FLOW_DISC ?
> +					     VIRTIO_NET_FF_ACTION_DROP :
> +					     VIRTIO_NET_FF_ACTION_RX_VQ;
> +	ff_rule->vq_index = fs->ring_cookie != RX_CLS_FLOW_DISC ?
> +					       cpu_to_le16(fs->ring_cookie) : 0;
> +	memcpy(&ff_rule->keys, key, key_size);
> +
> +	err = virtio_admin_obj_create(ff->vdev,
> +				      VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
> +				      fs->location,
> +				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
> +				      0,
> +				      ff_rule,
> +				      sizeof(*ff_rule) + key_size);
> +	if (err)
> +		goto err_ff_rule;
> +
> +	eth_rule->classifier_id = classifier_id;
> +	ff->ethtool.num_rules++;
> +	kfree(ff_rule);
> +
> +	return 0;
> +
> +err_ff_rule:
> +	kfree(ff_rule);
> +
> +	return err;
> +}


...

> +static int build_and_insert(struct virtnet_ff *ff,
> +			    struct virtnet_ethtool_rule *eth_rule)
> +{
> +	struct virtio_net_resource_obj_ff_classifier *classifier;
> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
> +	struct virtio_net_ff_selector *selector;
> +	struct virtnet_classifier *c;
> +	size_t classifier_size;
> +	size_t key_size;
> +	int num_hdrs;
> +	u8 *key;
> +	int err;
> +
> +	calculate_flow_sizes(fs, &key_size, &classifier_size, &num_hdrs);
> +
> +	key = kzalloc(key_size, GFP_KERNEL);
> +	if (!key)
> +		return -ENOMEM;

So key is allocated here ...


> +
> +	/*
> +	 * virtio_net_ff_obj_ff_classifier is already included in the
> +	 * classifier_size.
> +	 */
> +	c = kzalloc(classifier_size +
> +		    sizeof(struct virtnet_classifier) -
> +		    sizeof(struct virtio_net_resource_obj_ff_classifier),
> +		    GFP_KERNEL);
> +	if (!c) {
> +		kfree(key);
> +		return -ENOMEM;
> +	}
> +
> +	c->size = classifier_size;
> +	classifier = &c->classifier;
> +	classifier->count = num_hdrs;
> +	selector = (void *)&classifier->selectors[0];
> +
> +	setup_eth_hdr_key_mask(selector, key, fs);
> +
> +	err = validate_classifier_selectors(ff, classifier, num_hdrs);
> +	if (err)
> +		goto err_key;
> +
> +	err = setup_classifier(ff, c);
> +	if (err)
> +		goto err_classifier;
> +
> +	err = insert_rule(ff, eth_rule, c->id, key, key_size);


... copied by insert_rule


> +	if (err) {
> +		/* destroy_classifier will free the classifier */
> +		destroy_classifier(ff, c->id);
> +		goto err_key;
> +	}
> +


... and apparently never freed?


I think it's because the API of insert_rule is confusing...

> +	return 0;
> +
> +err_classifier:
> +	kfree(c);
> +err_key:
> +	kfree(key);
> +
> +	return err;
> +}
> +
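
One way to make the ownership obvious is for the callee to only copy the key
and for the caller to free it on every path. The userspace sketch below
illustrates that shape with malloc-family calls; the function names and the
0xab placeholder byte are hypothetical, and this is not a proposed replacement
for the quoted code.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for insert_rule(): copies the key, never owns it. */
int insert_rule_sketch(uint8_t *dst, size_t dst_size,
		       const uint8_t *key, size_t key_size)
{
	if (key_size > dst_size)
		return -EINVAL;
	memcpy(dst, key, key_size);
	return 0;
}

/* The caller allocated the key, so the caller frees it on every path. */
int build_and_insert_sketch(uint8_t *dst, size_t dst_size, size_t key_size)
{
	uint8_t *key;
	int err;

	assert(key_size > 0);
	key = calloc(1, key_size);
	if (!key)
		return -ENOMEM;

	key[0] = 0xab; /* placeholder for filling in the real key */

	err = insert_rule_sketch(dst, dst_size, key, key_size);

	/* Success and failure both fall through to the free. */
	free(key);
	return err;
}
```

Because the success path and the error path share the same free, there is no
early return that can skip it.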

-- 
MST



* Re: [PATCH net-next v11 06/12] virtio_net: Create a FF group for ethtool steering
  2025-11-18 14:38 ` [PATCH net-next v11 06/12] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
@ 2025-11-19  9:36   ` Michael S. Tsirkin
  2025-11-19 17:14     ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19  9:36 UTC (permalink / raw)
  To: Daniel Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Tue, Nov 18, 2025 at 08:38:56AM -0600, Daniel Jurgens wrote:
> All ethtool steering rules will go in one group, create it during
> initialization.
> 
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4: Documented UAPI
> ---
>  drivers/net/virtio_net.c           | 29 +++++++++++++++++++++++++++++
>  include/uapi/linux/virtio_net_ff.h | 15 +++++++++++++++
>  2 files changed, 44 insertions(+)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 3615f45ac358..900d597726f7 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -284,6 +284,9 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
>  	VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
>  };
>  
> +#define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
> +#define VIRTNET_FF_MAX_GROUPS 1
> +
>  struct virtnet_ff {
>  	struct virtio_device *vdev;
>  	bool ff_supported;
> @@ -6812,6 +6815,7 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  	size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
>  			      sizeof(struct virtio_net_ff_selector) *
>  			      VIRTIO_NET_FF_MASK_TYPE_MAX;
> +	struct virtio_net_resource_obj_ff_group ethtool_group = {};
>  	struct virtio_admin_cmd_query_cap_id_result *cap_id_list;
>  	struct virtio_net_ff_selector *sel;
>  	size_t real_ff_mask_size;
> @@ -6895,6 +6899,12 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  	if (err)
>  		goto err_ff_action;
>  
> +	if (le32_to_cpu(ff->ff_caps->groups_limit) < VIRTNET_FF_MAX_GROUPS) {
> +		err = -ENOSPC;
> +		goto err_ff_action;
> +	}
> +	ff->ff_caps->groups_limit = cpu_to_le32(VIRTNET_FF_MAX_GROUPS);
> +
>  	err = virtio_admin_cap_set(vdev,
>  				   VIRTIO_NET_FF_RESOURCE_CAP,
>  				   ff->ff_caps,
> @@ -6932,6 +6942,19 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
>  	if (err)
>  		goto err_ff_action;
>  
> +	ethtool_group.group_priority = cpu_to_le16(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
> +
> +	/* Use priority for the object ID. */
> +	err = virtio_admin_obj_create(vdev,
> +				      VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
> +				      VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
> +				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
> +				      0,
> +				      &ethtool_group,
> +				      sizeof(ethtool_group));
> +	if (err)
> +		goto err_ff_action;
> +
>  	ff->vdev = vdev;
>  	ff->ff_supported = true;
>  

So this is set here and never cleared.

But we never recreate the group on restore (after suspend).

sounds like we need virtnet_ff_cleanup/virtnet_ff_init on the
suspend/restore path?

> @@ -6959,6 +6982,12 @@ static void virtnet_ff_cleanup(struct virtnet_ff *ff)
>  	if (!ff->ff_supported)
>  		return;
>  
> +	virtio_admin_obj_destroy(ff->vdev,
> +				 VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
> +				 VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
> +				 VIRTIO_ADMIN_GROUP_TYPE_SELF,
> +				 0);
> +
>  	kfree(ff->ff_actions);
>  	kfree(ff->ff_mask);
>  	kfree(ff->ff_caps);
> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> index bd7a194a9959..6d1f953c2b46 100644
> --- a/include/uapi/linux/virtio_net_ff.h
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -12,6 +12,8 @@
>  #define VIRTIO_NET_FF_SELECTOR_CAP 0x801
>  #define VIRTIO_NET_FF_ACTION_CAP 0x802
>  
> +#define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
> +
>  /**
>   * struct virtio_net_ff_cap_data - Flow filter resource capability limits
>   * @groups_limit: maximum number of flow filter groups supported by the device
> @@ -88,4 +90,17 @@ struct virtio_net_ff_actions {
>  	__u8 reserved[7];
>  	__u8 actions[];
>  };
> +
> +/**
> + * struct virtio_net_resource_obj_ff_group - Flow filter group object
> + * @group_priority: priority of the group used to order evaluation
> + *
> + * This structure is the payload for the VIRTIO_NET_RESOURCE_OBJ_FF_GROUP
> + * administrative object. Devices use @group_priority to order flow filter
> + * groups. Multi-byte fields are little-endian.
> + */
> +struct virtio_net_resource_obj_ff_group {
> +	__le16 group_priority;
> +};
> +
>  #endif
> -- 
> 2.50.1



* Re: [PATCH net-next v11 01/12] virtio_pci: Remove supported_cap size build assert
  2025-11-19  7:38   ` Michael S. Tsirkin
@ 2025-11-19 14:23     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 14:23 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 1:38 AM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:51AM -0600, Daniel Jurgens wrote:
>> -	if (!(vp_dev->admin_vq.supported_caps & (1 << VIRTIO_DEV_PARTS_CAP)))
>> +	if (!(le64_to_cpu(data->supported_caps[0]) & (1 << VIRTIO_DEV_PARTS_CAP)))
>>  		goto end;
> 
> It's ok but a better way is
> 
> data->supported_caps[0] & cpu_to_le64(1 << VIRTIO_DEV_PARTS_CAP)
> 
> giving the compiler a chance to do the byte swap at compile time
> on BE.
> 

Done
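
For reference, a minimal userspace sketch of the two forms discussed above. The demo_* macros are stand-ins for the kernel helpers (they assume GCC/Clang builtins), and DEMO_CAP_BIT is a made-up bit index, not the real VIRTIO_DEV_PARTS_CAP value. Both tests give the same answer; the second swaps a constant, which the compiler can fold at build time on big-endian.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-ins for cpu_to_le64()/le64_to_cpu(); assumes GCC/Clang. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define demo_cpu_to_le64(x) __builtin_bswap64((uint64_t)(x))
#else
#define demo_cpu_to_le64(x) ((uint64_t)(x))
#endif
#define demo_le64_to_cpu(x) demo_cpu_to_le64(x)

#define DEMO_CAP_BIT 3 /* hypothetical bit index, for illustration only */

/* Form 1: swap the device data at run time, then test the bit. */
static bool cap_set_swap_data(uint64_t caps_le)
{
	return (demo_le64_to_cpu(caps_le) & (1ULL << DEMO_CAP_BIT)) != 0;
}

/* Form 2: swap the constant instead; the data is used as-is. */
static bool cap_set_swap_const(uint64_t caps_le)
{
	return (caps_le & demo_cpu_to_le64(1ULL << DEMO_CAP_BIT)) != 0;
}

/* Both forms must agree for any input. */
static bool cap_is_set(uint64_t caps_le)
{
	bool a = cap_set_swap_data(caps_le);
	bool b = cap_set_swap_const(caps_le);

	assert(a == b);
	return a;
}
```

The sketch uses 1ULL for the shift so the mask stays 64 bits wide for any bit index below 64.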

> 
> 
>>  	virtio_pci_admin_cmd_dev_parts_objects_enable(virtio_dev);
>> -- 
>> 2.50.1
> 



* Re: [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps
  2025-11-19  7:53   ` Michael S. Tsirkin
@ 2025-11-19 14:47     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 14:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 1:53 AM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:55AM -0600, Daniel Jurgens wrote:
>> +/**
>> + * struct virtio_net_ff_cap_data - Flow filter resource capability limits
>> + * @groups_limit: maximum number of flow filter groups supported by the device
>> + * @classifiers_limit: maximum number of classifiers supported by the device
>> + * @rules_limit: maximum number of rules supported device-wide across all groups
>> + * @rules_per_group_limit: maximum number of rules allowed in a single group
>> + * @last_rule_priority: priority value associated with the lowest-priority rule
>> + * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
>> + *
>> + * The limits are reported by the device and describe resource capacities for
>> + * flow filters.
> 
> This sentence adds nothing of substance.
> Pls don't add fluff like this in comments.
> 
>> Multi-byte fields are little-endian.
> 

Done

> 
> You do not really need to say "Multi-byte fields are little-endian."
> do you? It says __le explicitly. Same applies to all structures.
> 
>> + */
>> +struct virtio_net_ff_cap_data {
>> +	__le32 groups_limit;
>> +	__le32 classifiers_limit;
>> +	__le32 rules_limit;
>> +	__le32 rules_per_group_limit;
>> +	__u8 last_rule_priority;
>> +	__u8 selectors_per_classifier_limit;
>> +};
> 
> so the compiler adds 2 bytes of padding here. The bug is
> in the spec.
> 
> I think this happens to work for people because controllers
> either also added 2 bytes of padding here at the end,
> or they report a shorter structure and
> the spec says commands can be truncated.
> So I think we can just add 2 bytes of padding at the end
> and it will be harmless.
> 
> It is a spec extension, but a minor one.

Done, referenced your spec patch in the change log.
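
A small userspace sketch of the padding issue, assuming a typical ABI where 32-bit integers are 4-byte aligned. The struct names and the reserved[] field name are hypothetical; the real field name depends on the spec patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout as posted: 4 * 4 + 2 = 18 bytes of declared fields. */
struct demo_cap_implicit {
	uint32_t groups_limit;
	uint32_t classifiers_limit;
	uint32_t rules_limit;
	uint32_t rules_per_group_limit;
	uint8_t last_rule_priority;
	uint8_t selectors_per_classifier_limit;
};

/* Same layout with the tail padding spelled out (hypothetical field name). */
struct demo_cap_explicit {
	uint32_t groups_limit;
	uint32_t classifiers_limit;
	uint32_t rules_limit;
	uint32_t rules_per_group_limit;
	uint8_t last_rule_priority;
	uint8_t selectors_per_classifier_limit;
	uint8_t reserved[2];
};

/* 20 is a multiple of every plausible alignment, so this holds everywhere. */
_Static_assert(sizeof(struct demo_cap_explicit) == 20,
	       "explicit layout must be exactly 20 bytes");

/* Bytes the compiler appends after the last declared field. */
static size_t implicit_tail_padding(void)
{
	size_t end = offsetof(struct demo_cap_implicit,
			      selectors_per_classifier_limit) + 1;

	assert(end <= sizeof(struct demo_cap_implicit));
	return sizeof(struct demo_cap_implicit) - end;
}
```

With the explicit field the in-memory size no longer depends on what the compiler chooses to append.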

> 
> 



* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-19  7:41               ` Michael S. Tsirkin
@ 2025-11-19 15:45                 ` Dan Jurgens
  2025-11-19 15:58                   ` Michael S. Tsirkin
  0 siblings, 1 reply; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 15:45 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 1:41 AM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:33:31AM -0600, Dan Jurgens wrote:
>> On 11/19/25 1:23 AM, Michael S. Tsirkin wrote:
>>> On Wed, Nov 19, 2025 at 01:18:56AM -0600, Dan Jurgens wrote:
>>>> On 11/19/25 12:35 AM, Michael S. Tsirkin wrote:
>>>>> On Wed, Nov 19, 2025 at 12:26:23AM -0600, Dan Jurgens wrote:
>>>>>> On 11/18/25 3:55 PM, Michael S. Tsirkin wrote:
>>>>>>> On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
>>>>>>>> Classifiers can be used by more than one rule. If there is an existing
>>>>>>>> classifier, use it instead of creating a new one.
>>>>>>
>>>>>>>> +	struct virtnet_classifier *tmp;
>>>>>>>> +	unsigned long i;
>>>>>>>>  	int err;
>>>>>>>>  
>>>>>>>> -	err = xa_alloc(&ff->classifiers, &c->id, c,
>>>>>>>> +	xa_for_each(&ff->classifiers, i, tmp) {
>>>>>>>> +		if ((*c)->size == tmp->size &&
>>>>>>>> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
>>>>>>>
>>>>>>> note that classifier has padding bytes.
>>>>>>> comparing these with memcmp is not safe, is it?
>>>>>>
>>>>>> The reserved bytes are set to 0, this is fine.
>>>>>
>>>>> I mean the compiler padding.  set to 0 where?
>>>>
>>>> There's no compiler padding in virtio_net_ff_selector. There are
>>>> reserved fields between the count and selector array.
>>>
>>> I might be missing something here, but are not the
>>> structures this code compares of the type struct virtnet_classifier
>>> not virtio_net_ff_selector ?
>>>
>>> and that one is:
>>>
>>>  struct virtnet_classifier {
>>>         size_t size;
>>> +       refcount_t refcount;
>>>         u32 id;
>>>         struct virtio_net_resource_obj_ff_classifier classifier;
>>>  };
>>>
>>>
>>> which seems to have some padding depending on the architecture.
>>
>> We're only comparing the ->classifier part of that, which is pad free.
> 
> Oh I see a classifier has a classifier inside :(
> 
> Should be something else, e.g. ff_classifier to avoid confusion I think.
> 
> Or resource_obj since it's the resource object. Or even obj.
> 
> But
> 

Did you have more to say after that "But"? I did this, also updated the
commit messages and included refcount.h.
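
To illustrate the point about comparing only the embedded object, here is a userspace sketch. The demo_* types are simplified stand-ins, not the driver's structures: the wrapper may carry compiler padding after id, while the embedded object is made of bytes only and has none, so memcmp() over it is well defined once the reserved bytes are zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Wire-format stand-in: only byte-sized members, so no compiler padding. */
struct demo_obj {
	uint8_t count;
	uint8_t reserved[7];
	uint8_t selectors[8];
};

/* Driver-side wrapper: may contain padding between id and obj. */
struct demo_classifier {
	size_t size;   /* number of bytes of obj in use */
	uint32_t id;
	struct demo_obj obj;
};

/* Compare only the pad-free object, never the whole wrapper. */
static bool same_obj(const struct demo_classifier *a,
		     const struct demo_classifier *b)
{
	assert(a->size <= sizeof(a->obj) && b->size <= sizeof(b->obj));
	return a->size == b->size && !memcmp(&a->obj, &b->obj, a->size);
}

/* Build a classifier with zeroed reserved bytes and one selector byte. */
static struct demo_classifier demo_make(uint32_t id, uint8_t first_selector)
{
	struct demo_classifier c;

	memset(&c, 0, sizeof(c));
	c.size = sizeof(c.obj);
	c.id = id;
	c.obj.count = 1;
	c.obj.selectors[0] = first_selector;
	return c;
}
```

Two wrappers with different ids but identical objects compare equal, which is what lets one classifier be shared by several rules.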

> 
>>>
>>>
>>>
> 



* Re: [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible
  2025-11-19 15:45                 ` Dan Jurgens
@ 2025-11-19 15:58                   ` Michael S. Tsirkin
  0 siblings, 0 replies; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19 15:58 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Wed, Nov 19, 2025 at 09:45:15AM -0600, Dan Jurgens wrote:
> On 11/19/25 1:41 AM, Michael S. Tsirkin wrote:
> > On Wed, Nov 19, 2025 at 01:33:31AM -0600, Dan Jurgens wrote:
> >> On 11/19/25 1:23 AM, Michael S. Tsirkin wrote:
> >>> On Wed, Nov 19, 2025 at 01:18:56AM -0600, Dan Jurgens wrote:
> >>>> On 11/19/25 12:35 AM, Michael S. Tsirkin wrote:
> >>>>> On Wed, Nov 19, 2025 at 12:26:23AM -0600, Dan Jurgens wrote:
> >>>>>> On 11/18/25 3:55 PM, Michael S. Tsirkin wrote:
> >>>>>>> On Tue, Nov 18, 2025 at 08:38:58AM -0600, Daniel Jurgens wrote:
> >>>>>>>> Classifiers can be used by more than one rule. If there is an existing
> >>>>>>>> classifier, use it instead of creating a new one.
> >>>>>>
> >>>>>>>> +	struct virtnet_classifier *tmp;
> >>>>>>>> +	unsigned long i;
> >>>>>>>>  	int err;
> >>>>>>>>  
> >>>>>>>> -	err = xa_alloc(&ff->classifiers, &c->id, c,
> >>>>>>>> +	xa_for_each(&ff->classifiers, i, tmp) {
> >>>>>>>> +		if ((*c)->size == tmp->size &&
> >>>>>>>> +		    !memcmp(&tmp->classifier, &(*c)->classifier, tmp->size)) {
> >>>>>>>
> >>>>>>> note that classifier has padding bytes.
> >>>>>>> comparing these with memcmp is not safe, is it?
> >>>>>>
> >>>>>> The reserved bytes are set to 0, this is fine.
> >>>>>
> >>>>> I mean the compiler padding.  set to 0 where?
> >>>>
> >>>> There's no compiler padding in virtio_net_ff_selector. There are
> >>>> reserved fields between the count and selector array.
> >>>
> >>> I might be missing something here, but are not the
> >>> structures this code compares of the type struct virtnet_classifier
> >>> not virtio_net_ff_selector ?
> >>>
> >>> and that one is:
> >>>
> >>>  struct virtnet_classifier {
> >>>         size_t size;
> >>> +       refcount_t refcount;
> >>>         u32 id;
> >>>         struct virtio_net_resource_obj_ff_classifier classifier;
> >>>  };
> >>>
> >>>
> >>> which seems to have some padding depending on the architecture.
> >>
> >> We're only comparing the ->classifier part of that, which is pad free.
> > 
> > Oh I see a classifier has a classifier inside :(
> > 
> > Should be something else, e.g. ff_classifier to avoid confusion I think.
> > 
> > Or resource_obj since it's the resource object. Or even obj.
> > 
> > But
> > 
> 
> Did you have more to say after that "But"?

Ugh ... dunno how this got here.

> I did this, also updated the
> commit messages and included refcount.h.

thanks.

> > 
> >>>
> >>>
> >>>
> > 



* Re: [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering
  2025-11-18 21:48   ` Michael S. Tsirkin
@ 2025-11-19 16:04     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 16:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 3:48 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:39:00AM -0600, Daniel Jurgens wrote:
>> Implement support for IPV6_USER_FLOW type rules.
>>
>> Example:
>> $ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
>> Added rule with ID 0
>>
>> The example rule will forward packets with the specified source and
>> destination IP addresses to RX ring 3.

>> +	if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
>> +		memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
>> +		memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
>> +	}
> 
> so this checks mask then copies but parse_ip4 copies unconditionally?
> why?

Added a similar check in the previous patch.
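
The pattern under discussion, copy the address into the key and mask only when the user supplied a non-zero mask, can be sketched in userspace as below. The demo_* names are hypothetical; the driver uses ipv6_addr_any() on the ethtool spec, and this sketch replaces it with a plain byte comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_IP6_LEN 16

/* True when every byte of the address is zero (the "any" address). */
static bool demo_addr_any(const uint8_t *addr)
{
	static const uint8_t zero[DEMO_IP6_LEN];

	return memcmp(addr, zero, DEMO_IP6_LEN) == 0;
}

/*
 * Copy value and mask only when the user asked to match on this field.
 * Returns true if the key and mask were written.
 */
static bool demo_set_addr(uint8_t *key, uint8_t *mask,
			  const uint8_t *val, const uint8_t *user_mask)
{
	assert(key && mask && val && user_mask);

	if (demo_addr_any(user_mask))
		return false;

	memcpy(mask, user_mask, DEMO_IP6_LEN);
	memcpy(key, val, DEMO_IP6_LEN);
	return true;
}
```

Skipping the copy for an all-zero mask keeps unmatched fields zero in the key, so the IPv4 and IPv6 parsers behave the same way.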

> 


* Re: [PATCH net-next v11 11/12] virtio_net: Add support for TCP and UDP ethtool rules
  2025-11-19  9:14   ` Michael S. Tsirkin
@ 2025-11-19 16:07     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 16:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 3:14 AM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:39:01AM -0600, Daniel Jurgens wrote:
>> @@ -7167,6 +7261,10 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
>>  	case ETHER_FLOW:
>>  	case IP_USER_FLOW:
>>  	case IPV6_USER_FLOW:
>> +	case TCP_V4_FLOW:
>> +	case TCP_V6_FLOW:
>> +	case UDP_V4_FLOW:
>> +	case UDP_V6_FLOW:
>>  		return true;
>>  	}
>>  
> 
> it kinda looks like you are sending flow control rules to
> the device ignoring what it reported as supported through
> VIRTIO_NET_FF_SELECTOR_CAP
> 
> Is that right?
> 
> The spec does not say what happens in such a case.
> 
> Parav what is your take? is the implication that driver
> must only send supported rules?
> 

validate_classifier_selectors checks the classifier we built against the
caps reported.
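
As a rough illustration of what checking a classifier against reported caps could look like, the sketch below tests that every mask bit the driver requests is also set in the mask the device advertised. This is an assumption about the general idea only; demo_mask_within_caps is a hypothetical helper and does not reproduce validate_classifier_selectors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true when the requested mask only uses bits that the
 * capability mask allows (requested is a bitwise subset of the cap).
 */
static bool demo_mask_within_caps(const uint8_t *req_mask,
				  const uint8_t *cap_mask, size_t len)
{
	assert(req_mask && cap_mask);

	for (size_t i = 0; i < len; i++) {
		/* Any requested bit missing from the cap is unsupported. */
		if (req_mask[i] & (uint8_t)~cap_mask[i])
			return false;
	}
	return true;
}
```

A rule that fails such a check would be rejected in the driver before any command reaches the device.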



* Re: [PATCH net-next v11 12/12] virtio_net: Add get ethtool flow rules ops
  2025-11-18 18:49   ` Michael S. Tsirkin
@ 2025-11-19 16:24     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 16:24 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/18/25 12:49 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:39:02AM -0600, Daniel Jurgens wrote:

>> +static int
>> +virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
>> +			      struct ethtool_rxnfc *info, u32 *rule_locs)
>> +{
>> +	struct virtnet_ethtool_rule *eth_rule;
>> +	unsigned long i = 0;
>> +	int idx = 0;
>> +
>> +	if (!ff->ff_supported)
>> +		return -EOPNOTSUPP;
>> +
>> +	xa_for_each(&ff->ethtool.rules, i, eth_rule)
>> +		rule_locs[idx++] = i;
>> +
>> +	info->data = le32_to_cpu(ff->ff_caps->rules_limit);
>> +
>> +	return 0;
>> +}
> 
> So I see
> 
> 
>  * For %ETHTOOL_GRXCLSRLALL, @rule_cnt specifies the array size of the
>  * user buffer for @rule_locs on entry.  On return, @data is the size
>  * of the rule table, @rule_cnt is the number of defined rules, and
>  * @rule_locs contains the locations of the defined rules.  Drivers
>  * must use the second parameter to get_rxnfc() instead of @rule_locs.
>  *
> 
> 
> Should this set @rule_cnt?
> 

Some drivers do, and others don't. I'll take the most conservative
approach and use it as a limit, then set it at the end.

Also left rc uninitialized at the start of virtnet_get_rxnfc per the
comment in a separate email.
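
A userspace sketch of that approach: treat rule_cnt as the capacity of the user buffer on entry, and write back the number of rules found on return. The demo_rxnfc struct mirrors only the two ethtool_rxnfc fields involved, and returning -EMSGSIZE on overflow is an assumption borrowed from what some other drivers do.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-in for the ethtool_rxnfc fields used here. */
struct demo_rxnfc {
	uint32_t rule_cnt; /* in: buffer capacity, out: rules reported */
	uint64_t data;     /* out: size of the rule table */
};

static int demo_get_all_flows(const uint32_t *ids, size_t num_ids,
			      uint32_t table_size,
			      struct demo_rxnfc *info, uint32_t *rule_locs)
{
	uint32_t idx = 0;

	assert(info && (rule_locs || info->rule_cnt == 0));

	for (size_t i = 0; i < num_ids; i++) {
		/* Never write past the capacity the caller announced. */
		if (idx == info->rule_cnt)
			return -EMSGSIZE;
		rule_locs[idx++] = ids[i];
	}

	info->data = table_size;
	info->rule_cnt = idx;
	return 0;
}
```

On the error path the sketch leaves info untouched, so the caller can retry with a larger buffer.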


* Re: [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-19  9:18     ` Michael S. Tsirkin
@ 2025-11-19 16:33       ` Dan Jurgens
  2025-11-19 16:51         ` Michael S. Tsirkin
  0 siblings, 1 reply; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 16:33 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 3:18 AM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 04:31:09PM -0500, Michael S. Tsirkin wrote:
>>> +static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
>>> +			     u8 *key,
>>> +			     const struct ethtool_rx_flow_spec *fs)
>>> +{
>>> +	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
>>> +	struct iphdr *v4_k = (struct iphdr *)key;
>>> +
>>> +	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
>>> +	selector->length = sizeof(struct iphdr);
>>> +
>>> +	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
>>> +	    fs->h_u.usr_ip4_spec.tos ||
>>> +	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
>>> +		return -EOPNOTSUPP;
>>
>> So include/uapi/linux/ethtool.h says:
>>
>>  * struct ethtool_usrip4_spec - general flow specification for IPv4
>>  * @ip4src: Source host
>>  * @ip4dst: Destination host
>>  * @l4_4_bytes: First 4 bytes of transport (layer 4) header
>>  * @tos: Type-of-service
>>  * @ip_ver: Value must be %ETH_RX_NFC_IP4; mask must be 0
>>  * @proto: Transport protocol number; mask must be 0
>>
>> I guess this ETH_RX_NFC_IP4 check validates that userspace follows this
>> documentation? But then shouldn't you check the mask
>> as well? and mask for proto?
>>
>>
>>
> 
> in fact, what if e.g. tos is 0 but mask is non-zero? should not
> this be rejected, too?
> 

Actually the tos check should be removed, there's no guidance it should
be 0, like the other fields. Our hardware doesn't support it, but this
will be caught in validate_classifier_selectors.



* Re: [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules
  2025-11-19  9:26   ` Michael S. Tsirkin
@ 2025-11-19 16:51     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 16:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 3:26 AM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:57AM -0600, Daniel Jurgens wrote:
> 
> ...
> 
>> +static int insert_rule(struct virtnet_ff *ff,
>> +		       struct virtnet_ethtool_rule *eth_rule,
>> +		       u32 classifier_id,
>> +		       const u8 *key,
>> +		       size_t key_size)
>> +{
>> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
>> +	struct virtio_net_resource_obj_ff_rule *ff_rule;
>> +	int err;
>> +
>> +	ff_rule = kzalloc(sizeof(*ff_rule) + key_size, GFP_KERNEL);
>> +	if (!ff_rule)
>> +		return -ENOMEM;
>> +
>> +	/* Intentionally leave the priority as 0. All rules have the same
>> +	 * priority.
>> +	 */
>> +	ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
>> +	ff_rule->classifier_id = cpu_to_le32(classifier_id);
>> +	ff_rule->key_length = (u8)key_size;
>> +	ff_rule->action = fs->ring_cookie == RX_CLS_FLOW_DISC ?
>> +					     VIRTIO_NET_FF_ACTION_DROP :
>> +					     VIRTIO_NET_FF_ACTION_RX_VQ;
>> +	ff_rule->vq_index = fs->ring_cookie != RX_CLS_FLOW_DISC ?
>> +					       cpu_to_le16(fs->ring_cookie) : 0;
>> +	memcpy(&ff_rule->keys, key, key_size);
>> +
>> +	err = virtio_admin_obj_create(ff->vdev,
>> +				      VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
>> +				      fs->location,
>> +				      VIRTIO_ADMIN_GROUP_TYPE_SELF,
>> +				      0,
>> +				      ff_rule,
>> +				      sizeof(*ff_rule) + key_size);
>> +	if (err)
>> +		goto err_ff_rule;
>> +
>> +	eth_rule->classifier_id = classifier_id;
>> +	ff->ethtool.num_rules++;
>> +	kfree(ff_rule);
>> +
>> +	return 0;
>> +
>> +err_ff_rule:
>> +	kfree(ff_rule);
>> +
>> +	return err;
>> +}
> 
> 
> ...
> 
>> +static int build_and_insert(struct virtnet_ff *ff,
>> +			    struct virtnet_ethtool_rule *eth_rule)
>> +{
>> +	struct virtio_net_resource_obj_ff_classifier *classifier;
>> +	struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
>> +	struct virtio_net_ff_selector *selector;
>> +	struct virtnet_classifier *c;
>> +	size_t classifier_size;
>> +	size_t key_size;
>> +	int num_hdrs;
>> +	u8 *key;
>> +	int err;
>> +
>> +	calculate_flow_sizes(fs, &key_size, &classifier_size, &num_hdrs);
>> +
>> +	key = kzalloc(key_size, GFP_KERNEL);
>> +	if (!key)
>> +		return -ENOMEM;
> 
> So key is allocated here ...
> 
> 
>> +
>> +	/*
>> +	 * virtio_net_ff_obj_ff_classifier is already included in the
>> +	 * classifier_size.
>> +	 */

>> +
>> +	err = insert_rule(ff, eth_rule, c->id, key, key_size);
> 
> 
> ... copied by insert_rule
> 
> 
>> +	if (err) {
>> +		/* destroy_classifier will free the classifier */
>> +		destroy_classifier(ff, c->id);
>> +		goto err_key;
>> +	}
>> +
> 
> 
> ... and apparently never freed?
> 
> 
> I think it's because the API of insert_rule is confusing...
> 

Nice catch. I changed insert_rule to free key when it's successful.

build_and_insert will handle freeing it when it fails.
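
The ownership rule described here (the callee frees key on success, the caller frees it on failure) can be sketched in userspace as follows. All demo_* names are hypothetical, and a live-allocation counter stands in for a leak checker; the device_ok flag simulates whether the admin command succeeded.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

static int demo_live; /* outstanding allocations, demo bookkeeping only */

static void *demo_zalloc(size_t n)
{
	void *p = calloc(1, n);

	if (p)
		demo_live++;
	return p;
}

static void demo_free(void *p)
{
	if (p) {
		demo_live--;
		free(p);
	}
}

/* Consumes key on success; on failure the caller still owns it. */
static int demo_insert_rule(unsigned char *key, size_t key_size, bool device_ok)
{
	unsigned char *rule = demo_zalloc(key_size);
	int err = device_ok ? 0 : -EIO; /* simulated device result */

	if (!rule)
		return -ENOMEM;

	memcpy(rule, key, key_size); /* the rule carries its own copy */
	demo_free(rule);
	if (!err)
		demo_free(key);
	return err;
}

static int demo_build_and_insert(size_t key_size, bool device_ok)
{
	unsigned char *key;
	int err;

	assert(key_size > 0);
	key = demo_zalloc(key_size);
	if (!key)
		return -ENOMEM;

	err = demo_insert_rule(key, key_size, device_ok);
	if (err)
		demo_free(key); /* failure: ownership stayed with us */
	return err;
}
```

Either way exactly one side frees the buffer, which is the property the original code was missing on the success path.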

>> +	return 0;
>> +
>> +err_classifier:
>> +	kfree(c);
>> +err_key:
>> +	kfree(key);
>> +
>> +	return err;
>> +}
>> +
> 



* Re: [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-19 16:33       ` Dan Jurgens
@ 2025-11-19 16:51         ` Michael S. Tsirkin
  2025-11-19 16:59           ` Dan Jurgens
  0 siblings, 1 reply; 66+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19 16:51 UTC (permalink / raw)
  To: Dan Jurgens
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On Wed, Nov 19, 2025 at 10:33:31AM -0600, Dan Jurgens wrote:
> On 11/19/25 3:18 AM, Michael S. Tsirkin wrote:
> > On Tue, Nov 18, 2025 at 04:31:09PM -0500, Michael S. Tsirkin wrote:
> >>> +static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
> >>> +			     u8 *key,
> >>> +			     const struct ethtool_rx_flow_spec *fs)
> >>> +{
> >>> +	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
> >>> +	struct iphdr *v4_k = (struct iphdr *)key;
> >>> +
> >>> +	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> >>> +	selector->length = sizeof(struct iphdr);
> >>> +
> >>> +	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> >>> +	    fs->h_u.usr_ip4_spec.tos ||
> >>> +	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
> >>> +		return -EOPNOTSUPP;
> >>
> >> So include/uapi/linux/ethtool.h says:
> >>
> >>  * struct ethtool_usrip4_spec - general flow specification for IPv4
> >>  * @ip4src: Source host
> >>  * @ip4dst: Destination host
> >>  * @l4_4_bytes: First 4 bytes of transport (layer 4) header
> >>  * @tos: Type-of-service
> >>  * @ip_ver: Value must be %ETH_RX_NFC_IP4; mask must be 0
> >>  * @proto: Transport protocol number; mask must be 0
> >>
> >> I guess this ETH_RX_NFC_IP4 check validates that userspace follows this
> >> documentation? But then shouldn't you check the mask
> >> as well? and mask for proto?
> >>
> >>
> >>
> > 
> > in fact, what if e.g. tos is 0 but mask is non-zero? should not
> > this be rejected, too?
> > 
> 
> Actually the tos check should be removed, there's no guidance it should
> be 0, like the other fields. Our hardware doesn't support it, but this
> will be caught in validate_classifier_selectors.

same question for l4_4_bytes then.

-- 
MST



* Re: [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules
  2025-11-19 16:51         ` Michael S. Tsirkin
@ 2025-11-19 16:59           ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 16:59 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 10:51 AM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 10:33:31AM -0600, Dan Jurgens wrote:
>> On 11/19/25 3:18 AM, Michael S. Tsirkin wrote:
>>> On Tue, Nov 18, 2025 at 04:31:09PM -0500, Michael S. Tsirkin wrote:
>>>>> +static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
>>>>> +			     u8 *key,
>>>>> +			     const struct ethtool_rx_flow_spec *fs)
>>>>> +{
>>>>> +	struct iphdr *v4_m = (struct iphdr *)&selector->mask;
>>>>> +	struct iphdr *v4_k = (struct iphdr *)key;
>>>>> +
>>>>> +	selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
>>>>> +	selector->length = sizeof(struct iphdr);
>>>>> +
>>>>> +	if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
>>>>> +	    fs->h_u.usr_ip4_spec.tos ||
>>>>> +	    fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4)
>>>>> +		return -EOPNOTSUPP;
>>>>
>>>> So include/uapi/linux/ethtool.h says:
>>>>
>>>>  * struct ethtool_usrip4_spec - general flow specification for IPv4
>>>>  * @ip4src: Source host
>>>>  * @ip4dst: Destination host
>>>>  * @l4_4_bytes: First 4 bytes of transport (layer 4) header
>>>>  * @tos: Type-of-service
>>>>  * @ip_ver: Value must be %ETH_RX_NFC_IP4; mask must be 0
>>>>  * @proto: Transport protocol number; mask must be 0
>>>>
>>>> I guess this ETH_RX_NFC_IP4 check validates that userspace follows this
>>>> documentation? But then shouldn't you check the mask
>>>> as well? and mask for proto?
>>>>
>>>>
>>>>
>>>
>>> in fact, what if e.g. tos is 0 but mask is non-zero? should not
>>> this be rejected, too?
>>>
>>
>> Actually the tos check should be removed, there's no guidance it should
>> be 0, like the other fields. Our hardware doesn't support it, but this
>> will be caught in validate_classifier_selectors.
> 
> same question for l4_4_bytes then.
> 
I guess it's reasonable to assert that. An IP-only rule would fail to
match if the mask were set.
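
Pulling the points from this sub-thread together, a validation step might look like the sketch below. It follows the ethtool_usrip4_spec documentation quoted above (ip_ver must be ETH_RX_NFC_IP4 with a zero mask, proto mask must be zero) and also rejects l4_4_bytes in value or mask; tos is deliberately left to the device capability check. The demo_* struct mirrors the uapi layout, and the choice of -EOPNOTSUPP for every case is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_RX_NFC_IP4 1 /* mirrors ETH_RX_NFC_IP4 */

/* Stand-in for struct ethtool_usrip4_spec. */
struct demo_usrip4_spec {
	uint32_t ip4src;
	uint32_t ip4dst;
	uint32_t l4_4_bytes;
	uint8_t tos;
	uint8_t ip_ver;
	uint8_t proto;
};

static int demo_validate_ip4(const struct demo_usrip4_spec *val,
			     const struct demo_usrip4_spec *mask)
{
	assert(val && mask);

	/* ip_ver: value must be IP4 and its mask must be 0. */
	if (val->ip_ver != DEMO_RX_NFC_IP4 || mask->ip_ver)
		return -EOPNOTSUPP;

	/* proto: the mask must be 0. */
	if (mask->proto)
		return -EOPNOTSUPP;

	/* An IP-only rule cannot match on transport header bytes. */
	if (val->l4_4_bytes || mask->l4_4_bytes)
		return -EOPNOTSUPP;

	return 0;
}
```

Checking the mask as well as the value covers the case raised earlier, where a field is zero but its mask is not.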


* Re: [PATCH net-next v11 06/12] virtio_net: Create a FF group for ethtool steering
  2025-11-19  9:36   ` Michael S. Tsirkin
@ 2025-11-19 17:14     ` Dan Jurgens
  0 siblings, 0 replies; 66+ messages in thread
From: Dan Jurgens @ 2025-11-19 17:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
	yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
	edumazet

On 11/19/25 3:36 AM, Michael S. Tsirkin wrote:
> On Tue, Nov 18, 2025 at 08:38:56AM -0600, Daniel Jurgens wrote:
>> +
>>  	ff->vdev = vdev;
>>  	ff->ff_supported = true;
>>  
> 
> So this is set here and never cleared.
> 
> But we never recreate the group on restore (after suspend).
> 
> sounds like we need virtnet_ff_cleanup/virtnet_ff_init on the
> suspend/restore path?

Yes, you're right.





end of thread, other threads:[~2025-11-19 17:15 UTC | newest]

Thread overview: 66+ messages
2025-11-18 14:38 [PATCH net-next v11 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
2025-11-18 14:38 ` [PATCH net-next v11 01/12] virtio_pci: Remove supported_cap size build assert Daniel Jurgens
2025-11-19  7:38   ` Michael S. Tsirkin
2025-11-19 14:23     ` Dan Jurgens
2025-11-18 14:38 ` [PATCH net-next v11 02/12] virtio: Add config_op for admin commands Daniel Jurgens
2025-11-19  7:36   ` Michael S. Tsirkin
2025-11-18 14:38 ` [PATCH net-next v11 03/12] virtio: Expose generic device capability operations Daniel Jurgens
2025-11-18 21:42   ` Michael S. Tsirkin
2025-11-19  3:27     ` Dan Jurgens
2025-11-18 14:38 ` [PATCH net-next v11 04/12] virtio: Expose object create and destroy API Daniel Jurgens
2025-11-18 22:14   ` Michael S. Tsirkin
2025-11-19  3:29     ` Dan Jurgens
2025-11-19  6:39       ` Michael S. Tsirkin
2025-11-19  7:21         ` Dan Jurgens
2025-11-18 14:38 ` [PATCH net-next v11 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
2025-11-18 22:06   ` Michael S. Tsirkin
2025-11-19  5:57     ` Dan Jurgens
2025-11-18 23:03   ` Michael S. Tsirkin
2025-11-19  4:27     ` Dan Jurgens
2025-11-19  7:53   ` Michael S. Tsirkin
2025-11-19 14:47     ` Dan Jurgens
2025-11-19  7:55   ` Michael S. Tsirkin
2025-11-18 14:38 ` [PATCH net-next v11 06/12] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
2025-11-19  9:36   ` Michael S. Tsirkin
2025-11-19 17:14     ` Dan Jurgens
2025-11-18 14:38 ` [PATCH net-next v11 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
2025-11-18 19:01   ` Michael S. Tsirkin
2025-11-19  6:07     ` Dan Jurgens
2025-11-19  9:20     ` Michael S. Tsirkin
2025-11-19  9:26   ` Michael S. Tsirkin
2025-11-19 16:51     ` Dan Jurgens
2025-11-18 14:38 ` [PATCH net-next v11 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
2025-11-18 21:55   ` Michael S. Tsirkin
2025-11-19  6:26     ` Dan Jurgens
2025-11-19  6:35       ` Michael S. Tsirkin
2025-11-19  7:18         ` Dan Jurgens
2025-11-19  7:23           ` Michael S. Tsirkin
2025-11-19  7:33             ` Dan Jurgens
2025-11-19  7:41               ` Michael S. Tsirkin
2025-11-19 15:45                 ` Dan Jurgens
2025-11-19 15:58                   ` Michael S. Tsirkin
2025-11-19  7:42   ` Michael S. Tsirkin
2025-11-19  9:01   ` Michael S. Tsirkin
2025-11-18 14:38 ` [PATCH net-next v11 09/12] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
2025-11-18 21:31   ` Michael S. Tsirkin
2025-11-19  7:03     ` Dan Jurgens
2025-11-19  7:06       ` Michael S. Tsirkin
2025-11-19  7:17         ` Dan Jurgens
2025-11-19  7:20           ` Michael S. Tsirkin
2025-11-19  9:18     ` Michael S. Tsirkin
2025-11-19 16:33       ` Dan Jurgens
2025-11-19 16:51         ` Michael S. Tsirkin
2025-11-19 16:59           ` Dan Jurgens
2025-11-18 14:39 ` [PATCH net-next v11 10/12] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
2025-11-18 21:45   ` Michael S. Tsirkin
2025-11-19  7:35     ` Dan Jurgens
2025-11-19  7:44       ` Michael S. Tsirkin
2025-11-18 21:48   ` Michael S. Tsirkin
2025-11-19 16:04     ` Dan Jurgens
2025-11-18 14:39 ` [PATCH net-next v11 11/12] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
2025-11-19  9:14   ` Michael S. Tsirkin
2025-11-19 16:07     ` Dan Jurgens
2025-11-18 14:39 ` [PATCH net-next v11 12/12] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
2025-11-18 18:49   ` Michael S. Tsirkin
2025-11-19 16:24     ` Dan Jurgens
2025-11-18 22:39   ` Michael S. Tsirkin
