* [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support
@ 2025-11-19 19:15 Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 01/12] virtio_pci: Remove supported_cap size build assert Daniel Jurgens
` (12 more replies)
0 siblings, 13 replies; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
This series implements ethtool flow rules support for virtio_net using the
virtio flow filter (FF) specification. The implementation allows users to
configure packet filtering rules through ethtool commands, directing
packets to specific receive queues or dropping them based on matches
against various header fields.
The series starts with infrastructure changes to expose virtio PCI admin
capabilities and object management APIs. It then implements the flow filter
functionality in virtio_net with support for:
- Layer 2 (Ethernet) flow rules
- IPv4 and IPv6 flow rules
- TCP and UDP flow rules (both IPv4 and IPv6)
- Rule querying and management operations
Setting, deleting, and viewing flow filters; an action of -1 drops the
packet, while a non-negative integer steers matching packets to that RQ:
$ ethtool -u ens9
4 RX rings available
Total 0 rules
$ ethtool -U ens9 flow-type ether src 1c:34:da:4a:33:dd action 0
Added rule with ID 0
$ ethtool -U ens9 flow-type udp4 dst-port 5001 action 3
Added rule with ID 1
$ ethtool -U ens9 flow-type tcp6 src-ip fc00::2 dst-port 5001 action 2
Added rule with ID 2
$ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action 1
Added rule with ID 3
$ ethtool -U ens9 flow-type ip6 dst-ip fc00::1 action -1
Added rule with ID 4
$ ethtool -U ens9 flow-type ip6 src-ip fc00::2 action -1
Added rule with ID 5
$ ethtool -U ens9 delete 4
$ ethtool -u ens9
4 RX rings available
Total 5 rules
Filter: 0
Flow Type: Raw Ethernet
Src MAC addr: 1C:34:DA:4A:33:DD mask: 00:00:00:00:00:00
Dest MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
Ethertype: 0x0 mask: 0xFFFF
Action: Direct to queue 0
Filter: 1
Rule Type: UDP over IPv4
Src IP addr: 0.0.0.0 mask: 255.255.255.255
Dest IP addr: 0.0.0.0 mask: 255.255.255.255
TOS: 0x0 mask: 0xff
Src port: 0 mask: 0xffff
Dest port: 5001 mask: 0x0
Action: Direct to queue 3
Filter: 2
Rule Type: TCP over IPv6
Src IP addr: fc00::2 mask: ::
Dest IP addr: :: mask: ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
Traffic Class: 0x0 mask: 0xff
Src port: 0 mask: 0xffff
Dest port: 5001 mask: 0x0
Action: Direct to queue 2
Filter: 3
Rule Type: Raw IPv4
Src IP addr: 192.168.51.101 mask: 0.0.0.0
Dest IP addr: 0.0.0.0 mask: 255.255.255.255
TOS: 0x0 mask: 0xff
Protocol: 0 mask: 0xff
L4 bytes: 0x0 mask: 0xffffffff
Action: Direct to queue 1
Filter: 5
Rule Type: Raw IPv6
Src IP addr: fc00::2 mask: ::
Dest IP addr: :: mask: ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff
Traffic Class: 0x0 mask: 0xff
Protocol: 0 mask: 0xff
L4 bytes: 0x0 mask: 0xffffffff
Action: Drop
---
v2: https://lore.kernel.org/netdev/20250908164046.25051-1-danielj@nvidia.com/
- Fix sparse warnings
- Fix memory leak on subsequent failure to allocate
- Fix some typos
v3: https://lore.kernel.org/netdev/20250923141920.283862-1-danielj@nvidia.com/
- Added admin_ops to virtio_device kdoc.
v4:
- Fixed double free bug inserting flows
- Fixed incorrect protocol field check when parsing ip4 headers.
- (u8 *) changed to (void *)
- Added kdoc comments to UAPI changes.
- No longer split up virtio_net.c
- Added config op to execute admin commands.
- virtio_pci assigns vp_modern_admin_cmd_exec to this callback.
- Moved admin command API to new core file virtio_admin_commands.c
v5:
- Fixed compile error
- Fixed static analysis warning on () after macro
- Added missing fields to kdoc comments
- Aligned parameter name between prototype and kdoc
v6:
- Fix sparse warning "array of flexible structures" Jakub K/Simon H
- Use new variable and validate ff_mask_size before set_cap. MST
v7:
- Change virtnet_ff_init to return a value. Allow -EOPNOTSUPP. Xuan
- Set ff->ff_{caps, mask, actions} NULL in error path. Paolo Abeni
- Moved the "for (int i" removal hunk back a patch. Paolo Abeni
v8:
- Removed unused num_classifiers. Jason Wang
- Use real_ff_mask_size when setting the selector caps. Jason Wang
v9:
- Set err to -ENOMEM after alloc failures in virtnet_ff_init. Simon H
v10:
- Return -EOPNOTSUPP in virtnet_ff_init before allocating any memory.
Jason Wang/Paolo Abeni
v11:
- Return -EINVAL if any resource limit is 0. Simon Horman
- Ensure we don't overrun the allocated space of ff->ff_mask by moving the
real_ff_mask_size > ff_mask_size check into the loop. Simon Horman
v12: Many comments by MST, thanks Michael. Only the most significant
listed here:
- Fixed leak of key in build_and_insert.
- Fixed setting ethhdr proto for IPv6.
- Added 2 byte pad to struct virtio_net_ff_cap_data.
- Use and set rule_cnt when querying all flows.
- Cleanup and reinit in freeze/restore path.
Daniel Jurgens (12):
virtio_pci: Remove supported_cap size build assert
virtio: Add config_op for admin commands
virtio: Expose generic device capability operations
virtio: Expose object create and destroy API
virtio_net: Query and set flow filter caps
virtio_net: Create a FF group for ethtool steering
virtio_net: Implement layer 2 ethtool flow rules
virtio_net: Use existing classifier if possible
virtio_net: Implement IPv4 ethtool flow rules
virtio_net: Add support for IPv6 ethtool steering
virtio_net: Add support for TCP and UDP ethtool rules
virtio_net: Add get ethtool flow rules ops
drivers/net/virtio_net.c | 1193 ++++++++++++++++++++++++
drivers/virtio/Makefile | 2 +-
drivers/virtio/virtio_admin_commands.c | 168 ++++
drivers/virtio/virtio_pci_common.h | 1 -
drivers/virtio/virtio_pci_modern.c | 10 +-
include/linux/virtio_admin.h | 124 +++
include/linux/virtio_config.h | 6 +
include/uapi/linux/virtio_net_ff.h | 153 +++
include/uapi/linux/virtio_pci.h | 6 +-
9 files changed, 1652 insertions(+), 11 deletions(-)
create mode 100644 drivers/virtio/virtio_admin_commands.c
create mode 100644 include/linux/virtio_admin.h
create mode 100644 include/uapi/linux/virtio_net_ff.h
--
2.50.1
^ permalink raw reply [flat|nested] 40+ messages in thread
* [PATCH net-next v12 01/12] virtio_pci: Remove supported_cap size build assert
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 02/12] virtio: Add config_op for admin commands Daniel Jurgens
` (11 subsequent siblings)
12 siblings, 0 replies; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
The cap ID list can be more than 64 bits. Remove the build assert. Also
remove caching of the supported caps; it wasn't used.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: New patch for V4
v5:
- support_caps -> supported_caps (Alok Tiwari)
- removed unused variable (test robot)
v12:
- Change supported_caps check to byte swap the constant. MST
---
drivers/virtio/virtio_pci_common.h | 1 -
drivers/virtio/virtio_pci_modern.c | 8 +-------
2 files changed, 1 insertion(+), 8 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index 8cd01de27baf..fc26e035e7a6 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -48,7 +48,6 @@ struct virtio_pci_admin_vq {
/* Protects virtqueue access. */
spinlock_t lock;
u64 supported_cmds;
- u64 supported_caps;
u8 max_dev_parts_objects;
struct ida dev_parts_ida;
/* Name of the admin queue: avq.$vq_index. */
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index dd0e65f71d41..1675d6cda416 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -304,7 +304,6 @@ virtio_pci_admin_cmd_dev_parts_objects_enable(struct virtio_device *virtio_dev)
static void virtio_pci_admin_cmd_cap_init(struct virtio_device *virtio_dev)
{
- struct virtio_pci_device *vp_dev = to_vp_device(virtio_dev);
struct virtio_admin_cmd_query_cap_id_result *data;
struct virtio_admin_cmd cmd = {};
struct scatterlist result_sg;
@@ -323,12 +322,7 @@ static void virtio_pci_admin_cmd_cap_init(struct virtio_device *virtio_dev)
if (ret)
goto end;
- /* Max number of caps fits into a single u64 */
- BUILD_BUG_ON(sizeof(data->supported_caps) > sizeof(u64));
-
- vp_dev->admin_vq.supported_caps = le64_to_cpu(data->supported_caps[0]);
-
- if (!(vp_dev->admin_vq.supported_caps & (1 << VIRTIO_DEV_PARTS_CAP)))
+ if (!(data->supported_caps[0] & cpu_to_le64(1 << VIRTIO_DEV_PARTS_CAP)))
goto end;
virtio_pci_admin_cmd_dev_parts_objects_enable(virtio_dev);
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 02/12] virtio: Add config_op for admin commands
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 01/12] virtio_pci: Remove supported_cap size build assert Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 03/12] virtio: Expose generic device capability operations Daniel Jurgens
` (10 subsequent siblings)
12 siblings, 0 replies; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
This will allow device drivers to issue administration commands.
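For illustration only (not part of this patch), a minimal sketch of how an
upper-layer driver might use the new optional callback; example_admin_exec()
is a hypothetical helper:

	/* Hypothetical helper: issue an admin command only if the transport
	 * implements the optional config op.
	 */
	static int example_admin_exec(struct virtio_device *vdev,
				      struct virtio_admin_cmd *cmd)
	{
		if (!vdev->config->admin_cmd_exec)
			return -EOPNOTSUPP;
		return vdev->config->admin_cmd_exec(vdev, cmd);
	}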
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: New patch for v4
v12: Add (optional) to admin_cmd_exec field. MST
---
drivers/virtio/virtio_pci_modern.c | 2 ++
include/linux/virtio_config.h | 6 ++++++
2 files changed, 8 insertions(+)
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index 1675d6cda416..e2a813b3b3fd 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -1236,6 +1236,7 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = {
.get_shm_region = vp_get_shm_region,
.disable_vq_and_reset = vp_modern_disable_vq_and_reset,
.enable_vq_after_reset = vp_modern_enable_vq_after_reset,
+ .admin_cmd_exec = vp_modern_admin_cmd_exec,
};
static const struct virtio_config_ops virtio_pci_config_ops = {
@@ -1256,6 +1257,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
.get_shm_region = vp_get_shm_region,
.disable_vq_and_reset = vp_modern_disable_vq_and_reset,
.enable_vq_after_reset = vp_modern_enable_vq_after_reset,
+ .admin_cmd_exec = vp_modern_admin_cmd_exec,
};
/* the PCI probing function */
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 16001e9f9b39..d620f46bcc07 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -108,6 +108,10 @@ struct virtqueue_info {
* Returns 0 on success or error status
* If disable_vq_and_reset is set, then enable_vq_after_reset must also be
* set.
+ * @admin_cmd_exec: Execute an admin VQ command (optional).
+ * vdev: the virtio_device
+ * cmd: the command to execute
+ * Returns 0 on success or error status
*/
struct virtio_config_ops {
void (*get)(struct virtio_device *vdev, unsigned offset,
@@ -137,6 +141,8 @@ struct virtio_config_ops {
struct virtio_shm_region *region, u8 id);
int (*disable_vq_and_reset)(struct virtqueue *vq);
int (*enable_vq_after_reset)(struct virtqueue *vq);
+ int (*admin_cmd_exec)(struct virtio_device *vdev,
+ struct virtio_admin_cmd *cmd);
};
/**
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 03/12] virtio: Expose generic device capability operations
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 01/12] virtio_pci: Remove supported_cap size build assert Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 02/12] virtio: Add config_op for admin commands Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-24 20:30 ` Michael S. Tsirkin
2025-11-19 19:15 ` [PATCH net-next v12 04/12] virtio: Expose object create and destroy API Daniel Jurgens
` (9 subsequent siblings)
12 siblings, 1 reply; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
Currently querying and setting capabilities is restricted to a single
capability and contained within the virtio PCI driver. However, each
device type has generic and device-specific capabilities that may be
queried and set. In subsequent patches virtio_net will query and set
flow filter capabilities.
This changes the size of virtio_admin_cmd_query_cap_id_result. It's safe
to do because this data is written by DMA, so a newer controller can't
overrun the size on an older kernel.
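As an illustrative sketch only (simplified error handling, not code from this
patch), a caller might combine the new helpers like this:

	/* Query the cap ID list, then fetch one capability. Buffers must be
	 * heap allocated because they are used for DMA.
	 */
	static int example_get_dev_parts_cap(struct virtio_device *vdev,
					     struct virtio_dev_parts_cap *cap)
	{
		struct virtio_admin_cmd_query_cap_id_result *ids;
		int err;

		ids = kzalloc(sizeof(*ids), GFP_KERNEL);
		if (!ids)
			return -ENOMEM;
		err = virtio_admin_cap_id_list_query(vdev, ids);
		if (!err && !VIRTIO_CAP_IN_LIST(ids, VIRTIO_DEV_PARTS_CAP))
			err = -EOPNOTSUPP;
		if (!err)
			err = virtio_admin_cap_get(vdev, VIRTIO_DEV_PARTS_CAP,
						   cap, sizeof(*cap));
		kfree(ids);
		return err;
	}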
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: Moved this logic from virtio_pci_modern to new file
virtio_admin_commands.
v12:
- Removed uapi virtio_pci include in virtio_admin.h. MST
- Added virtio_pci uapi include to virtio_admin_commands.c
- Put () around cap in macro. MST
- Removed nonsense comment above VIRTIO_ADMIN_MAX_CAP. MST
- +1 VIRTIO_ADMIN_MAX_CAP when calculating array size. MST
- Updated commit message
---
drivers/virtio/Makefile | 2 +-
drivers/virtio/virtio_admin_commands.c | 91 ++++++++++++++++++++++++++
include/linux/virtio_admin.h | 80 ++++++++++++++++++++++
include/uapi/linux/virtio_pci.h | 6 +-
4 files changed, 176 insertions(+), 3 deletions(-)
create mode 100644 drivers/virtio/virtio_admin_commands.c
create mode 100644 include/linux/virtio_admin.h
diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
index eefcfe90d6b8..2b4a204dde33 100644
--- a/drivers/virtio/Makefile
+++ b/drivers/virtio/Makefile
@@ -1,5 +1,5 @@
# SPDX-License-Identifier: GPL-2.0
-obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
+obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o virtio_admin_commands.o
obj-$(CONFIG_VIRTIO_ANCHOR) += virtio_anchor.o
obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
new file mode 100644
index 000000000000..a2254e71e8dc
--- /dev/null
+++ b/drivers/virtio/virtio_admin_commands.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/virtio_admin.h>
+#include <uapi/linux/virtio_pci.h>
+
+int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
+ struct virtio_admin_cmd_query_cap_id_result *data)
+{
+ struct virtio_admin_cmd cmd = {};
+ struct scatterlist result_sg;
+
+ if (!vdev->config->admin_cmd_exec)
+ return -EOPNOTSUPP;
+
+ sg_init_one(&result_sg, data, sizeof(*data));
+ cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_CAP_ID_LIST_QUERY);
+ cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
+ cmd.result_sg = &result_sg;
+
+ return vdev->config->admin_cmd_exec(vdev, &cmd);
+}
+EXPORT_SYMBOL_GPL(virtio_admin_cap_id_list_query);
+
+int virtio_admin_cap_get(struct virtio_device *vdev,
+ u16 id,
+ void *caps,
+ size_t cap_size)
+{
+ struct virtio_admin_cmd_cap_get_data *data;
+ struct virtio_admin_cmd cmd = {};
+ struct scatterlist result_sg;
+ struct scatterlist data_sg;
+ int err;
+
+ if (!vdev->config->admin_cmd_exec)
+ return -EOPNOTSUPP;
+
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ data->id = cpu_to_le16(id);
+ sg_init_one(&data_sg, data, sizeof(*data));
+ sg_init_one(&result_sg, caps, cap_size);
+ cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DEVICE_CAP_GET);
+ cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
+ cmd.data_sg = &data_sg;
+ cmd.result_sg = &result_sg;
+
+ err = vdev->config->admin_cmd_exec(vdev, &cmd);
+ kfree(data);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(virtio_admin_cap_get);
+
+int virtio_admin_cap_set(struct virtio_device *vdev,
+ u16 id,
+ const void *caps,
+ size_t cap_size)
+{
+ struct virtio_admin_cmd_cap_set_data *data;
+ struct virtio_admin_cmd cmd = {};
+ struct scatterlist data_sg;
+ size_t data_size;
+ int err;
+
+ if (!vdev->config->admin_cmd_exec)
+ return -EOPNOTSUPP;
+
+ data_size = sizeof(*data) + cap_size;
+ data = kzalloc(data_size, GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ data->id = cpu_to_le16(id);
+ memcpy(data->cap_specific_data, caps, cap_size);
+ sg_init_one(&data_sg, data, data_size);
+ cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DRIVER_CAP_SET);
+ cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
+ cmd.data_sg = &data_sg;
+ cmd.result_sg = NULL;
+
+ err = vdev->config->admin_cmd_exec(vdev, &cmd);
+ kfree(data);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(virtio_admin_cap_set);
diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
new file mode 100644
index 000000000000..4ab84d53c924
--- /dev/null
+++ b/include/linux/virtio_admin.h
@@ -0,0 +1,80 @@
+/* SPDX-License-Identifier: GPL-2.0-only
+ *
+ * Header file for virtio admin operations
+ */
+
+#ifndef _LINUX_VIRTIO_ADMIN_H
+#define _LINUX_VIRTIO_ADMIN_H
+
+struct virtio_device;
+struct virtio_admin_cmd_query_cap_id_result;
+
+/**
+ * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
+ * @cap_list: Pointer to capability list structure containing supported_caps array
+ * @cap: Capability ID to check
+ *
+ * The cap_list contains a supported_caps array of little-endian 64-bit integers
+ * where each bit represents a capability. Bit 0 of the first element represents
+ * capability ID 0, bit 1 represents capability ID 1, and so on.
+ *
+ * Return: 1 if capability is supported, 0 otherwise
+ */
+#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
+ (!!(1 & (le64_to_cpu(cap_list->supported_caps[(cap) / 64]) >> (cap) % 64)))
+
+/**
+ * virtio_admin_cap_id_list_query - Query the list of available capability IDs
+ * @vdev: The virtio device to query
+ * @data: Pointer to result structure (must be heap allocated)
+ *
+ * This function queries the virtio device for the list of available capability
+ * IDs that can be used with virtio_admin_cap_get() and virtio_admin_cap_set().
+ * The result is stored in the provided data structure.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or capability queries, or a negative error code on other failures.
+ */
+int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
+ struct virtio_admin_cmd_query_cap_id_result *data);
+
+/**
+ * virtio_admin_cap_get - Get capability data for a specific capability ID
+ * @vdev: The virtio device
+ * @id: Capability ID to retrieve
+ * @caps: Pointer to capability data structure (must be heap allocated)
+ * @cap_size: Size of the capability data structure
+ *
+ * This function retrieves a specific capability from the virtio device.
+ * The capability data is stored in the provided buffer. The caller must
+ * ensure the buffer is large enough to hold the capability data.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or capability retrieval, or a negative error code on other failures.
+ */
+int virtio_admin_cap_get(struct virtio_device *vdev,
+ u16 id,
+ void *caps,
+ size_t cap_size);
+
+/**
+ * virtio_admin_cap_set - Set capability data for a specific capability ID
+ * @vdev: The virtio device
+ * @id: Capability ID to set
+ * @caps: Pointer to capability data structure (must be heap allocated)
+ * @cap_size: Size of the capability data structure
+ *
+ * This function sets a specific capability on the virtio device.
+ * The capability data is read from the provided buffer and applied
+ * to the device. The device may validate the capability data before
+ * applying it.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or capability setting, or a negative error code on other failures.
+ */
+int virtio_admin_cap_set(struct virtio_device *vdev,
+ u16 id,
+ const void *caps,
+ size_t cap_size);
+
+#endif /* _LINUX_VIRTIO_ADMIN_H */
diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index c691ac210ce2..2e35fd8d4a95 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -315,15 +315,17 @@ struct virtio_admin_cmd_notify_info_result {
#define VIRTIO_DEV_PARTS_CAP 0x0000
+#define VIRTIO_ADMIN_MAX_CAP 0x0fff
+
struct virtio_dev_parts_cap {
__u8 get_parts_resource_objects_limit;
__u8 set_parts_resource_objects_limit;
};
-#define MAX_CAP_ID __KERNEL_DIV_ROUND_UP(VIRTIO_DEV_PARTS_CAP + 1, 64)
+#define VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE __KERNEL_DIV_ROUND_UP(VIRTIO_ADMIN_MAX_CAP + 1, 64)
struct virtio_admin_cmd_query_cap_id_result {
- __le64 supported_caps[MAX_CAP_ID];
+ __le64 supported_caps[VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE];
};
struct virtio_admin_cmd_cap_get_data {
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 04/12] virtio: Expose object create and destroy API
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (2 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 03/12] virtio: Expose generic device capability operations Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
` (8 subsequent siblings)
12 siblings, 0 replies; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
Object create and destroy were implemented specifically for dev parts
device objects. Create general purpose APIs for use by upper layer
drivers.
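For illustration only, a rough sketch of the intended create/destroy pairing
from an upper-layer driver (obj_type, obj_id, payload and payload_size are
placeholders; the payload layout is object specific):

	err = virtio_admin_obj_create(vdev, obj_type, obj_id,
				      VIRTIO_ADMIN_GROUP_TYPE_SELF, 0,
				      payload, payload_size);
	if (err)
		return err;
	/* ... use the object ... */
	virtio_admin_obj_destroy(vdev, obj_type, obj_id,
				 VIRTIO_ADMIN_GROUP_TYPE_SELF, 0);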
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: Moved this logic from virtio_pci_modern to new file
virtio_admin_commands.
v5: Added missing params, and synced names in comments (Alok Tiwari)
---
drivers/virtio/virtio_admin_commands.c | 75 ++++++++++++++++++++++++++
include/linux/virtio_admin.h | 44 +++++++++++++++
2 files changed, 119 insertions(+)
diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
index a2254e71e8dc..4738ffe3b5c6 100644
--- a/drivers/virtio/virtio_admin_commands.c
+++ b/drivers/virtio/virtio_admin_commands.c
@@ -89,3 +89,78 @@ int virtio_admin_cap_set(struct virtio_device *vdev,
return err;
}
EXPORT_SYMBOL_GPL(virtio_admin_cap_set);
+
+int virtio_admin_obj_create(struct virtio_device *vdev,
+ u16 obj_type,
+ u32 obj_id,
+ u16 group_type,
+ u64 group_member_id,
+ const void *obj_specific_data,
+ size_t obj_specific_data_size)
+{
+ size_t data_size = sizeof(struct virtio_admin_cmd_resource_obj_create_data);
+ struct virtio_admin_cmd_resource_obj_create_data *obj_create_data;
+ struct virtio_admin_cmd cmd = {};
+ struct scatterlist data_sg;
+ void *data;
+ int err;
+
+ if (!vdev->config->admin_cmd_exec)
+ return -EOPNOTSUPP;
+
+ data_size += obj_specific_data_size;
+ data = kzalloc(data_size, GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ obj_create_data = data;
+ obj_create_data->hdr.type = cpu_to_le16(obj_type);
+ obj_create_data->hdr.id = cpu_to_le32(obj_id);
+ memcpy(obj_create_data->resource_obj_specific_data, obj_specific_data,
+ obj_specific_data_size);
+ sg_init_one(&data_sg, data, data_size);
+
+ cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_CREATE);
+ cmd.group_type = cpu_to_le16(group_type);
+ cmd.group_member_id = cpu_to_le64(group_member_id);
+ cmd.data_sg = &data_sg;
+
+ err = vdev->config->admin_cmd_exec(vdev, &cmd);
+ kfree(data);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(virtio_admin_obj_create);
+
+int virtio_admin_obj_destroy(struct virtio_device *vdev,
+ u16 obj_type,
+ u32 obj_id,
+ u16 group_type,
+ u64 group_member_id)
+{
+ struct virtio_admin_cmd_resource_obj_cmd_hdr *data;
+ struct virtio_admin_cmd cmd = {};
+ struct scatterlist data_sg;
+ int err;
+
+ if (!vdev->config->admin_cmd_exec)
+ return -EOPNOTSUPP;
+
+ data = kzalloc(sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ data->type = cpu_to_le16(obj_type);
+ data->id = cpu_to_le32(obj_id);
+ sg_init_one(&data_sg, data, sizeof(*data));
+ cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_RESOURCE_OBJ_DESTROY);
+ cmd.group_type = cpu_to_le16(group_type);
+ cmd.group_member_id = cpu_to_le64(group_member_id);
+ cmd.data_sg = &data_sg;
+
+ err = vdev->config->admin_cmd_exec(vdev, &cmd);
+ kfree(data);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(virtio_admin_obj_destroy);
diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
index 4ab84d53c924..ea51351c5a0f 100644
--- a/include/linux/virtio_admin.h
+++ b/include/linux/virtio_admin.h
@@ -77,4 +77,48 @@ int virtio_admin_cap_set(struct virtio_device *vdev,
const void *caps,
size_t cap_size);
+/**
+ * virtio_admin_obj_create - Create an object on a virtio device
+ * @vdev: the virtio device
+ * @obj_type: type of object to create
+ * @obj_id: ID for the new object
+ * @group_type: administrative group type for the operation
+ * @group_member_id: member identifier within the administrative group
+ * @obj_specific_data: object-specific data for creation
+ * @obj_specific_data_size: size of the object-specific data in bytes
+ *
+ * Creates a new object on the virtio device with the specified type and ID.
+ * The object may require object-specific data for proper initialization.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or object creation, or a negative error code on other failures.
+ */
+int virtio_admin_obj_create(struct virtio_device *vdev,
+ u16 obj_type,
+ u32 obj_id,
+ u16 group_type,
+ u64 group_member_id,
+ const void *obj_specific_data,
+ size_t obj_specific_data_size);
+
+/**
+ * virtio_admin_obj_destroy - Destroy an object on a virtio device
+ * @vdev: the virtio device
+ * @obj_type: type of object to destroy
+ * @obj_id: ID of the object to destroy
+ * @group_type: administrative group type for the operation
+ * @group_member_id: member identifier within the administrative group
+ *
+ * Destroys an existing object on the virtio device with the specified type
+ * and ID.
+ *
+ * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
+ * operations or object destruction, or a negative error code on other failures.
+ */
+int virtio_admin_obj_destroy(struct virtio_device *vdev,
+ u16 obj_type,
+ u32 obj_id,
+ u16 group_type,
+ u64 group_member_id);
+
#endif /* _LINUX_VIRTIO_ADMIN_H */
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (3 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 04/12] virtio: Expose object create and destroy API Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-20 1:51 ` Jakub Kicinski
` (2 more replies)
2025-11-19 19:15 ` [PATCH net-next v12 06/12] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
` (7 subsequent siblings)
12 siblings, 3 replies; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
When probing a virtnet device, attempt to read the flow filter
capabilities. In order to use the feature, the caps must also
be set. For now, setting back what was read is sufficient.
This patch adds the UAPI definitions for the virtio_net flow filters
defined in version 1.4 of the VirtIO spec.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
---
v4:
- Validate the length in the selector caps
- Removed __free usage.
- Removed for(int.
v5:
- Remove unneed () after MAX_SEL_LEN macro (test bot)
v6:
- Fix sparse warning "array of flexible structures" Jakub K/Simon H
- Use new variable and validate ff_mask_size before set_cap. MST
v7:
- Set ff->ff_{caps, mask, actions} NULL in error path. Paolo Abeni
- Return errors from virtnet_ff_init, -ENOTSUPP is not fatal. Xuan
v8:
- Use real_ff_mask_size when setting the selector caps. Jason Wang
v9:
- Set err after failed memory allocations. Simon Horman
v10:
- Return -EOPNOTSUPP in virtnet_ff_init before allocating any memory.
Jason/Paolo.
v11:
- Return -EINVAL if any resource limit is 0. Simon Horman
- Ensure we don't overrun the allocated space of ff->ff_mask by moving the
real_ff_mask_size > ff_mask_size check into the loop. Simon Horman
v12:
- Move uapi includes to virtio_net.c vs header file. MST
- Remove kernel.h header in virtio_net_ff uapi. MST
- WARN_ON_ONCE in error paths validating selectors. MST
- Move includes from .h to .c files. MST
- Add WARN_ON_ONCE if obj_destroy fails. MST
- Comment cleanup in virtio_net_ff.h uapi. MST
- Add 2 byte pad to the end of virtio_net_ff_cap_data.
https://lore.kernel.org/virtio-comment/20251119044029-mutt-send-email-mst@kernel.org/T/#m930988a5d3db316c68546d8b61f4b94f6ebda030
- Cleanup and reinit in the freeze/restore path. MST
---
drivers/net/virtio_net.c | 221 +++++++++++++++++++++++++
drivers/virtio/virtio_admin_commands.c | 2 +
include/uapi/linux/virtio_net_ff.h | 88 ++++++++++
3 files changed, 311 insertions(+)
create mode 100644 include/uapi/linux/virtio_net_ff.h
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index cfa006b88688..2d5c1bff879a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -26,6 +26,11 @@
#include <net/netdev_rx_queue.h>
#include <net/netdev_queues.h>
#include <net/xdp_sock_drv.h>
+#include <linux/virtio_admin.h>
+#include <net/ipv6.h>
+#include <net/ip.h>
+#include <uapi/linux/virtio_pci.h>
+#include <uapi/linux/virtio_net_ff.h>
static int napi_weight = NAPI_POLL_WEIGHT;
module_param(napi_weight, int, 0444);
@@ -281,6 +286,14 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
};
+struct virtnet_ff {
+ struct virtio_device *vdev;
+ bool ff_supported;
+ struct virtio_net_ff_cap_data *ff_caps;
+ struct virtio_net_ff_cap_mask_data *ff_mask;
+ struct virtio_net_ff_actions *ff_actions;
+};
+
#define VIRTNET_Q_TYPE_RX 0
#define VIRTNET_Q_TYPE_TX 1
#define VIRTNET_Q_TYPE_CQ 2
@@ -493,6 +506,8 @@ struct virtnet_info {
struct failover *failover;
u64 device_stats_cap;
+
+ struct virtnet_ff ff;
};
struct padded_vnet_hdr {
@@ -5760,6 +5775,186 @@ static const struct netdev_stat_ops virtnet_stat_ops = {
.get_base_stats = virtnet_get_base_stats,
};
+static size_t get_mask_size(u16 type)
+{
+ switch (type) {
+ case VIRTIO_NET_FF_MASK_TYPE_ETH:
+ return sizeof(struct ethhdr);
+ case VIRTIO_NET_FF_MASK_TYPE_IPV4:
+ return sizeof(struct iphdr);
+ case VIRTIO_NET_FF_MASK_TYPE_IPV6:
+ return sizeof(struct ipv6hdr);
+ case VIRTIO_NET_FF_MASK_TYPE_TCP:
+ return sizeof(struct tcphdr);
+ case VIRTIO_NET_FF_MASK_TYPE_UDP:
+ return sizeof(struct udphdr);
+ }
+
+ return 0;
+}
+
+#define MAX_SEL_LEN (sizeof(struct ipv6hdr))
+
+static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
+{
+ size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
+ sizeof(struct virtio_net_ff_selector) *
+ VIRTIO_NET_FF_MASK_TYPE_MAX;
+ struct virtio_admin_cmd_query_cap_id_result *cap_id_list;
+ struct virtio_net_ff_selector *sel;
+ size_t real_ff_mask_size;
+ int err;
+ int i;
+
+ if (!vdev->config->admin_cmd_exec)
+ return -EOPNOTSUPP;
+
+ cap_id_list = kzalloc(sizeof(*cap_id_list), GFP_KERNEL);
+ if (!cap_id_list)
+ return -ENOMEM;
+
+ err = virtio_admin_cap_id_list_query(vdev, cap_id_list);
+ if (err)
+ goto err_cap_list;
+
+ if (!(VIRTIO_CAP_IN_LIST(cap_id_list,
+ VIRTIO_NET_FF_RESOURCE_CAP) &&
+ VIRTIO_CAP_IN_LIST(cap_id_list,
+ VIRTIO_NET_FF_SELECTOR_CAP) &&
+ VIRTIO_CAP_IN_LIST(cap_id_list,
+ VIRTIO_NET_FF_ACTION_CAP))) {
+ err = -EOPNOTSUPP;
+ goto err_cap_list;
+ }
+
+ ff->ff_caps = kzalloc(sizeof(*ff->ff_caps), GFP_KERNEL);
+ if (!ff->ff_caps) {
+ err = -ENOMEM;
+ goto err_cap_list;
+ }
+
+ err = virtio_admin_cap_get(vdev,
+ VIRTIO_NET_FF_RESOURCE_CAP,
+ ff->ff_caps,
+ sizeof(*ff->ff_caps));
+
+ if (err)
+ goto err_ff;
+
+ if (!ff->ff_caps->groups_limit ||
+ !ff->ff_caps->classifiers_limit ||
+ !ff->ff_caps->rules_limit ||
+ !ff->ff_caps->rules_per_group_limit) {
+ err = -EINVAL;
+ goto err_ff;
+ }
+
+ /* VIRTIO_NET_FF_MASK_TYPE start at 1 */
+ for (i = 1; i <= VIRTIO_NET_FF_MASK_TYPE_MAX; i++)
+ ff_mask_size += get_mask_size(i);
+
+ ff->ff_mask = kzalloc(ff_mask_size, GFP_KERNEL);
+ if (!ff->ff_mask) {
+ err = -ENOMEM;
+ goto err_ff;
+ }
+
+ err = virtio_admin_cap_get(vdev,
+ VIRTIO_NET_FF_SELECTOR_CAP,
+ ff->ff_mask,
+ ff_mask_size);
+
+ if (err)
+ goto err_ff_mask;
+
+ ff->ff_actions = kzalloc(sizeof(*ff->ff_actions) +
+ VIRTIO_NET_FF_ACTION_MAX,
+ GFP_KERNEL);
+ if (!ff->ff_actions) {
+ err = -ENOMEM;
+ goto err_ff_mask;
+ }
+
+ err = virtio_admin_cap_get(vdev,
+ VIRTIO_NET_FF_ACTION_CAP,
+ ff->ff_actions,
+ sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
+
+ if (err)
+ goto err_ff_action;
+
+ err = virtio_admin_cap_set(vdev,
+ VIRTIO_NET_FF_RESOURCE_CAP,
+ ff->ff_caps,
+ sizeof(*ff->ff_caps));
+ if (err)
+ goto err_ff_action;
+
+ real_ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
+ sel = (void *)&ff->ff_mask->selectors;
+
+ for (i = 0; i < ff->ff_mask->count; i++) {
+ if (sel->length > MAX_SEL_LEN) {
+ WARN_ON_ONCE(true);
+ err = -EINVAL;
+ goto err_ff_action;
+ }
+ real_ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
+ if (real_ff_mask_size > ff_mask_size) {
+ WARN_ON_ONCE(true);
+ err = -EINVAL;
+ goto err_ff_action;
+ }
+ sel = (void *)sel + sizeof(*sel) + sel->length;
+ }
+
+ err = virtio_admin_cap_set(vdev,
+ VIRTIO_NET_FF_SELECTOR_CAP,
+ ff->ff_mask,
+ real_ff_mask_size);
+ if (err)
+ goto err_ff_action;
+
+ err = virtio_admin_cap_set(vdev,
+ VIRTIO_NET_FF_ACTION_CAP,
+ ff->ff_actions,
+ sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
+ if (err)
+ goto err_ff_action;
+
+ ff->vdev = vdev;
+ ff->ff_supported = true;
+
+ kfree(cap_id_list);
+
+ return 0;
+
+err_ff_action:
+ kfree(ff->ff_actions);
+ ff->ff_actions = NULL;
+err_ff_mask:
+ kfree(ff->ff_mask);
+ ff->ff_mask = NULL;
+err_ff:
+ kfree(ff->ff_caps);
+ ff->ff_caps = NULL;
+err_cap_list:
+ kfree(cap_id_list);
+
+ return err;
+}
+
+static void virtnet_ff_cleanup(struct virtnet_ff *ff)
+{
+ if (!ff->ff_supported)
+ return;
+
+ kfree(ff->ff_actions);
+ kfree(ff->ff_mask);
+ kfree(ff->ff_caps);
+ ff->ff_supported = false;
+}
+
static void virtnet_freeze_down(struct virtio_device *vdev)
{
struct virtnet_info *vi = vdev->priv;
@@ -5778,6 +5973,10 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
netif_tx_lock_bh(vi->dev);
netif_device_detach(vi->dev);
netif_tx_unlock_bh(vi->dev);
+
+ rtnl_lock();
+ virtnet_ff_cleanup(&vi->ff);
+ rtnl_unlock();
}
static int init_vqs(struct virtnet_info *vi);
@@ -5804,6 +6003,17 @@ static int virtnet_restore_up(struct virtio_device *vdev)
return err;
}
+ /* Initialize flow filters. Not supported is an acceptable and common
+ * return code
+ */
+ rtnl_lock();
+ err = virtnet_ff_init(&vi->ff, vi->vdev);
+ if (err && err != -EOPNOTSUPP) {
+ rtnl_unlock();
+ return err;
+ }
+ rtnl_unlock();
+
netif_tx_lock_bh(vi->dev);
netif_device_attach(vi->dev);
netif_tx_unlock_bh(vi->dev);
@@ -7137,6 +7347,15 @@ static int virtnet_probe(struct virtio_device *vdev)
}
vi->guest_offloads_capable = vi->guest_offloads;
+ /* Initialize flow filters. Not supported is an acceptable and common
+ * return code
+ */
+ err = virtnet_ff_init(&vi->ff, vi->vdev);
+ if (err && err != -EOPNOTSUPP) {
+ rtnl_unlock();
+ goto free_unregister_netdev;
+ }
+
rtnl_unlock();
err = virtnet_cpu_notif_add(vi);
@@ -7152,6 +7371,7 @@ static int virtnet_probe(struct virtio_device *vdev)
free_unregister_netdev:
unregister_netdev(dev);
+ virtnet_ff_cleanup(&vi->ff);
free_failover:
net_failover_destroy(vi->failover);
free_vqs:
@@ -7201,6 +7421,7 @@ static void virtnet_remove(struct virtio_device *vdev)
virtnet_free_irq_moder(vi);
unregister_netdev(vi->dev);
+ virtnet_ff_cleanup(&vi->ff);
net_failover_destroy(vi->failover);
diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
index 4738ffe3b5c6..e84a305d2b2a 100644
--- a/drivers/virtio/virtio_admin_commands.c
+++ b/drivers/virtio/virtio_admin_commands.c
@@ -161,6 +161,8 @@ int virtio_admin_obj_destroy(struct virtio_device *vdev,
err = vdev->config->admin_cmd_exec(vdev, &cmd);
kfree(data);
+ WARN_ON_ONCE(err);
+
return err;
}
EXPORT_SYMBOL_GPL(virtio_admin_obj_destroy);
diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
new file mode 100644
index 000000000000..1debcf595bdb
--- /dev/null
+++ b/include/uapi/linux/virtio_net_ff.h
@@ -0,0 +1,88 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+ *
+ * Header file for virtio_net flow filters
+ */
+#ifndef _LINUX_VIRTIO_NET_FF_H
+#define _LINUX_VIRTIO_NET_FF_H
+
+#include <linux/types.h>
+
+#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
+#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
+#define VIRTIO_NET_FF_ACTION_CAP 0x802
+
+/**
+ * struct virtio_net_ff_cap_data - Flow filter resource capability limits
+ * @groups_limit: maximum number of flow filter groups supported by the device
+ * @classifiers_limit: maximum number of classifiers supported by the device
+ * @rules_limit: maximum number of rules supported device-wide across all groups
+ * @rules_per_group_limit: maximum number of rules allowed in a single group
+ * @last_rule_priority: priority value associated with the lowest-priority rule
+ * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
+ */
+struct virtio_net_ff_cap_data {
+ __le32 groups_limit;
+ __le32 classifiers_limit;
+ __le32 rules_limit;
+ __le32 rules_per_group_limit;
+ __u8 last_rule_priority;
+ __u8 selectors_per_classifier_limit;
+ __u8 reserved[2];
+};
+
+/**
+ * struct virtio_net_ff_selector - Selector mask descriptor
+ * @type: selector type, one of VIRTIO_NET_FF_MASK_TYPE_* constants
+ * @flags: selector flags, see VIRTIO_NET_FF_MASK_F_* constants
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @length: size in bytes of @mask
+ * @reserved1: must be set to 0 by the driver and ignored by the device
+ * @mask: variable-length mask payload for @type, length given by @length
+ *
+ * A selector describes a header mask that a classifier can apply. The format
+ * of @mask depends on @type.
+ */
+struct virtio_net_ff_selector {
+ __u8 type;
+ __u8 flags;
+ __u8 reserved[2];
+ __u8 length;
+ __u8 reserved1[3];
+ __u8 mask[];
+};
+
+#define VIRTIO_NET_FF_MASK_TYPE_ETH 1
+#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
+#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
+#define VIRTIO_NET_FF_MASK_TYPE_TCP 4
+#define VIRTIO_NET_FF_MASK_TYPE_UDP 5
+#define VIRTIO_NET_FF_MASK_TYPE_MAX VIRTIO_NET_FF_MASK_TYPE_UDP
+
+/**
+ * struct virtio_net_ff_cap_mask_data - Supported selector mask formats
+ * @count: number of entries in @selectors
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @selectors: packed array of struct virtio_net_ff_selectors.
+ */
+struct virtio_net_ff_cap_mask_data {
+ __u8 count;
+ __u8 reserved[7];
+ __u8 selectors[];
+};
+#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
+
+#define VIRTIO_NET_FF_ACTION_DROP 1
+#define VIRTIO_NET_FF_ACTION_RX_VQ 2
+#define VIRTIO_NET_FF_ACTION_MAX VIRTIO_NET_FF_ACTION_RX_VQ
+/**
+ * struct virtio_net_ff_actions - Supported flow actions
+ * @count: number of supported actions in @actions
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @actions: array of action identifiers (VIRTIO_NET_FF_ACTION_*)
+ */
+struct virtio_net_ff_actions {
+ __u8 count;
+ __u8 reserved[7];
+ __u8 actions[];
+};
+#endif
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 06/12] virtio_net: Create a FF group for ethtool steering
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (4 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
` (6 subsequent siblings)
12 siblings, 0 replies; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
All ethtool steering rules will go in one group; create it during
initialization.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: Documented UAPI
---
drivers/net/virtio_net.c | 29 +++++++++++++++++++++++++++++
include/uapi/linux/virtio_net_ff.h | 15 +++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 2d5c1bff879a..22571a7c97e9 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -286,6 +286,9 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
};
+#define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
+#define VIRTNET_FF_MAX_GROUPS 1
+
struct virtnet_ff {
struct virtio_device *vdev;
bool ff_supported;
@@ -5800,6 +5803,7 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
sizeof(struct virtio_net_ff_selector) *
VIRTIO_NET_FF_MASK_TYPE_MAX;
+ struct virtio_net_resource_obj_ff_group ethtool_group = {};
struct virtio_admin_cmd_query_cap_id_result *cap_id_list;
struct virtio_net_ff_selector *sel;
size_t real_ff_mask_size;
@@ -5883,6 +5887,12 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
if (err)
goto err_ff_action;
+ if (le32_to_cpu(ff->ff_caps->groups_limit) < VIRTNET_FF_MAX_GROUPS) {
+ err = -ENOSPC;
+ goto err_ff_action;
+ }
+ ff->ff_caps->groups_limit = cpu_to_le32(VIRTNET_FF_MAX_GROUPS);
+
err = virtio_admin_cap_set(vdev,
VIRTIO_NET_FF_RESOURCE_CAP,
ff->ff_caps,
@@ -5922,6 +5932,19 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
if (err)
goto err_ff_action;
+ ethtool_group.group_priority = cpu_to_le16(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
+
+ /* Use priority for the object ID. */
+ err = virtio_admin_obj_create(vdev,
+ VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
+ VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
+ VIRTIO_ADMIN_GROUP_TYPE_SELF,
+ 0,
+ &ethtool_group,
+ sizeof(ethtool_group));
+ if (err)
+ goto err_ff_action;
+
ff->vdev = vdev;
ff->ff_supported = true;
@@ -5949,6 +5972,12 @@ static void virtnet_ff_cleanup(struct virtnet_ff *ff)
if (!ff->ff_supported)
return;
+ virtio_admin_obj_destroy(ff->vdev,
+ VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
+ VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
+ VIRTIO_ADMIN_GROUP_TYPE_SELF,
+ 0);
+
kfree(ff->ff_actions);
kfree(ff->ff_mask);
kfree(ff->ff_caps);
diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
index 1debcf595bdb..5883fdf4d37c 100644
--- a/include/uapi/linux/virtio_net_ff.h
+++ b/include/uapi/linux/virtio_net_ff.h
@@ -11,6 +11,8 @@
#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
#define VIRTIO_NET_FF_ACTION_CAP 0x802
+#define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
+
/**
* struct virtio_net_ff_cap_data - Flow filter resource capability limits
* @groups_limit: maximum number of flow filter groups supported by the device
@@ -85,4 +87,17 @@ struct virtio_net_ff_actions {
__u8 reserved[7];
__u8 actions[];
};
+
+/**
+ * struct virtio_net_resource_obj_ff_group - Flow filter group object
+ * @group_priority: priority of the group used to order evaluation
+ *
+ * This structure is the payload for the VIRTIO_NET_RESOURCE_OBJ_FF_GROUP
+ * administrative object. Devices use @group_priority to order flow filter
+ * groups. Multi-byte fields are little-endian.
+ */
+struct virtio_net_resource_obj_ff_group {
+ __le16 group_priority;
+};
+
#endif
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (5 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 06/12] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-24 21:05 ` Michael S. Tsirkin
2025-11-25 14:25 ` Michael S. Tsirkin
2025-11-19 19:15 ` [PATCH net-next v12 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
` (5 subsequent siblings)
12 siblings, 2 replies; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
Filtering a flow requires a classifier to match the packets, and a rule
to filter on the matches.
A classifier consists of one or more selectors. There is one selector
per header type. A selector must only use fields set in the selector
capability. If partial matching is supported, the classifier mask for a
particular field can be a subset of the mask for that field in the
capability.
The rule consists of a priority, an action and a key. The key is a byte
array containing headers corresponding to the selectors in the
classifier.
This patch implements ethtool rules for ethernet headers.
Example:
$ ethtool -U ens9 flow-type ether dst 08:11:22:33:44:54 action 30
Added rule with ID 1
The rule in the example directs received packets with the specified
destination MAC address to rq 30.
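Conceptually (a sketch only, using the UAPI field names added by this patch;
allocation and the admin commands are omitted), the example rule above builds
a classifier with a single Ethernet selector that masks only the destination
MAC, and a rule whose key carries the header to match:

	/* Classifier: one selector, mask covers only the destination MAC. */
	sel->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
	sel->length = sizeof(struct ethhdr);
	eth_broadcast_addr(((struct ethhdr *)sel->mask)->h_dest);

	/* Rule: key holds the MAC to match, action steers to RX vq 30. */
	memcpy(((struct ethhdr *)key)->h_dest,
	       (u8 [ETH_ALEN]){0x08, 0x11, 0x22, 0x33, 0x44, 0x54}, ETH_ALEN);
	ff_rule->classifier_id = cpu_to_le32(classifier_id);
	ff_rule->key_length = sizeof(struct ethhdr);
	ff_rule->action = VIRTIO_NET_FF_ACTION_RX_VQ;
	ff_rule->vq_index = cpu_to_le16(30);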
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4:
- Fixed double free bug in error flows
- Build bug on for classifier struct ordering.
- (u8 *) to (void *) casting.
- Documentation in UAPI
- Answered questions about overflow with no changes.
v6:
- Fix sparse warning "array of flexible structures" Jakub K/Simon H
v7:
- Move for (int i -> for (i hunk from next patch. Paolo Abeni
v12:
- Make key_size u8. MST
- Free key in insert_rule when it's successful. MST
---
drivers/net/virtio_net.c | 464 +++++++++++++++++++++++++++++
include/uapi/linux/virtio_net_ff.h | 50 ++++
2 files changed, 514 insertions(+)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 22571a7c97e9..7600e2383a72 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -31,6 +31,7 @@
#include <net/ip.h>
#include <uapi/linux/virtio_pci.h>
#include <uapi/linux/virtio_net_ff.h>
+#include <linux/xarray.h>
static int napi_weight = NAPI_POLL_WEIGHT;
module_param(napi_weight, int, 0444);
@@ -286,6 +287,11 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
};
+struct virtnet_ethtool_ff {
+ struct xarray rules;
+ int num_rules;
+};
+
#define VIRTNET_FF_ETHTOOL_GROUP_PRIORITY 1
#define VIRTNET_FF_MAX_GROUPS 1
@@ -295,8 +301,16 @@ struct virtnet_ff {
struct virtio_net_ff_cap_data *ff_caps;
struct virtio_net_ff_cap_mask_data *ff_mask;
struct virtio_net_ff_actions *ff_actions;
+ struct xarray classifiers;
+ int num_classifiers;
+ struct virtnet_ethtool_ff ethtool;
};
+static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
+ struct ethtool_rx_flow_spec *fs,
+ u16 curr_queue_pairs);
+static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location);
+
#define VIRTNET_Q_TYPE_RX 0
#define VIRTNET_Q_TYPE_TX 1
#define VIRTNET_Q_TYPE_CQ 2
@@ -5655,6 +5669,21 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
return vi->curr_queue_pairs;
}
+static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
+{
+ struct virtnet_info *vi = netdev_priv(dev);
+
+ switch (info->cmd) {
+ case ETHTOOL_SRXCLSRLINS:
+ return virtnet_ethtool_flow_insert(&vi->ff, &info->fs,
+ vi->curr_queue_pairs);
+ case ETHTOOL_SRXCLSRLDEL:
+ return virtnet_ethtool_flow_remove(&vi->ff, info->fs.location);
+ }
+
+ return -EOPNOTSUPP;
+}
+
static const struct ethtool_ops virtnet_ethtool_ops = {
.supported_coalesce_params = ETHTOOL_COALESCE_MAX_FRAMES |
ETHTOOL_COALESCE_USECS | ETHTOOL_COALESCE_USE_ADAPTIVE_RX,
@@ -5681,6 +5710,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
.get_rxfh_fields = virtnet_get_hashflow,
.set_rxfh_fields = virtnet_set_hashflow,
.get_rx_ring_count = virtnet_get_rx_ring_count,
+ .set_rxnfc = virtnet_set_rxnfc,
};
static void virtnet_get_queue_stats_rx(struct net_device *dev, int i,
@@ -5778,6 +5808,429 @@ static const struct netdev_stat_ops virtnet_stat_ops = {
.get_base_stats = virtnet_get_base_stats,
};
+struct virtnet_ethtool_rule {
+ struct ethtool_rx_flow_spec flow_spec;
+ u32 classifier_id;
+};
+
+/* The classifier struct must be the last field in this struct */
+struct virtnet_classifier {
+ size_t size;
+ u32 id;
+ struct virtio_net_resource_obj_ff_classifier classifier;
+};
+
+static_assert(sizeof(struct virtnet_classifier) ==
+ ALIGN(offsetofend(struct virtnet_classifier, classifier),
+ __alignof__(struct virtnet_classifier)),
+ "virtnet_classifier: classifier must be the last member");
+
+static bool check_mask_vs_cap(const void *m, const void *c,
+ u16 len, bool partial)
+{
+ const u8 *mask = m;
+ const u8 *cap = c;
+ int i;
+
+ for (i = 0; i < len; i++) {
+ if (partial && ((mask[i] & cap[i]) != mask[i]))
+ return false;
+ if (!partial && mask[i] != cap[i])
+ return false;
+ }
+
+ return true;
+}
+
+static
+struct virtio_net_ff_selector *get_selector_cap(const struct virtnet_ff *ff,
+ u8 selector_type)
+{
+ struct virtio_net_ff_selector *sel;
+ void *buf;
+ int i;
+
+ buf = &ff->ff_mask->selectors;
+ sel = buf;
+
+ for (i = 0; i < ff->ff_mask->count; i++) {
+ if (sel->type == selector_type)
+ return sel;
+
+ buf += sizeof(struct virtio_net_ff_selector) + sel->length;
+ sel = buf;
+ }
+
+ return NULL;
+}
+
+static bool validate_eth_mask(const struct virtnet_ff *ff,
+ const struct virtio_net_ff_selector *sel,
+ const struct virtio_net_ff_selector *sel_cap)
+{
+ bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+ struct ethhdr *cap, *mask;
+ struct ethhdr zeros = {};
+
+ cap = (struct ethhdr *)&sel_cap->mask;
+ mask = (struct ethhdr *)&sel->mask;
+
+ if (memcmp(&zeros.h_dest, mask->h_dest, sizeof(zeros.h_dest)) &&
+ !check_mask_vs_cap(mask->h_dest, cap->h_dest,
+ sizeof(mask->h_dest), partial_mask))
+ return false;
+
+ if (memcmp(&zeros.h_source, mask->h_source, sizeof(zeros.h_source)) &&
+ !check_mask_vs_cap(mask->h_source, cap->h_source,
+ sizeof(mask->h_source), partial_mask))
+ return false;
+
+ if (mask->h_proto &&
+ !check_mask_vs_cap(&mask->h_proto, &cap->h_proto,
+ sizeof(__be16), partial_mask))
+ return false;
+
+ return true;
+}
+
+static bool validate_mask(const struct virtnet_ff *ff,
+ const struct virtio_net_ff_selector *sel)
+{
+ struct virtio_net_ff_selector *sel_cap = get_selector_cap(ff, sel->type);
+
+ if (!sel_cap)
+ return false;
+
+ switch (sel->type) {
+ case VIRTIO_NET_FF_MASK_TYPE_ETH:
+ return validate_eth_mask(ff, sel, sel_cap);
+ }
+
+ return false;
+}
+
+static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
+{
+ int err;
+
+ err = xa_alloc(&ff->classifiers, &c->id, c,
+ XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
+ GFP_KERNEL);
+ if (err)
+ return err;
+
+ err = virtio_admin_obj_create(ff->vdev,
+ VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
+ c->id,
+ VIRTIO_ADMIN_GROUP_TYPE_SELF,
+ 0,
+ &c->classifier,
+ c->size);
+ if (err)
+ goto err_xarray;
+
+ return 0;
+
+err_xarray:
+ xa_erase(&ff->classifiers, c->id);
+
+ return err;
+}
+
+static void destroy_classifier(struct virtnet_ff *ff,
+ u32 classifier_id)
+{
+ struct virtnet_classifier *c;
+
+ c = xa_load(&ff->classifiers, classifier_id);
+ if (c) {
+ virtio_admin_obj_destroy(ff->vdev,
+ VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
+ c->id,
+ VIRTIO_ADMIN_GROUP_TYPE_SELF,
+ 0);
+
+ xa_erase(&ff->classifiers, c->id);
+ kfree(c);
+ }
+}
+
+static void destroy_ethtool_rule(struct virtnet_ff *ff,
+ struct virtnet_ethtool_rule *eth_rule)
+{
+ ff->ethtool.num_rules--;
+
+ virtio_admin_obj_destroy(ff->vdev,
+ VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
+ eth_rule->flow_spec.location,
+ VIRTIO_ADMIN_GROUP_TYPE_SELF,
+ 0);
+
+ xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
+ destroy_classifier(ff, eth_rule->classifier_id);
+ kfree(eth_rule);
+}
+
+static int insert_rule(struct virtnet_ff *ff,
+ struct virtnet_ethtool_rule *eth_rule,
+ u32 classifier_id,
+ const u8 *key,
+ u8 key_size)
+{
+ struct ethtool_rx_flow_spec *fs = &eth_rule->flow_spec;
+ struct virtio_net_resource_obj_ff_rule *ff_rule;
+ int err;
+
+ ff_rule = kzalloc(sizeof(*ff_rule) + key_size, GFP_KERNEL);
+ if (!ff_rule)
+ return -ENOMEM;
+
+ /* Intentionally leave the priority as 0. All rules have the same
+ * priority.
+ */
+ ff_rule->group_id = cpu_to_le32(VIRTNET_FF_ETHTOOL_GROUP_PRIORITY);
+ ff_rule->classifier_id = cpu_to_le32(classifier_id);
+ ff_rule->key_length = key_size;
+ ff_rule->action = fs->ring_cookie == RX_CLS_FLOW_DISC ?
+ VIRTIO_NET_FF_ACTION_DROP :
+ VIRTIO_NET_FF_ACTION_RX_VQ;
+ ff_rule->vq_index = fs->ring_cookie != RX_CLS_FLOW_DISC ?
+ cpu_to_le16(fs->ring_cookie) : 0;
+ memcpy(&ff_rule->keys, key, key_size);
+
+ err = virtio_admin_obj_create(ff->vdev,
+ VIRTIO_NET_RESOURCE_OBJ_FF_RULE,
+ fs->location,
+ VIRTIO_ADMIN_GROUP_TYPE_SELF,
+ 0,
+ ff_rule,
+ sizeof(*ff_rule) + key_size);
+ if (err)
+ goto err_ff_rule;
+
+ eth_rule->classifier_id = classifier_id;
+ ff->ethtool.num_rules++;
+ kfree(ff_rule);
+ kfree(key);
+
+ return 0;
+
+err_ff_rule:
+ kfree(ff_rule);
+
+ return err;
+}
+
+static u32 flow_type_mask(u32 flow_type)
+{
+ return flow_type & ~(FLOW_EXT | FLOW_MAC_EXT | FLOW_RSS);
+}
+
+static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
+{
+ switch (fs->flow_type) {
+ case ETHER_FLOW:
+ return true;
+ }
+
+ return false;
+}
+
+static int validate_flow_input(struct virtnet_ff *ff,
+ const struct ethtool_rx_flow_spec *fs,
+ u16 curr_queue_pairs)
+{
+ /* Force users to use RX_CLS_LOC_ANY - don't allow specific locations */
+ if (fs->location != RX_CLS_LOC_ANY)
+ return -EOPNOTSUPP;
+
+ if (fs->ring_cookie != RX_CLS_FLOW_DISC &&
+ fs->ring_cookie >= curr_queue_pairs)
+ return -EINVAL;
+
+ if (fs->flow_type != flow_type_mask(fs->flow_type))
+ return -EOPNOTSUPP;
+
+ if (!supported_flow_type(fs))
+ return -EOPNOTSUPP;
+
+ return 0;
+}
+
+static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
+ u8 *key_size, size_t *classifier_size,
+ int *num_hdrs)
+{
+ *num_hdrs = 1;
+ *key_size = sizeof(struct ethhdr);
+ /*
+ * The classifier size is the size of the classifier header, a selector
+ * header for each type of header in the match criteria, and each header
+ * providing the mask for matching against.
+ */
+ *classifier_size = *key_size +
+ sizeof(struct virtio_net_resource_obj_ff_classifier) +
+ sizeof(struct virtio_net_ff_selector) * (*num_hdrs);
+}
+
+static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
+ u8 *key,
+ const struct ethtool_rx_flow_spec *fs)
+{
+ struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
+ struct ethhdr *eth_k = (struct ethhdr *)key;
+
+ selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
+ selector->length = sizeof(struct ethhdr);
+
+ memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
+ memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
+}
+
+static int
+validate_classifier_selectors(struct virtnet_ff *ff,
+ struct virtio_net_resource_obj_ff_classifier *classifier,
+ int num_hdrs)
+{
+ struct virtio_net_ff_selector *selector = (void *)classifier->selectors;
+ int i;
+
+ for (i = 0; i < num_hdrs; i++) {
+ if (!validate_mask(ff, selector))
+ return -EINVAL;
+
+ selector = (((void *)selector) + sizeof(*selector) +
+ selector->length);
+ }
+
+ return 0;
+}
+
+static int build_and_insert(struct virtnet_ff *ff,
+ struct virtnet_ethtool_rule *eth_rule)
+{
+ struct virtio_net_resource_obj_ff_classifier *classifier;
+ struct ethtool_rx_flow_spec *fs = ð_rule->flow_spec;
+ struct virtio_net_ff_selector *selector;
+ struct virtnet_classifier *c;
+ size_t classifier_size;
+ int num_hdrs;
+ u8 key_size;
+ u8 *key;
+ int err;
+
+ calculate_flow_sizes(fs, &key_size, &classifier_size, &num_hdrs);
+
+ key = kzalloc(key_size, GFP_KERNEL);
+ if (!key)
+ return -ENOMEM;
+
+ /*
+ * virtio_net_resource_obj_ff_classifier is already included in the
+ * classifier_size.
+ */
+ c = kzalloc(classifier_size +
+ sizeof(struct virtnet_classifier) -
+ sizeof(struct virtio_net_resource_obj_ff_classifier),
+ GFP_KERNEL);
+ if (!c) {
+ kfree(key);
+ return -ENOMEM;
+ }
+
+ c->size = classifier_size;
+ classifier = &c->classifier;
+ classifier->count = num_hdrs;
+ selector = (void *)&classifier->selectors[0];
+
+ setup_eth_hdr_key_mask(selector, key, fs);
+
+ err = validate_classifier_selectors(ff, classifier, num_hdrs);
+ if (err)
+ goto err_key;
+
+ err = setup_classifier(ff, c);
+ if (err)
+ goto err_classifier;
+
+ err = insert_rule(ff, eth_rule, c->id, key, key_size);
+ if (err) {
+ /* destroy_classifier will free the classifier */
+ destroy_classifier(ff, c->id);
+ goto err_key;
+ }
+
+ return 0;
+
+err_classifier:
+ kfree(c);
+err_key:
+ kfree(key);
+
+ return err;
+}
+
+static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
+ struct ethtool_rx_flow_spec *fs,
+ u16 curr_queue_pairs)
+{
+ struct virtnet_ethtool_rule *eth_rule;
+ int err;
+
+ if (!ff->ff_supported)
+ return -EOPNOTSUPP;
+
+ err = validate_flow_input(ff, fs, curr_queue_pairs);
+ if (err)
+ return err;
+
+ eth_rule = kzalloc(sizeof(*eth_rule), GFP_KERNEL);
+ if (!eth_rule)
+ return -ENOMEM;
+
+ err = xa_alloc(&ff->ethtool.rules, &fs->location, eth_rule,
+ XA_LIMIT(0, le32_to_cpu(ff->ff_caps->rules_limit) - 1),
+ GFP_KERNEL);
+ if (err)
+ goto err_rule;
+
+ eth_rule->flow_spec = *fs;
+
+ err = build_and_insert(ff, eth_rule);
+ if (err)
+ goto err_xa;
+
+ return err;
+
+err_xa:
+ xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
+
+err_rule:
+ fs->location = RX_CLS_LOC_ANY;
+ kfree(eth_rule);
+
+ return err;
+}
+
+static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
+{
+ struct virtnet_ethtool_rule *eth_rule;
+ int err = 0;
+
+ if (!ff->ff_supported)
+ return -EOPNOTSUPP;
+
+ eth_rule = xa_load(&ff->ethtool.rules, location);
+ if (!eth_rule) {
+ err = -ENOENT;
+ goto out;
+ }
+
+ destroy_ethtool_rule(ff, eth_rule);
+out:
+ return err;
+}
+
static size_t get_mask_size(u16 type)
{
switch (type) {
@@ -5945,6 +6398,8 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
if (err)
goto err_ff_action;
+ xa_init_flags(&ff->classifiers, XA_FLAGS_ALLOC);
+ xa_init_flags(&ff->ethtool.rules, XA_FLAGS_ALLOC);
ff->vdev = vdev;
ff->ff_supported = true;
@@ -5969,9 +6424,18 @@ static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
static void virtnet_ff_cleanup(struct virtnet_ff *ff)
{
+ struct virtnet_ethtool_rule *eth_rule;
+ unsigned long i;
+
if (!ff->ff_supported)
return;
+ xa_for_each(&ff->ethtool.rules, i, eth_rule)
+ destroy_ethtool_rule(ff, eth_rule);
+
+ xa_destroy(&ff->ethtool.rules);
+ xa_destroy(&ff->classifiers);
+
virtio_admin_obj_destroy(ff->vdev,
VIRTIO_NET_RESOURCE_OBJ_FF_GROUP,
VIRTNET_FF_ETHTOOL_GROUP_PRIORITY,
diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
index 5883fdf4d37c..1afe69105076 100644
--- a/include/uapi/linux/virtio_net_ff.h
+++ b/include/uapi/linux/virtio_net_ff.h
@@ -12,6 +12,8 @@
#define VIRTIO_NET_FF_ACTION_CAP 0x802
#define VIRTIO_NET_RESOURCE_OBJ_FF_GROUP 0x0200
+#define VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER 0x0201
+#define VIRTIO_NET_RESOURCE_OBJ_FF_RULE 0x0202
/**
* struct virtio_net_ff_cap_data - Flow filter resource capability limits
@@ -100,4 +102,52 @@ struct virtio_net_resource_obj_ff_group {
__le16 group_priority;
};
+/**
+ * struct virtio_net_resource_obj_ff_classifier - Flow filter classifier object
+ * @count: number of selector entries in @selectors
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @selectors: array of selector descriptors that define match masks
+ *
+ * Payload for the VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER administrative object.
+ * Each selector describes a header mask used to match packets
+ * (see struct virtio_net_ff_selector). Selectors appear in the order they are
+ * to be applied.
+ */
+struct virtio_net_resource_obj_ff_classifier {
+ __u8 count;
+ __u8 reserved[7];
+ __u8 selectors[];
+};
+
+/**
+ * struct virtio_net_resource_obj_ff_rule - Flow filter rule object
+ * @group_id: identifier of the target flow filter group
+ * @classifier_id: identifier of the classifier referenced by this rule
+ * @rule_priority: relative priority of this rule within the group
+ * @key_length: number of bytes in @keys
+ * @action: action to perform, one of VIRTIO_NET_FF_ACTION_*
+ * @reserved: must be set to 0 by the driver and ignored by the device
+ * @vq_index: RX virtqueue index for VIRTIO_NET_FF_ACTION_RX_VQ, 0 otherwise
+ * @reserved1: must be set to 0 by the driver and ignored by the device
+ * @keys: concatenated key bytes matching the classifier's selectors order
+ *
+ * Payload for the VIRTIO_NET_RESOURCE_OBJ_FF_RULE administrative object.
+ * @group_id and @classifier_id refer to previously created objects of types
+ * VIRTIO_NET_RESOURCE_OBJ_FF_GROUP and VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER
+ * respectively. The key bytes are compared against packet headers using the
+ * masks provided by the classifier's selectors. Multi-byte fields are
+ * little-endian.
+ */
+struct virtio_net_resource_obj_ff_rule {
+ __le32 group_id;
+ __le32 classifier_id;
+ __u8 rule_priority;
+ __u8 key_length; /* length of key in bytes */
+ __u8 action;
+ __u8 reserved;
+ __le16 vq_index;
+ __u8 reserved1[2];
+ __u8 keys[];
+};
+
#endif
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 08/12] virtio_net: Use existing classifier if possible
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (6 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-24 22:04 ` Michael S. Tsirkin
2025-11-19 19:15 ` [PATCH net-next v12 09/12] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
` (4 subsequent siblings)
12 siblings, 1 reply; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
Classifiers can be used by more than one rule. If there is an existing
classifier, use it instead of creating a new one. If duplicate
classifiers are created, it would artificially limit the number of rules
to the classifier limit, which is likely less than the rules limit.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4:
- Fixed typo in commit message
- for (int -> for (
v8:
- Removed unused num_classifiers. Jason Wang
v12:
- Clarified comment about destroy_classifier freeing. MST
- Renamed the classifier field of virtnet_classifier to obj. MST
- Explained why in commit message. MST
---
drivers/net/virtio_net.c | 51 ++++++++++++++++++++++++++--------------
1 file changed, 34 insertions(+), 17 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 7600e2383a72..5e49cd78904f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -32,6 +32,7 @@
#include <uapi/linux/virtio_pci.h>
#include <uapi/linux/virtio_net_ff.h>
#include <linux/xarray.h>
+#include <linux/refcount.h>
static int napi_weight = NAPI_POLL_WEIGHT;
module_param(napi_weight, int, 0444);
@@ -302,7 +303,6 @@ struct virtnet_ff {
struct virtio_net_ff_cap_mask_data *ff_mask;
struct virtio_net_ff_actions *ff_actions;
struct xarray classifiers;
- int num_classifiers;
struct virtnet_ethtool_ff ethtool;
};
@@ -5816,12 +5816,13 @@ struct virtnet_ethtool_rule {
/* The classifier struct must be the last field in this struct */
struct virtnet_classifier {
size_t size;
+ refcount_t refcount;
u32 id;
- struct virtio_net_resource_obj_ff_classifier classifier;
+ struct virtio_net_resource_obj_ff_classifier obj;
};
static_assert(sizeof(struct virtnet_classifier) ==
- ALIGN(offsetofend(struct virtnet_classifier, classifier),
+ ALIGN(offsetofend(struct virtnet_classifier, obj),
__alignof__(struct virtnet_classifier)),
"virtnet_classifier: classifier must be the last member");
@@ -5909,11 +5910,24 @@ static bool validate_mask(const struct virtnet_ff *ff,
return false;
}
-static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
+static int setup_classifier(struct virtnet_ff *ff,
+ struct virtnet_classifier **c)
{
+ struct virtnet_classifier *tmp;
+ unsigned long i;
int err;
- err = xa_alloc(&ff->classifiers, &c->id, c,
+ xa_for_each(&ff->classifiers, i, tmp) {
+ if ((*c)->size == tmp->size &&
+ !memcmp(&tmp->obj, &(*c)->obj, tmp->size)) {
+ refcount_inc(&tmp->refcount);
+ kfree(*c);
+ *c = tmp;
+ goto out;
+ }
+ }
+
+ err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
GFP_KERNEL);
if (err)
@@ -5921,29 +5935,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
err = virtio_admin_obj_create(ff->vdev,
VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
- c->id,
+ (*c)->id,
VIRTIO_ADMIN_GROUP_TYPE_SELF,
0,
- &c->classifier,
- c->size);
+ &(*c)->obj,
+ (*c)->size);
if (err)
goto err_xarray;
+ refcount_set(&(*c)->refcount, 1);
+out:
return 0;
err_xarray:
- xa_erase(&ff->classifiers, c->id);
+ xa_erase(&ff->classifiers, (*c)->id);
return err;
}
-static void destroy_classifier(struct virtnet_ff *ff,
- u32 classifier_id)
+static void try_destroy_classifier(struct virtnet_ff *ff, u32 classifier_id)
{
struct virtnet_classifier *c;
c = xa_load(&ff->classifiers, classifier_id);
- if (c) {
+ if (c && refcount_dec_and_test(&c->refcount)) {
virtio_admin_obj_destroy(ff->vdev,
VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
c->id,
@@ -5967,7 +5982,7 @@ static void destroy_ethtool_rule(struct virtnet_ff *ff,
0);
xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
- destroy_classifier(ff, eth_rule->classifier_id);
+ try_destroy_classifier(ff, eth_rule->classifier_id);
kfree(eth_rule);
}
@@ -6139,7 +6154,7 @@ static int build_and_insert(struct virtnet_ff *ff,
}
c->size = classifier_size;
- classifier = &c->classifier;
+ classifier = &c->obj;
classifier->count = num_hdrs;
selector = (void *)&classifier->selectors[0];
@@ -6149,14 +6164,16 @@ static int build_and_insert(struct virtnet_ff *ff,
if (err)
goto err_key;
- err = setup_classifier(ff, c);
+ err = setup_classifier(ff, &c);
if (err)
goto err_classifier;
err = insert_rule(ff, eth_rule, c->id, key, key_size);
if (err) {
- /* destroy_classifier will free the classifier */
- destroy_classifier(ff, c->id);
+ /* try_destroy_classifier releases the reference on the classifier
+ * and frees it if needed.
+ */
+ try_destroy_classifier(ff, c->id);
goto err_key;
}
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 09/12] virtio_net: Implement IPv4 ethtool flow rules
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (7 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-24 21:51 ` Michael S. Tsirkin
2025-11-19 19:15 ` [PATCH net-next v12 10/12] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
` (3 subsequent siblings)
12 siblings, 1 reply; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
Add support for IP_USER_FLOW type rules from ethtool.
Example:
$ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action -1
Added rule with ID 1
The example rule will drop packets with the source IP specified.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4:
- Fixed bug in protocol check of parse_ip4
- (u8 *) to (void *) casting.
- Alignment issues.
v12
- refactor calculate_flow_sizes to remove goto. MST
- refactor build_and_insert to remove goto validate. MST
- Move parse_ip4 l3_mask check to TCP/UDP patch. MST
- Check saddr/daddr mask before copying in parse_ip4. MST
- Remove tos check in setup_ip_key_mask.
- check l4_4_bytes mask is 0 in setup_ip_key_mask. MST
- changed return of setup_ip_key_mask to -EINVAL.
- BUG_ON if key overflows u8 size in calculate_flow_sizes. MST
---
drivers/net/virtio_net.c | 119 +++++++++++++++++++++++++++++++++++++--
1 file changed, 113 insertions(+), 6 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 5e49cd78904f..b0b9972fe624 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -5894,6 +5894,34 @@ static bool validate_eth_mask(const struct virtnet_ff *ff,
return true;
}
+static bool validate_ip4_mask(const struct virtnet_ff *ff,
+ const struct virtio_net_ff_selector *sel,
+ const struct virtio_net_ff_selector *sel_cap)
+{
+ bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+ struct iphdr *cap, *mask;
+
+ cap = (struct iphdr *)&sel_cap->mask;
+ mask = (struct iphdr *)&sel->mask;
+
+ if (mask->saddr &&
+ !check_mask_vs_cap(&mask->saddr, &cap->saddr,
+ sizeof(__be32), partial_mask))
+ return false;
+
+ if (mask->daddr &&
+ !check_mask_vs_cap(&mask->daddr, &cap->daddr,
+ sizeof(__be32), partial_mask))
+ return false;
+
+ if (mask->protocol &&
+ !check_mask_vs_cap(&mask->protocol, &cap->protocol,
+ sizeof(u8), partial_mask))
+ return false;
+
+ return true;
+}
+
static bool validate_mask(const struct virtnet_ff *ff,
const struct virtio_net_ff_selector *sel)
{
@@ -5905,11 +5933,36 @@ static bool validate_mask(const struct virtnet_ff *ff,
switch (sel->type) {
case VIRTIO_NET_FF_MASK_TYPE_ETH:
return validate_eth_mask(ff, sel, sel_cap);
+
+ case VIRTIO_NET_FF_MASK_TYPE_IPV4:
+ return validate_ip4_mask(ff, sel, sel_cap);
}
return false;
}
+static void parse_ip4(struct iphdr *mask, struct iphdr *key,
+ const struct ethtool_rx_flow_spec *fs)
+{
+ const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
+ const struct ethtool_usrip4_spec *l3_val = &fs->h_u.usr_ip4_spec;
+
+ if (l3_mask->ip4src) {
+ mask->saddr = l3_mask->ip4src;
+ key->saddr = l3_val->ip4src;
+ }
+
+ if (l3_mask->ip4dst) {
+ mask->daddr = l3_mask->ip4dst;
+ key->daddr = l3_val->ip4dst;
+ }
+}
+
+static bool has_ipv4(u32 flow_type)
+{
+ return flow_type == IP_USER_FLOW;
+}
+
static int setup_classifier(struct virtnet_ff *ff,
struct virtnet_classifier **c)
{
@@ -6045,6 +6098,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
{
switch (fs->flow_type) {
case ETHER_FLOW:
+ case IP_USER_FLOW:
return true;
}
@@ -6076,8 +6130,18 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
u8 *key_size, size_t *classifier_size,
int *num_hdrs)
{
+ size_t size = sizeof(struct ethhdr);
+
*num_hdrs = 1;
- *key_size = sizeof(struct ethhdr);
+
+ if (fs->flow_type != ETHER_FLOW) {
+ ++(*num_hdrs);
+ if (has_ipv4(fs->flow_type))
+ size += sizeof(struct iphdr);
+ }
+
+ BUG_ON(size > 0xff);
+ *key_size = size;
/*
* The classifier size is the size of the classifier header, a selector
* header for each type of header in the match criteria, and each header
@@ -6089,8 +6153,9 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
}
static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
- u8 *key,
- const struct ethtool_rx_flow_spec *fs)
+ u8 *key,
+ const struct ethtool_rx_flow_spec *fs,
+ int num_hdrs)
{
struct ethhdr *eth_m = (struct ethhdr *)&selector->mask;
struct ethhdr *eth_k = (struct ethhdr *)key;
@@ -6098,8 +6163,35 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
selector->type = VIRTIO_NET_FF_MASK_TYPE_ETH;
selector->length = sizeof(struct ethhdr);
- memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
- memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
+ if (num_hdrs > 1) {
+ eth_m->h_proto = cpu_to_be16(0xffff);
+ eth_k->h_proto = cpu_to_be16(ETH_P_IP);
+ } else {
+ memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
+ memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
+ }
+}
+
+static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
+ u8 *key,
+ const struct ethtool_rx_flow_spec *fs)
+{
+ struct iphdr *v4_m = (struct iphdr *)&selector->mask;
+ struct iphdr *v4_k = (struct iphdr *)key;
+
+ selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
+ selector->length = sizeof(struct iphdr);
+
+ if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
+ fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
+ fs->m_u.usr_ip4_spec.l4_4_bytes ||
+ fs->m_u.usr_ip4_spec.ip_ver ||
+ fs->m_u.usr_ip4_spec.proto)
+ return -EINVAL;
+
+ parse_ip4(v4_m, v4_k, fs);
+
+ return 0;
}
static int
@@ -6121,6 +6213,13 @@ validate_classifier_selectors(struct virtnet_ff *ff,
return 0;
}
+static
+struct virtio_net_ff_selector *next_selector(struct virtio_net_ff_selector *sel)
+{
+ return (void *)sel + sizeof(struct virtio_net_ff_selector) +
+ sel->length;
+}
+
static int build_and_insert(struct virtnet_ff *ff,
struct virtnet_ethtool_rule *eth_rule)
{
@@ -6158,7 +6257,15 @@ static int build_and_insert(struct virtnet_ff *ff,
classifier->count = num_hdrs;
selector = (void *)&classifier->selectors[0];
- setup_eth_hdr_key_mask(selector, key, fs);
+ setup_eth_hdr_key_mask(selector, key, fs, num_hdrs);
+
+ if (num_hdrs != 1) {
+ selector = next_selector(selector);
+
+ err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
+ if (err)
+ goto err_classifier;
+ }
err = validate_classifier_selectors(ff, classifier, num_hdrs);
if (err)
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 10/12] virtio_net: Add support for IPv6 ethtool steering
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (8 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 09/12] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-24 21:59 ` Michael S. Tsirkin
2025-11-19 19:15 ` [PATCH net-next v12 11/12] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
` (2 subsequent siblings)
12 siblings, 1 reply; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
Implement support for IPV6_USER_FLOW type rules.
Example:
$ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
Added rule with ID 0
The example rule will forward packets with the specified source and
destination IP addresses to RX ring 3.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: commit message typo
v12:
- refactor calculate_flow_sizes. MST
- Move parse_ip6 l3_mask check to TCP/UDP patch. MST
- Set eth proto to ipv6 as needed. MST
- Also check l4_4_bytes mask is 0 in setup_ip_key_mask. MST
- Remove tclass check in setup_ip_key_mask. If it's not supported it
will be caught in validate_classifier_selectors. MST
- Changed error return in setup_ip_key_mask to -EINVAL
---
drivers/net/virtio_net.c | 92 +++++++++++++++++++++++++++++++++++-----
1 file changed, 82 insertions(+), 10 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b0b9972fe624..bb8ec4265da5 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -5922,6 +5922,34 @@ static bool validate_ip4_mask(const struct virtnet_ff *ff,
return true;
}
+static bool validate_ip6_mask(const struct virtnet_ff *ff,
+ const struct virtio_net_ff_selector *sel,
+ const struct virtio_net_ff_selector *sel_cap)
+{
+ bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+ struct ipv6hdr *cap, *mask;
+
+ cap = (struct ipv6hdr *)&sel_cap->mask;
+ mask = (struct ipv6hdr *)&sel->mask;
+
+ if (!ipv6_addr_any(&mask->saddr) &&
+ !check_mask_vs_cap(&mask->saddr, &cap->saddr,
+ sizeof(cap->saddr), partial_mask))
+ return false;
+
+ if (!ipv6_addr_any(&mask->daddr) &&
+ !check_mask_vs_cap(&mask->daddr, &cap->daddr,
+ sizeof(cap->daddr), partial_mask))
+ return false;
+
+ if (mask->nexthdr &&
+ !check_mask_vs_cap(&mask->nexthdr, &cap->nexthdr,
+ sizeof(cap->nexthdr), partial_mask))
+ return false;
+
+ return true;
+}
+
static bool validate_mask(const struct virtnet_ff *ff,
const struct virtio_net_ff_selector *sel)
{
@@ -5936,6 +5964,9 @@ static bool validate_mask(const struct virtnet_ff *ff,
case VIRTIO_NET_FF_MASK_TYPE_IPV4:
return validate_ip4_mask(ff, sel, sel_cap);
+
+ case VIRTIO_NET_FF_MASK_TYPE_IPV6:
+ return validate_ip6_mask(ff, sel, sel_cap);
}
return false;
@@ -5958,11 +5989,33 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
}
}
+static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
+ const struct ethtool_rx_flow_spec *fs)
+{
+ const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
+ const struct ethtool_usrip6_spec *l3_val = &fs->h_u.usr_ip6_spec;
+
+ if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
+ memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
+ memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
+ }
+
+ if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
+ memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
+ memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
+ }
+}
+
static bool has_ipv4(u32 flow_type)
{
return flow_type == IP_USER_FLOW;
}
+static bool has_ipv6(u32 flow_type)
+{
+ return flow_type == IPV6_USER_FLOW;
+}
+
static int setup_classifier(struct virtnet_ff *ff,
struct virtnet_classifier **c)
{
@@ -6099,6 +6152,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
switch (fs->flow_type) {
case ETHER_FLOW:
case IP_USER_FLOW:
+ case IPV6_USER_FLOW:
return true;
}
@@ -6138,6 +6192,8 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
++(*num_hdrs);
if (has_ipv4(fs->flow_type))
size += sizeof(struct iphdr);
+ else if (has_ipv6(fs->flow_type))
+ size += sizeof(struct ipv6hdr);
}
BUG_ON(size > 0xff);
@@ -6165,7 +6221,10 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
if (num_hdrs > 1) {
eth_m->h_proto = cpu_to_be16(0xffff);
- eth_k->h_proto = cpu_to_be16(ETH_P_IP);
+ if (has_ipv4(fs->flow_type))
+ eth_k->h_proto = cpu_to_be16(ETH_P_IP);
+ else
+ eth_k->h_proto = cpu_to_be16(ETH_P_IPV6);
} else {
memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
@@ -6176,20 +6235,33 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
u8 *key,
const struct ethtool_rx_flow_spec *fs)
{
+ struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
struct iphdr *v4_m = (struct iphdr *)&selector->mask;
+ struct ipv6hdr *v6_k = (struct ipv6hdr *)key;
struct iphdr *v4_k = (struct iphdr *)key;
- selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
- selector->length = sizeof(struct iphdr);
+ if (has_ipv6(fs->flow_type)) {
+ selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
+ selector->length = sizeof(struct ipv6hdr);
- if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
- fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
- fs->m_u.usr_ip4_spec.l4_4_bytes ||
- fs->m_u.usr_ip4_spec.ip_ver ||
- fs->m_u.usr_ip4_spec.proto)
- return -EINVAL;
+ if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
+ fs->m_u.usr_ip6_spec.l4_4_bytes)
+ return -EINVAL;
- parse_ip4(v4_m, v4_k, fs);
+ parse_ip6(v6_m, v6_k, fs);
+ } else {
+ selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
+ selector->length = sizeof(struct iphdr);
+
+ if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
+ fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
+ fs->m_u.usr_ip4_spec.l4_4_bytes ||
+ fs->m_u.usr_ip4_spec.ip_ver ||
+ fs->m_u.usr_ip4_spec.proto)
+ return -EINVAL;
+
+ parse_ip4(v4_m, v4_k, fs);
+ }
return 0;
}
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 11/12] virtio_net: Add support for TCP and UDP ethtool rules
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (9 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 10/12] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-24 22:02 ` Michael S. Tsirkin
2025-11-19 19:15 ` [PATCH net-next v12 12/12] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
2025-11-19 20:22 ` [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Michael S. Tsirkin
12 siblings, 1 reply; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
Implement TCP and UDP V4/V6 ethtool flow types.
Examples:
$ ethtool -U ens9 flow-type udp4 dst-ip 192.168.5.2 dst-port\
4321 action 20
Added rule with ID 4
This example directs IPv4 UDP traffic with the specified address and
port to queue 20.
$ ethtool -U ens9 flow-type tcp6 src-ip 2001:db8::1 src-port 1234 dst-ip\
2001:db8::2 dst-port 4321 action 12
Added rule with ID 5
This example directs IPv6 TCP traffic with the specified address and
port to queue 12.
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: (*num_hdrs)++ to ++(*num_hdrs)
v12:
- Refactor calculate_flow_sizes. MST
- Refactor build_and_insert to remove goto validate. MST
- Move parse_ip4/6 l3_mask check here. MST
---
drivers/net/virtio_net.c | 223 +++++++++++++++++++++++++++++++++++++--
1 file changed, 212 insertions(+), 11 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index bb8ec4265da5..e6c7e8cd4ab4 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -5950,6 +5950,52 @@ static bool validate_ip6_mask(const struct virtnet_ff *ff,
return true;
}
+static bool validate_tcp_mask(const struct virtnet_ff *ff,
+ const struct virtio_net_ff_selector *sel,
+ const struct virtio_net_ff_selector *sel_cap)
+{
+ bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+ struct tcphdr *cap, *mask;
+
+ cap = (struct tcphdr *)&sel_cap->mask;
+ mask = (struct tcphdr *)&sel->mask;
+
+ if (mask->source &&
+ !check_mask_vs_cap(&mask->source, &cap->source,
+ sizeof(cap->source), partial_mask))
+ return false;
+
+ if (mask->dest &&
+ !check_mask_vs_cap(&mask->dest, &cap->dest,
+ sizeof(cap->dest), partial_mask))
+ return false;
+
+ return true;
+}
+
+static bool validate_udp_mask(const struct virtnet_ff *ff,
+ const struct virtio_net_ff_selector *sel,
+ const struct virtio_net_ff_selector *sel_cap)
+{
+ bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
+ struct udphdr *cap, *mask;
+
+ cap = (struct udphdr *)&sel_cap->mask;
+ mask = (struct udphdr *)&sel->mask;
+
+ if (mask->source &&
+ !check_mask_vs_cap(&mask->source, &cap->source,
+ sizeof(cap->source), partial_mask))
+ return false;
+
+ if (mask->dest &&
+ !check_mask_vs_cap(&mask->dest, &cap->dest,
+ sizeof(cap->dest), partial_mask))
+ return false;
+
+ return true;
+}
+
static bool validate_mask(const struct virtnet_ff *ff,
const struct virtio_net_ff_selector *sel)
{
@@ -5967,11 +6013,45 @@ static bool validate_mask(const struct virtnet_ff *ff,
case VIRTIO_NET_FF_MASK_TYPE_IPV6:
return validate_ip6_mask(ff, sel, sel_cap);
+
+ case VIRTIO_NET_FF_MASK_TYPE_TCP:
+ return validate_tcp_mask(ff, sel, sel_cap);
+
+ case VIRTIO_NET_FF_MASK_TYPE_UDP:
+ return validate_udp_mask(ff, sel, sel_cap);
}
return false;
}
+static void set_tcp(struct tcphdr *mask, struct tcphdr *key,
+ __be16 psrc_m, __be16 psrc_k,
+ __be16 pdst_m, __be16 pdst_k)
+{
+ if (psrc_m) {
+ mask->source = psrc_m;
+ key->source = psrc_k;
+ }
+ if (pdst_m) {
+ mask->dest = pdst_m;
+ key->dest = pdst_k;
+ }
+}
+
+static void set_udp(struct udphdr *mask, struct udphdr *key,
+ __be16 psrc_m, __be16 psrc_k,
+ __be16 pdst_m, __be16 pdst_k)
+{
+ if (psrc_m) {
+ mask->source = psrc_m;
+ key->source = psrc_k;
+ }
+ if (pdst_m) {
+ mask->dest = pdst_m;
+ key->dest = pdst_k;
+ }
+}
+
static void parse_ip4(struct iphdr *mask, struct iphdr *key,
const struct ethtool_rx_flow_spec *fs)
{
@@ -5987,6 +6067,11 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
mask->daddr = l3_mask->ip4dst;
key->daddr = l3_val->ip4dst;
}
+
+ if (l3_mask->proto) {
+ mask->protocol = l3_mask->proto;
+ key->protocol = l3_val->proto;
+ }
}
static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
@@ -6004,16 +6089,35 @@ static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
}
+
+ if (l3_mask->l4_proto) {
+ mask->nexthdr = l3_mask->l4_proto;
+ key->nexthdr = l3_val->l4_proto;
+ }
}
static bool has_ipv4(u32 flow_type)
{
- return flow_type == IP_USER_FLOW;
+ return flow_type == TCP_V4_FLOW ||
+ flow_type == UDP_V4_FLOW ||
+ flow_type == IP_USER_FLOW;
}
static bool has_ipv6(u32 flow_type)
{
- return flow_type == IPV6_USER_FLOW;
+ return flow_type == TCP_V6_FLOW ||
+ flow_type == UDP_V6_FLOW ||
+ flow_type == IPV6_USER_FLOW;
+}
+
+static bool has_tcp(u32 flow_type)
+{
+ return flow_type == TCP_V4_FLOW || flow_type == TCP_V6_FLOW;
+}
+
+static bool has_udp(u32 flow_type)
+{
+ return flow_type == UDP_V4_FLOW || flow_type == UDP_V6_FLOW;
}
static int setup_classifier(struct virtnet_ff *ff,
@@ -6153,6 +6257,10 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
case ETHER_FLOW:
case IP_USER_FLOW:
case IPV6_USER_FLOW:
+ case TCP_V4_FLOW:
+ case TCP_V6_FLOW:
+ case UDP_V4_FLOW:
+ case UDP_V6_FLOW:
return true;
}
@@ -6194,6 +6302,12 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
size += sizeof(struct iphdr);
else if (has_ipv6(fs->flow_type))
size += sizeof(struct ipv6hdr);
+
+ if (has_tcp(fs->flow_type) || has_udp(fs->flow_type)) {
+ ++(*num_hdrs);
+ size += has_tcp(fs->flow_type) ? sizeof(struct tcphdr) :
+ sizeof(struct udphdr);
+ }
}
BUG_ON(size > 0xff);
@@ -6233,7 +6347,8 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
u8 *key,
- const struct ethtool_rx_flow_spec *fs)
+ const struct ethtool_rx_flow_spec *fs,
+ int num_hdrs)
{
struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
struct iphdr *v4_m = (struct iphdr *)&selector->mask;
@@ -6244,23 +6359,95 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
selector->length = sizeof(struct ipv6hdr);
- if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
- fs->m_u.usr_ip6_spec.l4_4_bytes)
+ if (num_hdrs == 2 && (fs->h_u.usr_ip6_spec.l4_4_bytes ||
+ fs->m_u.usr_ip6_spec.l4_4_bytes))
return -EINVAL;
parse_ip6(v6_m, v6_k, fs);
+
+ if (num_hdrs > 2) {
+ v6_m->nexthdr = 0xff;
+ if (has_tcp(fs->flow_type))
+ v6_k->nexthdr = IPPROTO_TCP;
+ else
+ v6_k->nexthdr = IPPROTO_UDP;
+ }
} else {
selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
selector->length = sizeof(struct iphdr);
- if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
- fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
- fs->m_u.usr_ip4_spec.l4_4_bytes ||
- fs->m_u.usr_ip4_spec.ip_ver ||
- fs->m_u.usr_ip4_spec.proto)
+ if (num_hdrs == 2 &&
+ (fs->h_u.usr_ip4_spec.l4_4_bytes ||
+ fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
+ fs->m_u.usr_ip4_spec.l4_4_bytes ||
+ fs->m_u.usr_ip4_spec.ip_ver ||
+ fs->m_u.usr_ip4_spec.proto))
return -EINVAL;
parse_ip4(v4_m, v4_k, fs);
+
+ if (num_hdrs > 2) {
+ v4_m->protocol = 0xff;
+ if (has_tcp(fs->flow_type))
+ v4_k->protocol = IPPROTO_TCP;
+ else
+ v4_k->protocol = IPPROTO_UDP;
+ }
+ }
+
+ return 0;
+}
+
+static int setup_transport_key_mask(struct virtio_net_ff_selector *selector,
+ u8 *key,
+ struct ethtool_rx_flow_spec *fs)
+{
+ struct tcphdr *tcp_m = (struct tcphdr *)&selector->mask;
+ struct udphdr *udp_m = (struct udphdr *)&selector->mask;
+ const struct ethtool_tcpip6_spec *v6_l4_mask;
+ const struct ethtool_tcpip4_spec *v4_l4_mask;
+ const struct ethtool_tcpip6_spec *v6_l4_key;
+ const struct ethtool_tcpip4_spec *v4_l4_key;
+ struct tcphdr *tcp_k = (struct tcphdr *)key;
+ struct udphdr *udp_k = (struct udphdr *)key;
+
+ if (has_tcp(fs->flow_type)) {
+ selector->type = VIRTIO_NET_FF_MASK_TYPE_TCP;
+ selector->length = sizeof(struct tcphdr);
+
+ if (has_ipv6(fs->flow_type)) {
+ v6_l4_mask = &fs->m_u.tcp_ip6_spec;
+ v6_l4_key = &fs->h_u.tcp_ip6_spec;
+
+ set_tcp(tcp_m, tcp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
+ v6_l4_mask->pdst, v6_l4_key->pdst);
+ } else {
+ v4_l4_mask = &fs->m_u.tcp_ip4_spec;
+ v4_l4_key = &fs->h_u.tcp_ip4_spec;
+
+ set_tcp(tcp_m, tcp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
+ v4_l4_mask->pdst, v4_l4_key->pdst);
+ }
+
+ } else if (has_udp(fs->flow_type)) {
+ selector->type = VIRTIO_NET_FF_MASK_TYPE_UDP;
+ selector->length = sizeof(struct udphdr);
+
+ if (has_ipv6(fs->flow_type)) {
+ v6_l4_mask = &fs->m_u.udp_ip6_spec;
+ v6_l4_key = &fs->h_u.udp_ip6_spec;
+
+ set_udp(udp_m, udp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
+ v6_l4_mask->pdst, v6_l4_key->pdst);
+ } else {
+ v4_l4_mask = &fs->m_u.udp_ip4_spec;
+ v4_l4_key = &fs->h_u.udp_ip4_spec;
+
+ set_udp(udp_m, udp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
+ v4_l4_mask->pdst, v4_l4_key->pdst);
+ }
+ } else {
+ return -EOPNOTSUPP;
}
return 0;
@@ -6300,6 +6487,7 @@ static int build_and_insert(struct virtnet_ff *ff,
struct virtio_net_ff_selector *selector;
struct virtnet_classifier *c;
size_t classifier_size;
+ size_t key_offset;
int num_hdrs;
u8 key_size;
u8 *key;
@@ -6332,11 +6520,24 @@ static int build_and_insert(struct virtnet_ff *ff,
setup_eth_hdr_key_mask(selector, key, fs, num_hdrs);
if (num_hdrs != 1) {
+ key_offset = selector->length;
selector = next_selector(selector);
- err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
+ err = setup_ip_key_mask(selector, key + key_offset,
+ fs, num_hdrs);
if (err)
goto err_classifier;
+
+ if (num_hdrs > 2) {
+ key_offset += selector->length;
+ selector = next_selector(selector);
+
+ err = setup_transport_key_mask(selector,
+ key + key_offset,
+ fs);
+ if (err)
+ goto err_classifier;
+ }
}
err = validate_classifier_selectors(ff, classifier, num_hdrs);
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* [PATCH net-next v12 12/12] virtio_net: Add get ethtool flow rules ops
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (10 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 11/12] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
@ 2025-11-19 19:15 ` Daniel Jurgens
2025-11-19 20:22 ` [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Michael S. Tsirkin
12 siblings, 0 replies; 40+ messages in thread
From: Daniel Jurgens @ 2025-11-19 19:15 UTC (permalink / raw)
To: netdev, mst, jasowang, pabeni
Cc: virtualization, parav, shshitrit, yohadt, xuanzhuo, eperezma, jgg,
kevin.tian, kuba, andrew+netdev, edumazet, Daniel Jurgens
- Get total number of rules. There's no user interface for this. It is
used to allocate an appropriately sized buffer for getting all the
rules (see the usage sketch after the examples below).
- Get specific rule
$ ethtool -u ens9 rule 0
Filter: 0
Rule Type: UDP over IPv4
Src IP addr: 0.0.0.0 mask: 255.255.255.255
Dest IP addr: 192.168.5.2 mask: 0.0.0.0
TOS: 0x0 mask: 0xff
Src port: 0 mask: 0xffff
Dest port: 4321 mask: 0x0
Action: Direct to queue 16
- Get all rules:
$ ethtool -u ens9
31 RX rings available
Total 2 rules
Filter: 0
Rule Type: UDP over IPv4
Src IP addr: 0.0.0.0 mask: 255.255.255.255
Dest IP addr: 192.168.5.2 mask: 0.0.0.0
...
Filter: 1
Flow Type: Raw Ethernet
Src MAC addr: 00:00:00:00:00:00 mask: FF:FF:FF:FF:FF:FF
Dest MAC addr: 08:11:22:33:44:54 mask: 00:00:00:00:00:00
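For reference, a rough sketch of how user space is expected to combine the
count and get-all operations (illustrative only: dump_rule_ids is a made-up
helper, error handling is omitted, and the real ethtool source is organized
differently):
#include <string.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>
/* fd is an AF_INET (or similar) socket used only to carry the ioctl. */
static void dump_rule_ids(int fd, const char *ifname)
{
	struct ethtool_rxnfc cnt = { .cmd = ETHTOOL_GRXCLSRLCNT };
	struct ethtool_rxnfc *all;
	struct ifreq ifr;
	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
	/* 1. Ask the driver how many rules exist; no ethtool CLI maps to this. */
	ifr.ifr_data = (void *)&cnt;
	ioctl(fd, SIOCETHTOOL, &ifr);		/* driver fills cnt.rule_cnt */
	/* 2. Size a buffer from rule_cnt and fetch every rule location. */
	all = calloc(1, sizeof(*all) + cnt.rule_cnt * sizeof(__u32));
	all->cmd = ETHTOOL_GRXCLSRLALL;
	all->rule_cnt = cnt.rule_cnt;		/* capacity of rule_locs[] */
	ifr.ifr_data = (void *)all;
	ioctl(fd, SIOCETHTOOL, &ifr);		/* rule_locs[] now holds the IDs */
	/* 3. Each ID can then be read back with ETHTOOL_GRXCLSRULE. */
	free(all);
}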
Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
---
v4: Answered questions about rules_limit overflow with no changes.
v12:
- Use and set rule_cnt in virtnet_ethtool_get_all_flows. MST
- Leave rc uninitialized at the top of virtnet_get_rxnfc. MST
---
drivers/net/virtio_net.c | 82 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 82 insertions(+)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e6c7e8cd4ab4..2bd2bf6b754b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -310,6 +310,13 @@ static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
struct ethtool_rx_flow_spec *fs,
u16 curr_queue_pairs);
static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location);
+static int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
+ struct ethtool_rxnfc *info);
+static int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
+ struct ethtool_rxnfc *info);
+static int
+virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
+ struct ethtool_rxnfc *info, u32 *rule_locs);
#define VIRTNET_Q_TYPE_RX 0
#define VIRTNET_Q_TYPE_TX 1
@@ -5669,6 +5676,28 @@ static u32 virtnet_get_rx_ring_count(struct net_device *dev)
return vi->curr_queue_pairs;
}
+static int virtnet_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info, u32 *rule_locs)
+{
+ struct virtnet_info *vi = netdev_priv(dev);
+ int rc;
+
+ switch (info->cmd) {
+ case ETHTOOL_GRXCLSRLCNT:
+ rc = virtnet_ethtool_get_flow_count(&vi->ff, info);
+ break;
+ case ETHTOOL_GRXCLSRULE:
+ rc = virtnet_ethtool_get_flow(&vi->ff, info);
+ break;
+ case ETHTOOL_GRXCLSRLALL:
+ rc = virtnet_ethtool_get_all_flows(&vi->ff, info, rule_locs);
+ break;
+ default:
+ rc = -EOPNOTSUPP;
+ }
+
+ return rc;
+}
+
static int virtnet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *info)
{
struct virtnet_info *vi = netdev_priv(dev);
@@ -5710,6 +5739,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
.get_rxfh_fields = virtnet_get_hashflow,
.set_rxfh_fields = virtnet_set_hashflow,
.get_rx_ring_count = virtnet_get_rx_ring_count,
+ .get_rxnfc = virtnet_get_rxnfc,
.set_rxnfc = virtnet_set_rxnfc,
};
@@ -6628,6 +6658,58 @@ static int virtnet_ethtool_flow_remove(struct virtnet_ff *ff, int location)
return err;
}
+static int virtnet_ethtool_get_flow_count(struct virtnet_ff *ff,
+ struct ethtool_rxnfc *info)
+{
+ if (!ff->ff_supported)
+ return -EOPNOTSUPP;
+
+ info->rule_cnt = ff->ethtool.num_rules;
+ info->data = le32_to_cpu(ff->ff_caps->rules_limit) | RX_CLS_LOC_SPECIAL;
+
+ return 0;
+}
+
+static int virtnet_ethtool_get_flow(struct virtnet_ff *ff,
+ struct ethtool_rxnfc *info)
+{
+ struct virtnet_ethtool_rule *eth_rule;
+
+ if (!ff->ff_supported)
+ return -EOPNOTSUPP;
+
+ eth_rule = xa_load(&ff->ethtool.rules, info->fs.location);
+ if (!eth_rule)
+ return -ENOENT;
+
+ info->fs = eth_rule->flow_spec;
+
+ return 0;
+}
+
+static int
+virtnet_ethtool_get_all_flows(struct virtnet_ff *ff,
+ struct ethtool_rxnfc *info, u32 *rule_locs)
+{
+ struct virtnet_ethtool_rule *eth_rule;
+ unsigned long i = 0;
+ int idx = 0;
+
+ if (!ff->ff_supported)
+ return -EOPNOTSUPP;
+
+ xa_for_each(&ff->ethtool.rules, i, eth_rule) {
+ if (idx == info->rule_cnt)
+ return -EMSGSIZE;
+ rule_locs[idx++] = i;
+ }
+
+ info->data = le32_to_cpu(ff->ff_caps->rules_limit);
+ info->rule_cnt = idx;
+
+ return 0;
+}
+
static size_t get_mask_size(u16 type)
{
switch (type) {
--
2.50.1
^ permalink raw reply related [flat|nested] 40+ messages in thread
* Re: [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
` (11 preceding siblings ...)
2025-11-19 19:15 ` [PATCH net-next v12 12/12] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
@ 2025-11-19 20:22 ` Michael S. Tsirkin
12 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-19 20:22 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:11PM -0600, Daniel Jurgens wrote:
> This series implements ethtool flow rules support for virtio_net using the
> virtio flow filter (FF) specification. The implementation allows users to
> configure packet filtering rules through ethtool commands, directing
> packets to specific receive queues, or dropping them based on various
> header fields.
traveling, will review after tuesday. thanks!
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps
2025-11-19 19:15 ` [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
@ 2025-11-20 1:51 ` Jakub Kicinski
2025-11-20 15:39 ` Dan Jurgens
2025-11-24 21:01 ` Michael S. Tsirkin
2025-11-24 22:54 ` Michael S. Tsirkin
2 siblings, 1 reply; 40+ messages in thread
From: Jakub Kicinski @ 2025-11-20 1:51 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, mst, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, andrew+netdev,
edumazet
On Wed, 19 Nov 2025 13:15:16 -0600 Daniel Jurgens wrote:
> +/**
> + * struct virtio_net_ff_cap_data - Flow filter resource capability limits
> + * @groups_limit: maximum number of flow filter groups supported by the device
> + * @classifiers_limit: maximum number of classifiers supported by the device
> + * @rules_limit: maximum number of rules supported device-wide across all groups
> + * @rules_per_group_limit: maximum number of rules allowed in a single group
> + * @last_rule_priority: priority value associated with the lowest-priority rule
> + * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
> + */
> +struct virtio_net_ff_cap_data {
> + __le32 groups_limit;
> + __le32 classifiers_limit;
> + __le32 rules_limit;
> + __le32 rules_per_group_limit;
> + __u8 last_rule_priority;
> + __u8 selectors_per_classifier_limit;
pop in a :
/* private: */
comment here, otherwise kdoc will complain that @reserved is
undocumented.
> + __u8 reserved[2];
> +};
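i.e. with the marker added, the struct would end up looking roughly like:
struct virtio_net_ff_cap_data {
	__le32 groups_limit;
	__le32 classifiers_limit;
	__le32 rules_limit;
	__le32 rules_per_group_limit;
	__u8 last_rule_priority;
	__u8 selectors_per_classifier_limit;
	/* private: */
	__u8 reserved[2];
};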
That said, if you don't mind pls wait for Michael's review with
the repost. Unless someone else provides review comments first.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps
2025-11-20 1:51 ` Jakub Kicinski
@ 2025-11-20 15:39 ` Dan Jurgens
0 siblings, 0 replies; 40+ messages in thread
From: Dan Jurgens @ 2025-11-20 15:39 UTC (permalink / raw)
To: Jakub Kicinski
Cc: netdev, mst, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, andrew+netdev,
edumazet
On 11/19/25 7:51 PM, Jakub Kicinski wrote:
> On Wed, 19 Nov 2025 13:15:16 -0600 Daniel Jurgens wrote:
>> +/**
>> + * struct virtio_net_ff_cap_data - Flow filter resource capability limits
>> + * @groups_limit: maximum number of flow filter groups supported by the device
>> + * @classifiers_limit: maximum number of classifiers supported by the device
>> + * @rules_limit: maximum number of rules supported device-wide across all groups
>> + * @rules_per_group_limit: maximum number of rules allowed in a single group
>> + * @last_rule_priority: priority value associated with the lowest-priority rule
>> + * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
>> + */
>> +struct virtio_net_ff_cap_data {
>> + __le32 groups_limit;
>> + __le32 classifiers_limit;
>> + __le32 rules_limit;
>> + __le32 rules_per_group_limit;
>> + __u8 last_rule_priority;
>> + __u8 selectors_per_classifier_limit;
>
> pop in a :
>
> /* private: */
>
> comment here, otherwise kdoc will complain that @reserved is
> undocumented.
>
>> + __u8 reserved[2];
>> +};
>
> That said, if you don't mind pls wait for Michael's review with
> the repost. Unless someone else provides review comments first.
Thanks, I have it queued up for the next version.
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH net-next v12 03/12] virtio: Expose generic device capability operations
2025-11-19 19:15 ` [PATCH net-next v12 03/12] virtio: Expose generic device capability operations Daniel Jurgens
@ 2025-11-24 20:30 ` Michael S. Tsirkin
2025-11-24 22:24 ` Dan Jurgens
0 siblings, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 20:30 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:14PM -0600, Daniel Jurgens wrote:
> Currently querying and setting capabilities is restricted to a single
> capability and contained within the virtio PCI driver. However, each
> device type has generic and device specific capabilities, that may be
> queried and set. In subsequent patches virtio_net will query and set
> flow filter capabilities.
>
> This changes the size of virtio_admin_cmd_query_cap_id_result. It's safe
> to do because this data is written by DMA, so a newer controller can't
> overrun the size on an older kernel.
>
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>
> ---
> v4: Moved this logic from virtio_pci_modern to new file
> virtio_admin_commands.
>
> v12:
> - Removed uapi virtio_pci include in virtio_admin.h. MST
> - Added virtio_pci uapi include to virtio_admin_commands.c
> - Put () around cap in macro. MST
> - Removed nonsense comment above VIRTIO_ADMIN_MAX_CAP. MST
> - +1 VIRTIO_ADMIN_MAX_CAP when calculating array size. MST
> - Updated commit message
> ---
> drivers/virtio/Makefile | 2 +-
> drivers/virtio/virtio_admin_commands.c | 91 ++++++++++++++++++++++++++
> include/linux/virtio_admin.h | 80 ++++++++++++++++++++++
> include/uapi/linux/virtio_pci.h | 6 +-
> 4 files changed, 176 insertions(+), 3 deletions(-)
> create mode 100644 drivers/virtio/virtio_admin_commands.c
> create mode 100644 include/linux/virtio_admin.h
>
> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> index eefcfe90d6b8..2b4a204dde33 100644
> --- a/drivers/virtio/Makefile
> +++ b/drivers/virtio/Makefile
> @@ -1,5 +1,5 @@
> # SPDX-License-Identifier: GPL-2.0
> -obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
> +obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o virtio_admin_commands.o
> obj-$(CONFIG_VIRTIO_ANCHOR) += virtio_anchor.o
> obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
> obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
> diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
> new file mode 100644
> index 000000000000..a2254e71e8dc
> --- /dev/null
> +++ b/drivers/virtio/virtio_admin_commands.c
> @@ -0,0 +1,91 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +
> +#include <linux/virtio.h>
> +#include <linux/virtio_config.h>
> +#include <linux/virtio_admin.h>
> +#include <uapi/linux/virtio_pci.h>
> +
> +int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
> + struct virtio_admin_cmd_query_cap_id_result *data)
> +{
> + struct virtio_admin_cmd cmd = {};
> + struct scatterlist result_sg;
> +
> + if (!vdev->config->admin_cmd_exec)
> + return -EOPNOTSUPP;
> +
> + sg_init_one(&result_sg, data, sizeof(*data));
> + cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_CAP_ID_LIST_QUERY);
> + cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
> + cmd.result_sg = &result_sg;
> +
> + return vdev->config->admin_cmd_exec(vdev, &cmd);
> +}
> +EXPORT_SYMBOL_GPL(virtio_admin_cap_id_list_query);
> +
> +int virtio_admin_cap_get(struct virtio_device *vdev,
> + u16 id,
> + void *caps,
> + size_t cap_size)
I still don't get why cap_size needs to be as large as size_t.
if you don't care what its size is, just say "unsigned".
or u8 as a hint to users it's a small value.
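i.e. something along the lines of (illustrative only):
int virtio_admin_cap_get(struct virtio_device *vdev, u16 id,
			 void *caps, unsigned int cap_size);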
> +{
> + struct virtio_admin_cmd_cap_get_data *data;
> + struct virtio_admin_cmd cmd = {};
> + struct scatterlist result_sg;
> + struct scatterlist data_sg;
> + int err;
> +
> + if (!vdev->config->admin_cmd_exec)
> + return -EOPNOTSUPP;
> +
> + data = kzalloc(sizeof(*data), GFP_KERNEL);
uses kzalloc without including linux/slab.h
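i.e. the new file presumably wants an explicit
#include <linux/slab.h>
near its other includes rather than relying on it being pulled in indirectly.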
> + if (!data)
> + return -ENOMEM;
> +
> + data->id = cpu_to_le16(id);
> + sg_init_one(&data_sg, data, sizeof(*data));
> + sg_init_one(&result_sg, caps, cap_size);
> + cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DEVICE_CAP_GET);
> + cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
> + cmd.data_sg = &data_sg;
> + cmd.result_sg = &result_sg;
> +
> + err = vdev->config->admin_cmd_exec(vdev, &cmd);
> + kfree(data);
> +
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(virtio_admin_cap_get);
> +
> +int virtio_admin_cap_set(struct virtio_device *vdev,
> + u16 id,
> + const void *caps,
> + size_t cap_size)
> +{
> + struct virtio_admin_cmd_cap_set_data *data;
> + struct virtio_admin_cmd cmd = {};
> + struct scatterlist data_sg;
> + size_t data_size;
> + int err;
> +
> + if (!vdev->config->admin_cmd_exec)
> + return -EOPNOTSUPP;
> +
> + data_size = sizeof(*data) + cap_size;
> + data = kzalloc(data_size, GFP_KERNEL);
> + if (!data)
> + return -ENOMEM;
> +
> + data->id = cpu_to_le16(id);
> + memcpy(data->cap_specific_data, caps, cap_size);
> + sg_init_one(&data_sg, data, data_size);
> + cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_DRIVER_CAP_SET);
> + cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
> + cmd.data_sg = &data_sg;
> + cmd.result_sg = NULL;
> +
> + err = vdev->config->admin_cmd_exec(vdev, &cmd);
> + kfree(data);
> +
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(virtio_admin_cap_set);
> diff --git a/include/linux/virtio_admin.h b/include/linux/virtio_admin.h
> new file mode 100644
> index 000000000000..4ab84d53c924
> --- /dev/null
> +++ b/include/linux/virtio_admin.h
> @@ -0,0 +1,80 @@
> +/* SPDX-License-Identifier: GPL-2.0-only
> + *
> + * Header file for virtio admin operations
> + */
> +
> +#ifndef _LINUX_VIRTIO_ADMIN_H
> +#define _LINUX_VIRTIO_ADMIN_H
> +
> +struct virtio_device;
> +struct virtio_admin_cmd_query_cap_id_result;
> +
> +/**
> + * VIRTIO_CAP_IN_LIST - Check if a capability is supported in the capability list
> + * @cap_list: Pointer to capability list structure containing supported_caps array
> + * @cap: Capability ID to check
> + *
> + * The cap_list contains a supported_caps array of little-endian 64-bit integers
> + * where each bit represents a capability. Bit 0 of the first element represents
> + * capability ID 0, bit 1 represents capability ID 1, and so on.
> + *
> + * Return: 1 if capability is supported, 0 otherwise
> + */
> +#define VIRTIO_CAP_IN_LIST(cap_list, cap) \
> + (!!(1 & (le64_to_cpu(cap_list->supported_caps[(cap) / 64]) >> (cap) % 64)))
> +
> +/**
> + * virtio_admin_cap_id_list_query - Query the list of available capability IDs
> + * @vdev: The virtio device to query
> + * @data: Pointer to result structure (must be heap allocated)
> + *
> + * This function queries the virtio device for the list of available capability
> + * IDs that can be used with virtio_admin_cap_get() and virtio_admin_cap_set().
> + * The result is stored in the provided data structure.
> + *
> + * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
> + * operations or capability queries, or a negative error code on other failures.
> + */
> +int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
> + struct virtio_admin_cmd_query_cap_id_result *data);
> +
> +/**
> + * virtio_admin_cap_get - Get capability data for a specific capability ID
> + * @vdev: The virtio device
> + * @id: Capability ID to retrieve
> + * @caps: Pointer to capability data structure (must be heap allocated)
> + * @cap_size: Size of the capability data structure
> + *
> + * This function retrieves a specific capability from the virtio device.
> + * The capability data is stored in the provided buffer. The caller must
> + * ensure the buffer is large enough to hold the capability data.
> + *
> + * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
> + * operations or capability retrieval, or a negative error code on other failures.
> + */
> +int virtio_admin_cap_get(struct virtio_device *vdev,
> + u16 id,
> + void *caps,
> + size_t cap_size);
> +
> +/**
> + * virtio_admin_cap_set - Set capability data for a specific capability ID
> + * @vdev: The virtio device
> + * @id: Capability ID to set
> + * @caps: Pointer to capability data structure (must be heap allocated)
> + * @cap_size: Size of the capability data structure
> + *
> + * This function sets a specific capability on the virtio device.
> + * The capability data is read from the provided buffer and applied
> + * to the device. The device may validate the capability data before
> + * applying it.
> + *
> + * Return: 0 on success, -EOPNOTSUPP if the device doesn't support admin
> + * operations or capability setting, or a negative error code on other failures.
> + */
> +int virtio_admin_cap_set(struct virtio_device *vdev,
> + u16 id,
> + const void *caps,
> + size_t cap_size);
> +
> +#endif /* _LINUX_VIRTIO_ADMIN_H */
> diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
> index c691ac210ce2..2e35fd8d4a95 100644
> --- a/include/uapi/linux/virtio_pci.h
> +++ b/include/uapi/linux/virtio_pci.h
> @@ -315,15 +315,17 @@ struct virtio_admin_cmd_notify_info_result {
>
> #define VIRTIO_DEV_PARTS_CAP 0x0000
>
> +#define VIRTIO_ADMIN_MAX_CAP 0x0fff
> +
> struct virtio_dev_parts_cap {
> __u8 get_parts_resource_objects_limit;
> __u8 set_parts_resource_objects_limit;
> };
>
> -#define MAX_CAP_ID __KERNEL_DIV_ROUND_UP(VIRTIO_DEV_PARTS_CAP + 1, 64)
> +#define VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE __KERNEL_DIV_ROUND_UP(VIRTIO_ADMIN_MAX_CAP + 1, 64)
>
> struct virtio_admin_cmd_query_cap_id_result {
> - __le64 supported_caps[MAX_CAP_ID];
> + __le64 supported_caps[VIRTIO_ADMIN_CAP_ID_ARRAY_SIZE];
> };
>
> struct virtio_admin_cmd_cap_get_data {
> --
> 2.50.1
^ permalink raw reply [flat|nested] 40+ messages in thread
* Re: [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps
2025-11-19 19:15 ` [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
2025-11-20 1:51 ` Jakub Kicinski
@ 2025-11-24 21:01 ` Michael S. Tsirkin
2025-11-25 0:05 ` Dan Jurgens
2025-11-24 22:54 ` Michael S. Tsirkin
2 siblings, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 21:01 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:16PM -0600, Daniel Jurgens wrote:
> When probing a virtnet device, attempt to read the flow filter
> capabilities. In order to use the feature the caps must also
> be set. For now setting what was read is sufficient.
>
> This patch adds UAPI definitions for virtio_net flow filters defined in
> version 1.4 of the VirtIO spec.
>
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
>
> ---
> v4:
> - Validate the length in the selector caps
> - Removed __free usage.
> - Removed for(int.
> v5:
> - Remove unneed () after MAX_SEL_LEN macro (test bot)
> v6:
> - Fix sparse warning "array of flexible structures" Jakub K/Simon H
> - Use new variable and validate ff_mask_size before set_cap. MST
> v7:
> - Set ff->ff_{caps, mask, actions} NULL in error path. Paolo Abeni
> - Return errors from virtnet_ff_init, -ENOTSUPP is not fatal. Xuan
>
> v8:
> - Use real_ff_mask_size when setting the selector caps. Jason Wang
>
> v9:
> - Set err after failed memory allocations. Simon Horman
>
> v10:
> - Return -EOPNOTSUPP in virtnet_ff_init before allocating any memory.
> Jason/Paolo.
>
> v11:
> - Return -EINVAL if any resource limit is 0. Simon Horman
> - Ensure we don't overrun allocated space of ff->ff_mask by moving the
> real_ff_mask_size > ff_mask_size check into the loop. Simon Horman
>
> v12:
> - Move uapi includes to virtio_net.c vs header file. MST
> - Remove kernel.h header in virtio_net_ff uapi. MST
> - WARN_ON_ONCE in error paths validating selectors. MST
> - Move includes from .h to .c files. MST
> - Add WARN_ON_ONCE if obj_destroy fails. MST
> - Comment cleanup in virtio_net_ff.h uapi. MST
> - Add 2 byte pad to the end of virtio_net_ff_cap_data.
> https://lore.kernel.org/virtio-comment/20251119044029-mutt-send-email-mst@kernel.org/T/#m930988a5d3db316c68546d8b61f4b94f6ebda030
> - Cleanup and reinit in the freeze/restore path. MST
> ---
> drivers/net/virtio_net.c | 221 +++++++++++++++++++++++++
> drivers/virtio/virtio_admin_commands.c | 2 +
> include/uapi/linux/virtio_net_ff.h | 88 ++++++++++
> 3 files changed, 311 insertions(+)
> create mode 100644 include/uapi/linux/virtio_net_ff.h
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index cfa006b88688..2d5c1bff879a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -26,6 +26,11 @@
> #include <net/netdev_rx_queue.h>
> #include <net/netdev_queues.h>
> #include <net/xdp_sock_drv.h>
> +#include <linux/virtio_admin.h>
> +#include <net/ipv6.h>
> +#include <net/ip.h>
> +#include <uapi/linux/virtio_pci.h>
> +#include <uapi/linux/virtio_net_ff.h>
>
> static int napi_weight = NAPI_POLL_WEIGHT;
> module_param(napi_weight, int, 0444);
> @@ -281,6 +286,14 @@ static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc_qstat[] = {
> VIRTNET_STATS_DESC_TX_QSTAT(speed, ratelimit_packets, hw_drop_ratelimits),
> };
>
> +struct virtnet_ff {
> + struct virtio_device *vdev;
> + bool ff_supported;
> + struct virtio_net_ff_cap_data *ff_caps;
> + struct virtio_net_ff_cap_mask_data *ff_mask;
> + struct virtio_net_ff_actions *ff_actions;
> +};
> +
> #define VIRTNET_Q_TYPE_RX 0
> #define VIRTNET_Q_TYPE_TX 1
> #define VIRTNET_Q_TYPE_CQ 2
> @@ -493,6 +506,8 @@ struct virtnet_info {
> struct failover *failover;
>
> u64 device_stats_cap;
> +
> + struct virtnet_ff ff;
> };
>
> struct padded_vnet_hdr {
> @@ -5760,6 +5775,186 @@ static const struct netdev_stat_ops virtnet_stat_ops = {
> .get_base_stats = virtnet_get_base_stats,
> };
>
> +static size_t get_mask_size(u16 type)
> +{
> + switch (type) {
> + case VIRTIO_NET_FF_MASK_TYPE_ETH:
> + return sizeof(struct ethhdr);
> + case VIRTIO_NET_FF_MASK_TYPE_IPV4:
> + return sizeof(struct iphdr);
> + case VIRTIO_NET_FF_MASK_TYPE_IPV6:
> + return sizeof(struct ipv6hdr);
> + case VIRTIO_NET_FF_MASK_TYPE_TCP:
> + return sizeof(struct tcphdr);
> + case VIRTIO_NET_FF_MASK_TYPE_UDP:
> + return sizeof(struct udphdr);
> + }
> +
> + return 0;
> +}
> +
> +#define MAX_SEL_LEN (sizeof(struct ipv6hdr))
> +
> +static int virtnet_ff_init(struct virtnet_ff *ff, struct virtio_device *vdev)
> +{
> + size_t ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data) +
> + sizeof(struct virtio_net_ff_selector) *
> + VIRTIO_NET_FF_MASK_TYPE_MAX;
> + struct virtio_admin_cmd_query_cap_id_result *cap_id_list;
> + struct virtio_net_ff_selector *sel;
> + size_t real_ff_mask_size;
> + int err;
> + int i;
> +
> + if (!vdev->config->admin_cmd_exec)
> + return -EOPNOTSUPP;
> +
> + cap_id_list = kzalloc(sizeof(*cap_id_list), GFP_KERNEL);
> + if (!cap_id_list)
> + return -ENOMEM;
> +
> + err = virtio_admin_cap_id_list_query(vdev, cap_id_list);
> + if (err)
> + goto err_cap_list;
> +
> + if (!(VIRTIO_CAP_IN_LIST(cap_id_list,
> + VIRTIO_NET_FF_RESOURCE_CAP) &&
> + VIRTIO_CAP_IN_LIST(cap_id_list,
> + VIRTIO_NET_FF_SELECTOR_CAP) &&
> + VIRTIO_CAP_IN_LIST(cap_id_list,
> + VIRTIO_NET_FF_ACTION_CAP))) {
> + err = -EOPNOTSUPP;
> + goto err_cap_list;
> + }
> +
> + ff->ff_caps = kzalloc(sizeof(*ff->ff_caps), GFP_KERNEL);
> + if (!ff->ff_caps) {
> + err = -ENOMEM;
> + goto err_cap_list;
> + }
> +
> + err = virtio_admin_cap_get(vdev,
> + VIRTIO_NET_FF_RESOURCE_CAP,
> + ff->ff_caps,
> + sizeof(*ff->ff_caps));
> +
> + if (err)
> + goto err_ff;
> +
> + if (!ff->ff_caps->groups_limit ||
> + !ff->ff_caps->classifiers_limit ||
> + !ff->ff_caps->rules_limit ||
> + !ff->ff_caps->rules_per_group_limit) {
> + err = -EINVAL;
> + goto err_ff;
> + }
> +
> + /* VIRTIO_NET_FF_MASK_TYPE start at 1 */
> + for (i = 1; i <= VIRTIO_NET_FF_MASK_TYPE_MAX; i++)
> + ff_mask_size += get_mask_size(i);
> +
> + ff->ff_mask = kzalloc(ff_mask_size, GFP_KERNEL);
> + if (!ff->ff_mask) {
> + err = -ENOMEM;
> + goto err_ff;
> + }
> +
> + err = virtio_admin_cap_get(vdev,
> + VIRTIO_NET_FF_SELECTOR_CAP,
> + ff->ff_mask,
> + ff_mask_size);
> +
> + if (err)
> + goto err_ff_mask;
> +
> + ff->ff_actions = kzalloc(sizeof(*ff->ff_actions) +
> + VIRTIO_NET_FF_ACTION_MAX,
> + GFP_KERNEL);
> + if (!ff->ff_actions) {
> + err = -ENOMEM;
> + goto err_ff_mask;
> + }
> +
> + err = virtio_admin_cap_get(vdev,
> + VIRTIO_NET_FF_ACTION_CAP,
> + ff->ff_actions,
> + sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
> +
> + if (err)
> + goto err_ff_action;
> +
> + err = virtio_admin_cap_set(vdev,
> + VIRTIO_NET_FF_RESOURCE_CAP,
> + ff->ff_caps,
> + sizeof(*ff->ff_caps));
> + if (err)
> + goto err_ff_action;
> +
> + real_ff_mask_size = sizeof(struct virtio_net_ff_cap_mask_data);
> + sel = (void *)&ff->ff_mask->selectors;
> +
> + for (i = 0; i < ff->ff_mask->count; i++) {
> + if (sel->length > MAX_SEL_LEN) {
> + WARN_ON_ONCE(true);
> + err = -EINVAL;
> + goto err_ff_action;
> + }
> + real_ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
> + if (real_ff_mask_size > ff_mask_size) {
> + WARN_ON_ONCE(true);
> + err = -EINVAL;
> + goto err_ff_action;
> + }
> + sel = (void *)sel + sizeof(*sel) + sel->length;
> + }
I am trying to figure out whether this is safe with a buggy/malicious
device which passes count > VIRTIO_NET_FF_MASK_TYPE_MAX.
In fact, what if a future device supports more types?
There does not need to be a negotiation about what the driver
needs, right?
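
Just to illustrate the kind of bound being asked about - an untested
sketch, reusing the names and the err_ff_action label from the quoted
code - something before the loop like:

        if (ff->ff_mask->count > VIRTIO_NET_FF_MASK_TYPE_MAX) {
                WARN_ON_ONCE(true);
                err = -EINVAL;
                goto err_ff_action;
        }

would at least keep the walk from trusting a count the buffer was never
sized for. Whether that is the right policy for future devices with more
selector types is the open question.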
> +
> + err = virtio_admin_cap_set(vdev,
> + VIRTIO_NET_FF_SELECTOR_CAP,
> + ff->ff_mask,
> + real_ff_mask_size);
> + if (err)
> + goto err_ff_action;
> +
> + err = virtio_admin_cap_set(vdev,
> + VIRTIO_NET_FF_ACTION_CAP,
> + ff->ff_actions,
> + sizeof(*ff->ff_actions) + VIRTIO_NET_FF_ACTION_MAX);
> + if (err)
> + goto err_ff_action;
> +
> + ff->vdev = vdev;
> + ff->ff_supported = true;
> +
> + kfree(cap_id_list);
> +
> + return 0;
> +
> +err_ff_action:
> + kfree(ff->ff_actions);
> + ff->ff_actions = NULL;
> +err_ff_mask:
> + kfree(ff->ff_mask);
> + ff->ff_mask = NULL;
> +err_ff:
> + kfree(ff->ff_caps);
> + ff->ff_caps = NULL;
> +err_cap_list:
> + kfree(cap_id_list);
> +
> + return err;
> +}
> +
> +static void virtnet_ff_cleanup(struct virtnet_ff *ff)
> +{
> + if (!ff->ff_supported)
> + return;
> +
> + kfree(ff->ff_actions);
> + kfree(ff->ff_mask);
> + kfree(ff->ff_caps);
> + ff->ff_supported = false;
> +}
> +
> static void virtnet_freeze_down(struct virtio_device *vdev)
> {
> struct virtnet_info *vi = vdev->priv;
> @@ -5778,6 +5973,10 @@ static void virtnet_freeze_down(struct virtio_device *vdev)
> netif_tx_lock_bh(vi->dev);
> netif_device_detach(vi->dev);
> netif_tx_unlock_bh(vi->dev);
> +
> + rtnl_lock();
> + virtnet_ff_cleanup(&vi->ff);
> + rtnl_unlock();
> }
>
> static int init_vqs(struct virtnet_info *vi);
> @@ -5804,6 +6003,17 @@ static int virtnet_restore_up(struct virtio_device *vdev)
> return err;
> }
>
> + /* Initialize flow filters. Not supported is an acceptable and common
> + * return code
> + */
> + rtnl_lock();
> + err = virtnet_ff_init(&vi->ff, vi->vdev);
> + if (err && err != -EOPNOTSUPP) {
> + rtnl_unlock();
> + return err;
> + }
> + rtnl_unlock();
> +
> netif_tx_lock_bh(vi->dev);
> netif_device_attach(vi->dev);
> netif_tx_unlock_bh(vi->dev);
> @@ -7137,6 +7347,15 @@ static int virtnet_probe(struct virtio_device *vdev)
> }
> vi->guest_offloads_capable = vi->guest_offloads;
>
> + /* Initialize flow filters. Not supported is an acceptable and common
> + * return code
> + */
> + err = virtnet_ff_init(&vi->ff, vi->vdev);
> + if (err && err != -EOPNOTSUPP) {
> + rtnl_unlock();
> + goto free_unregister_netdev;
> + }
> +
> rtnl_unlock();
>
> err = virtnet_cpu_notif_add(vi);
> @@ -7152,6 +7371,7 @@ static int virtnet_probe(struct virtio_device *vdev)
>
> free_unregister_netdev:
> unregister_netdev(dev);
> + virtnet_ff_cleanup(&vi->ff);
> free_failover:
> net_failover_destroy(vi->failover);
> free_vqs:
> @@ -7201,6 +7421,7 @@ static void virtnet_remove(struct virtio_device *vdev)
> virtnet_free_irq_moder(vi);
>
> unregister_netdev(vi->dev);
> + virtnet_ff_cleanup(&vi->ff);
>
> net_failover_destroy(vi->failover);
>
> diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
> index 4738ffe3b5c6..e84a305d2b2a 100644
> --- a/drivers/virtio/virtio_admin_commands.c
> +++ b/drivers/virtio/virtio_admin_commands.c
> @@ -161,6 +161,8 @@ int virtio_admin_obj_destroy(struct virtio_device *vdev,
> err = vdev->config->admin_cmd_exec(vdev, &cmd);
> kfree(data);
>
> + WARN_ON_ONCE(err);
> +
> return err;
> }
> EXPORT_SYMBOL_GPL(virtio_admin_obj_destroy);
> diff --git a/include/uapi/linux/virtio_net_ff.h b/include/uapi/linux/virtio_net_ff.h
> new file mode 100644
> index 000000000000..1debcf595bdb
> --- /dev/null
> +++ b/include/uapi/linux/virtio_net_ff.h
> @@ -0,0 +1,88 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
> + *
> + * Header file for virtio_net flow filters
> + */
> +#ifndef _LINUX_VIRTIO_NET_FF_H
> +#define _LINUX_VIRTIO_NET_FF_H
> +
> +#include <linux/types.h>
> +
> +#define VIRTIO_NET_FF_RESOURCE_CAP 0x800
> +#define VIRTIO_NET_FF_SELECTOR_CAP 0x801
> +#define VIRTIO_NET_FF_ACTION_CAP 0x802
> +
> +/**
> + * struct virtio_net_ff_cap_data - Flow filter resource capability limits
> + * @groups_limit: maximum number of flow filter groups supported by the device
> + * @classifiers_limit: maximum number of classifiers supported by the device
> + * @rules_limit: maximum number of rules supported device-wide across all groups
> + * @rules_per_group_limit: maximum number of rules allowed in a single group
> + * @last_rule_priority: priority value associated with the lowest-priority rule
> + * @selectors_per_classifier_limit: maximum selectors allowed in one classifier
> + */
> +struct virtio_net_ff_cap_data {
> + __le32 groups_limit;
> + __le32 classifiers_limit;
> + __le32 rules_limit;
> + __le32 rules_per_group_limit;
> + __u8 last_rule_priority;
> + __u8 selectors_per_classifier_limit;
> + __u8 reserved[2];
> +};
> +
> +/**
> + * struct virtio_net_ff_selector - Selector mask descriptor
> + * @type: selector type, one of VIRTIO_NET_FF_MASK_TYPE_* constants
> + * @flags: selector flags, see VIRTIO_NET_FF_MASK_F_* constants
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @length: size in bytes of @mask
> + * @reserved1: must be set to 0 by the driver and ignored by the device
> + * @mask: variable-length mask payload for @type, length given by @length
> + *
> + * A selector describes a header mask that a classifier can apply. The format
> + * of @mask depends on @type.
> + */
> +struct virtio_net_ff_selector {
> + __u8 type;
> + __u8 flags;
> + __u8 reserved[2];
> + __u8 length;
> + __u8 reserved1[3];
> + __u8 mask[];
> +};
> +
> +#define VIRTIO_NET_FF_MASK_TYPE_ETH 1
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV4 2
> +#define VIRTIO_NET_FF_MASK_TYPE_IPV6 3
> +#define VIRTIO_NET_FF_MASK_TYPE_TCP 4
> +#define VIRTIO_NET_FF_MASK_TYPE_UDP 5
> +#define VIRTIO_NET_FF_MASK_TYPE_MAX VIRTIO_NET_FF_MASK_TYPE_UDP
> +
> +/**
> + * struct virtio_net_ff_cap_mask_data - Supported selector mask formats
> + * @count: number of entries in @selectors
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @selectors: packed array of struct virtio_net_ff_selectors.
> + */
> +struct virtio_net_ff_cap_mask_data {
> + __u8 count;
> + __u8 reserved[7];
> + __u8 selectors[];
> +};
> +#define VIRTIO_NET_FF_MASK_F_PARTIAL_MASK (1 << 0)
> +
> +#define VIRTIO_NET_FF_ACTION_DROP 1
> +#define VIRTIO_NET_FF_ACTION_RX_VQ 2
> +#define VIRTIO_NET_FF_ACTION_MAX VIRTIO_NET_FF_ACTION_RX_VQ
> +/**
> + * struct virtio_net_ff_actions - Supported flow actions
> + * @count: number of supported actions in @actions
> + * @reserved: must be set to 0 by the driver and ignored by the device
> + * @actions: array of action identifiers (VIRTIO_NET_FF_ACTION_*)
> + */
> +struct virtio_net_ff_actions {
> + __u8 count;
> + __u8 reserved[7];
> + __u8 actions[];
> +};
> +#endif
> --
> 2.50.1
* Re: [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules
2025-11-19 19:15 ` [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
@ 2025-11-24 21:05 ` Michael S. Tsirkin
2025-11-26 16:25 ` Dan Jurgens
2025-11-25 14:25 ` Michael S. Tsirkin
1 sibling, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 21:05 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:18PM -0600, Daniel Jurgens wrote:
> @@ -5681,6 +5710,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
> .get_rxfh_fields = virtnet_get_hashflow,
> .set_rxfh_fields = virtnet_set_hashflow,
> .get_rx_ring_count = virtnet_get_rx_ring_count,
> + .set_rxnfc = virtnet_set_rxnfc,
> };
>
> static void virtnet_get_queue_stats_rx(struct net_device *dev, int i,
Should we not wire up get_rxnfc too? It seems weird to be able to set but
not get.
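
For example, something along these lines (only a sketch; virtnet_get_rxnfc
is an assumed name here, not code from this series, and the rule dump
itself may only land in a later patch):

        static int virtnet_get_rxnfc(struct net_device *dev,
                                     struct ethtool_rxnfc *info, u32 *rule_locs)
        {
                switch (info->cmd) {
                case ETHTOOL_GRXCLSRLCNT:
                        /* e.g. report the number of rules in ff->ethtool.rules */
                        info->rule_cnt = 0;
                        return 0;
                default:
                        return -EOPNOTSUPP;
                }
        }

        ...
                .get_rxnfc = virtnet_get_rxnfc,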
* Re: [PATCH net-next v12 09/12] virtio_net: Implement IPv4 ethtool flow rules
2025-11-19 19:15 ` [PATCH net-next v12 09/12] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
@ 2025-11-24 21:51 ` Michael S. Tsirkin
2025-11-24 22:41 ` Dan Jurgens
2025-11-26 5:48 ` Dan Jurgens
0 siblings, 2 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 21:51 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:20PM -0600, Daniel Jurgens wrote:
> Add support for IP_USER type rules from ethtool.
>
> Example:
> $ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action -1
> Added rule with ID 1
>
> The example rule will drop packets with the source IP specified.
>
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4:
> - Fixed bug in protocol check of parse_ip4
> - (u8 *) to (void *) casting.
> - Alignment issues.
>
> v12
> - refactor calculate_flow_sizes to remove goto. MST
> - refactor build_and_insert to remove goto validate. MST
> - Move parse_ip4 l3_mask check to TCP/UDP patch. MST
> - Check saddr/daddr mask before copying in parse_ip4. MST
> - Remove tos check in setup_ip_key_mask.
So if a user attempts to set a filter by tos now, what blocks it?
Because parse_ip4 seems to ignore it ...
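
A sketch of the kind of explicit rejection one would expect somewhere in
setup_ip_key_mask (untested, -EINVAL chosen to match the other field
checks in the quoted code):

        if (fs->m_u.usr_ip4_spec.tos)
                return -EINVAL;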
> - check l4_4_bytes mask is 0 in setup_ip_key_mask. MST
> - changed return of setup_ip_key_mask to -EINVAL.
> - BUG_ON if key overflows u8 size in calculate_flow_sizes. MST
> ---
> ---
> drivers/net/virtio_net.c | 119 +++++++++++++++++++++++++++++++++++++--
> 1 file changed, 113 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 5e49cd78904f..b0b9972fe624 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -5894,6 +5894,34 @@ static bool validate_eth_mask(const struct virtnet_ff *ff,
> return true;
> }
>
> +static bool validate_ip4_mask(const struct virtnet_ff *ff,
> + const struct virtio_net_ff_selector *sel,
> + const struct virtio_net_ff_selector *sel_cap)
> +{
> + bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> + struct iphdr *cap, *mask;
> +
> + cap = (struct iphdr *)&sel_cap->mask;
> + mask = (struct iphdr *)&sel->mask;
> +
> + if (mask->saddr &&
> + !check_mask_vs_cap(&mask->saddr, &cap->saddr,
> + sizeof(__be32), partial_mask))
> + return false;
> +
> + if (mask->daddr &&
> + !check_mask_vs_cap(&mask->daddr, &cap->daddr,
> + sizeof(__be32), partial_mask))
> + return false;
> +
> + if (mask->protocol &&
> + !check_mask_vs_cap(&mask->protocol, &cap->protocol,
> + sizeof(u8), partial_mask))
> + return false;
> +
> + return true;
> +}
> +
> static bool validate_mask(const struct virtnet_ff *ff,
> const struct virtio_net_ff_selector *sel)
> {
> @@ -5905,11 +5933,36 @@ static bool validate_mask(const struct virtnet_ff *ff,
> switch (sel->type) {
> case VIRTIO_NET_FF_MASK_TYPE_ETH:
> return validate_eth_mask(ff, sel, sel_cap);
> +
> + case VIRTIO_NET_FF_MASK_TYPE_IPV4:
> + return validate_ip4_mask(ff, sel, sel_cap);
> }
>
> return false;
> }
>
> +static void parse_ip4(struct iphdr *mask, struct iphdr *key,
> + const struct ethtool_rx_flow_spec *fs)
> +{
> + const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
> + const struct ethtool_usrip4_spec *l3_val = &fs->h_u.usr_ip4_spec;
> +
> + if (mask->saddr) {
> + mask->saddr = l3_mask->ip4src;
> + key->saddr = l3_val->ip4src;
> + }
So if mask->saddr is already set you over-write it?
But what sets it? Don't you really mean l3_mask->ip4src, maybe?
> +
> + if (mask->daddr) {
> + mask->daddr = l3_mask->ip4dst;
> + key->daddr = l3_val->ip4dst;
> + }
> +}
Same question.
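
If the intent is to key off the user-supplied mask, an untested sketch of
what both blocks would presumably look like (same variable names as the
quoted code):

        if (l3_mask->ip4src) {
                mask->saddr = l3_mask->ip4src;
                key->saddr = l3_val->ip4src;
        }

        if (l3_mask->ip4dst) {
                mask->daddr = l3_mask->ip4dst;
                key->daddr = l3_val->ip4dst;
        }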
* Re: [PATCH net-next v12 10/12] virtio_net: Add support for IPv6 ethtool steering
2025-11-19 19:15 ` [PATCH net-next v12 10/12] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
@ 2025-11-24 21:59 ` Michael S. Tsirkin
2025-11-24 23:04 ` Dan Jurgens
0 siblings, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 21:59 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:21PM -0600, Daniel Jurgens wrote:
> Implement support for IPV6_USER_FLOW type rules.
>
> Example:
> $ ethtool -U ens9 flow-type ip6 src-ip fe80::2 dst-ip fe80::4 action 3
> Added rule with ID 0
>
> The example rule will forward packets with the specified source and
> destination IP addresses to RX ring 3.
>
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4: commit message typo
>
> v12:
> - refactor calculate_flow_sizes. MST
> - Move parse_ip6 l3_mask check to TCP/UDP patch. MST
> - Set eth proto to ipv6 as needed. MST
> - Also check l4_4_bytes mask is 0 in setup_ip_key_mask. MST
> - Remove tclass check in setup_ip_key_mask. If it's not suppored it
> will be caught in validate_classifier_selectors. MST
> - Changed error return in setup_ip_key_mask to -EINVAL
> ---
> ---
> drivers/net/virtio_net.c | 92 +++++++++++++++++++++++++++++++++++-----
> 1 file changed, 82 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index b0b9972fe624..bb8ec4265da5 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -5922,6 +5922,34 @@ static bool validate_ip4_mask(const struct virtnet_ff *ff,
> return true;
> }
>
> +static bool validate_ip6_mask(const struct virtnet_ff *ff,
> + const struct virtio_net_ff_selector *sel,
> + const struct virtio_net_ff_selector *sel_cap)
> +{
> + bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> + struct ipv6hdr *cap, *mask;
> +
> + cap = (struct ipv6hdr *)&sel_cap->mask;
> + mask = (struct ipv6hdr *)&sel->mask;
> +
> + if (!ipv6_addr_any(&mask->saddr) &&
> + !check_mask_vs_cap(&mask->saddr, &cap->saddr,
> + sizeof(cap->saddr), partial_mask))
> + return false;
> +
> + if (!ipv6_addr_any(&mask->daddr) &&
> + !check_mask_vs_cap(&mask->daddr, &cap->daddr,
> + sizeof(cap->daddr), partial_mask))
> + return false;
> +
> + if (mask->nexthdr &&
> + !check_mask_vs_cap(&mask->nexthdr, &cap->nexthdr,
> + sizeof(cap->nexthdr), partial_mask))
> + return false;
> +
> + return true;
> +}
> +
> static bool validate_mask(const struct virtnet_ff *ff,
> const struct virtio_net_ff_selector *sel)
> {
> @@ -5936,6 +5964,9 @@ static bool validate_mask(const struct virtnet_ff *ff,
>
> case VIRTIO_NET_FF_MASK_TYPE_IPV4:
> return validate_ip4_mask(ff, sel, sel_cap);
> +
> + case VIRTIO_NET_FF_MASK_TYPE_IPV6:
> + return validate_ip6_mask(ff, sel, sel_cap);
> }
>
> return false;
> @@ -5958,11 +5989,33 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
> }
> }
>
> +static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
> + const struct ethtool_rx_flow_spec *fs)
> +{
I note that, logic-wise, this is different from ipv4: it is looking at the fs.
> + const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
> + const struct ethtool_usrip6_spec *l3_val = &fs->h_u.usr_ip6_spec;
> +
> + if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
> + memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
> + memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
> + }
> +
> + if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
> + memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
> + memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
> + }
Is this enough?
For example, what if a user tries to set up a filter by l4_proto?
> +}
> +
> static bool has_ipv4(u32 flow_type)
> {
> return flow_type == IP_USER_FLOW;
> }
>
> +static bool has_ipv6(u32 flow_type)
> +{
> + return flow_type == IPV6_USER_FLOW;
> +}
> +
> static int setup_classifier(struct virtnet_ff *ff,
> struct virtnet_classifier **c)
> {
> @@ -6099,6 +6152,7 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
> switch (fs->flow_type) {
> case ETHER_FLOW:
> case IP_USER_FLOW:
> + case IPV6_USER_FLOW:
> return true;
> }
>
> @@ -6138,6 +6192,8 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
> ++(*num_hdrs);
> if (has_ipv4(fs->flow_type))
> size += sizeof(struct iphdr);
> + else if (has_ipv6(fs->flow_type))
> + size += sizeof(struct ipv6hdr);
> }
>
> BUG_ON(size > 0xff);
> @@ -6165,7 +6221,10 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
>
> if (num_hdrs > 1) {
> eth_m->h_proto = cpu_to_be16(0xffff);
> - eth_k->h_proto = cpu_to_be16(ETH_P_IP);
> + if (has_ipv4(fs->flow_type))
> + eth_k->h_proto = cpu_to_be16(ETH_P_IP);
> + else
> + eth_k->h_proto = cpu_to_be16(ETH_P_IPV6);
> } else {
> memcpy(eth_m, &fs->m_u.ether_spec, sizeof(*eth_m));
> memcpy(eth_k, &fs->h_u.ether_spec, sizeof(*eth_k));
> @@ -6176,20 +6235,33 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
> u8 *key,
> const struct ethtool_rx_flow_spec *fs)
> {
> + struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
> struct iphdr *v4_m = (struct iphdr *)&selector->mask;
> + struct ipv6hdr *v6_k = (struct ipv6hdr *)key;
> struct iphdr *v4_k = (struct iphdr *)key;
>
> - selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> - selector->length = sizeof(struct iphdr);
> + if (has_ipv6(fs->flow_type)) {
> + selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
> + selector->length = sizeof(struct ipv6hdr);
>
> - if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> - fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
> - fs->m_u.usr_ip4_spec.l4_4_bytes ||
> - fs->m_u.usr_ip4_spec.ip_ver ||
> - fs->m_u.usr_ip4_spec.proto)
> - return -EINVAL;
> + if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
> + fs->m_u.usr_ip6_spec.l4_4_bytes)
> + return -EINVAL;
>
> - parse_ip4(v4_m, v4_k, fs);
> + parse_ip6(v6_m, v6_k, fs);
Why does ipv6 not check unsupported fields, unlike ipv4?
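
For symmetry with the ipv4 branch, an untested sketch of what that could
look like (field names from struct ethtool_usrip6_spec):

        if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
            fs->m_u.usr_ip6_spec.l4_4_bytes ||
            fs->m_u.usr_ip6_spec.tclass ||
            fs->m_u.usr_ip6_spec.l4_proto)
                return -EINVAL;

with the real question being whether l4_proto should be rejected here or
parsed.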
> + } else {
> + selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> + selector->length = sizeof(struct iphdr);
> +
> + if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> + fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
> + fs->m_u.usr_ip4_spec.l4_4_bytes ||
> + fs->m_u.usr_ip4_spec.ip_ver ||
> + fs->m_u.usr_ip4_spec.proto)
> + return -EINVAL;
> +
> + parse_ip4(v4_m, v4_k, fs);
> + }
>
> return 0;
> }
> --
> 2.50.1
* Re: [PATCH net-next v12 11/12] virtio_net: Add support for TCP and UDP ethtool rules
2025-11-19 19:15 ` [PATCH net-next v12 11/12] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
@ 2025-11-24 22:02 ` Michael S. Tsirkin
2025-11-24 22:47 ` Dan Jurgens
0 siblings, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 22:02 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:22PM -0600, Daniel Jurgens wrote:
> Implement TCP and UDP V4/V6 ethtool flow types.
>
> Examples:
> $ ethtool -U ens9 flow-type udp4 dst-ip 192.168.5.2 dst-port\
> 4321 action 20
> Added rule with ID 4
>
> This example directs IPv4 UDP traffic with the specified address and
> port to queue 20.
>
> $ ethtool -U ens9 flow-type tcp6 src-ip 2001:db8::1 src-port 1234 dst-ip\
> 2001:db8::2 dst-port 4321 action 12
> Added rule with ID 5
>
> This example directs IPv6 TCP traffic with the specified address and
> port to queue 12.
>
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4: (*num_hdrs)++ to ++(*num_hdrs)
>
> v12:
> - Refactor calculate_flow_sizes. MST
> - Refactor build_and_insert to remove goto validate. MST
> - Move parse_ip4/6 l3_mask check here. MST
> ---
> ---
> drivers/net/virtio_net.c | 223 +++++++++++++++++++++++++++++++++++++--
> 1 file changed, 212 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index bb8ec4265da5..e6c7e8cd4ab4 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -5950,6 +5950,52 @@ static bool validate_ip6_mask(const struct virtnet_ff *ff,
> return true;
> }
>
> +static bool validate_tcp_mask(const struct virtnet_ff *ff,
> + const struct virtio_net_ff_selector *sel,
> + const struct virtio_net_ff_selector *sel_cap)
> +{
> + bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> + struct tcphdr *cap, *mask;
> +
> + cap = (struct tcphdr *)&sel_cap->mask;
> + mask = (struct tcphdr *)&sel->mask;
> +
> + if (mask->source &&
> + !check_mask_vs_cap(&mask->source, &cap->source,
> + sizeof(cap->source), partial_mask))
> + return false;
> +
> + if (mask->dest &&
> + !check_mask_vs_cap(&mask->dest, &cap->dest,
> + sizeof(cap->dest), partial_mask))
> + return false;
> +
> + return true;
> +}
> +
> +static bool validate_udp_mask(const struct virtnet_ff *ff,
> + const struct virtio_net_ff_selector *sel,
> + const struct virtio_net_ff_selector *sel_cap)
> +{
> + bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
> + struct udphdr *cap, *mask;
> +
> + cap = (struct udphdr *)&sel_cap->mask;
> + mask = (struct udphdr *)&sel->mask;
> +
> + if (mask->source &&
> + !check_mask_vs_cap(&mask->source, &cap->source,
> + sizeof(cap->source), partial_mask))
> + return false;
> +
> + if (mask->dest &&
> + !check_mask_vs_cap(&mask->dest, &cap->dest,
> + sizeof(cap->dest), partial_mask))
> + return false;
> +
> + return true;
> +}
> +
> static bool validate_mask(const struct virtnet_ff *ff,
> const struct virtio_net_ff_selector *sel)
> {
> @@ -5967,11 +6013,45 @@ static bool validate_mask(const struct virtnet_ff *ff,
>
> case VIRTIO_NET_FF_MASK_TYPE_IPV6:
> return validate_ip6_mask(ff, sel, sel_cap);
> +
> + case VIRTIO_NET_FF_MASK_TYPE_TCP:
> + return validate_tcp_mask(ff, sel, sel_cap);
> +
> + case VIRTIO_NET_FF_MASK_TYPE_UDP:
> + return validate_udp_mask(ff, sel, sel_cap);
> }
>
> return false;
> }
>
> +static void set_tcp(struct tcphdr *mask, struct tcphdr *key,
> + __be16 psrc_m, __be16 psrc_k,
> + __be16 pdst_m, __be16 pdst_k)
> +{
> + if (psrc_m) {
> + mask->source = psrc_m;
> + key->source = psrc_k;
> + }
> + if (pdst_m) {
> + mask->dest = pdst_m;
> + key->dest = pdst_k;
> + }
> +}
> +
> +static void set_udp(struct udphdr *mask, struct udphdr *key,
> + __be16 psrc_m, __be16 psrc_k,
> + __be16 pdst_m, __be16 pdst_k)
> +{
> + if (psrc_m) {
> + mask->source = psrc_m;
> + key->source = psrc_k;
> + }
> + if (pdst_m) {
> + mask->dest = pdst_m;
> + key->dest = pdst_k;
> + }
> +}
> +
> static void parse_ip4(struct iphdr *mask, struct iphdr *key,
> const struct ethtool_rx_flow_spec *fs)
> {
> @@ -5987,6 +6067,11 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
> mask->daddr = l3_mask->ip4dst;
> key->daddr = l3_val->ip4dst;
> }
> +
> + if (l3_mask->proto) {
> + mask->protocol = l3_mask->proto;
> + key->protocol = l3_val->proto;
> + }
> }
>
> static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
> @@ -6004,16 +6089,35 @@ static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
> memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
> memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
> }
> +
> + if (l3_mask->l4_proto) {
> + mask->nexthdr = l3_mask->l4_proto;
> + key->nexthdr = l3_val->l4_proto;
> + }
> }
>
> static bool has_ipv4(u32 flow_type)
> {
> - return flow_type == IP_USER_FLOW;
> + return flow_type == TCP_V4_FLOW ||
> + flow_type == UDP_V4_FLOW ||
> + flow_type == IP_USER_FLOW;
> }
>
> static bool has_ipv6(u32 flow_type)
> {
> - return flow_type == IPV6_USER_FLOW;
> + return flow_type == TCP_V6_FLOW ||
> + flow_type == UDP_V6_FLOW ||
> + flow_type == IPV6_USER_FLOW;
> +}
> +
> +static bool has_tcp(u32 flow_type)
> +{
> + return flow_type == TCP_V4_FLOW || flow_type == TCP_V6_FLOW;
> +}
> +
> +static bool has_udp(u32 flow_type)
> +{
> + return flow_type == UDP_V4_FLOW || flow_type == UDP_V6_FLOW;
> }
>
> static int setup_classifier(struct virtnet_ff *ff,
> @@ -6153,6 +6257,10 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
> case ETHER_FLOW:
> case IP_USER_FLOW:
> case IPV6_USER_FLOW:
> + case TCP_V4_FLOW:
> + case TCP_V6_FLOW:
> + case UDP_V4_FLOW:
> + case UDP_V6_FLOW:
> return true;
> }
>
> @@ -6194,6 +6302,12 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
> size += sizeof(struct iphdr);
> else if (has_ipv6(fs->flow_type))
> size += sizeof(struct ipv6hdr);
> +
> + if (has_tcp(fs->flow_type) || has_udp(fs->flow_type)) {
> + ++(*num_hdrs);
> + size += has_tcp(fs->flow_type) ? sizeof(struct tcphdr) :
> + sizeof(struct udphdr);
> + }
> }
>
> BUG_ON(size > 0xff);
> @@ -6233,7 +6347,8 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
>
> static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
> u8 *key,
> - const struct ethtool_rx_flow_spec *fs)
> + const struct ethtool_rx_flow_spec *fs,
> + int num_hdrs)
> {
> struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
> struct iphdr *v4_m = (struct iphdr *)&selector->mask;
> @@ -6244,23 +6359,95 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
> selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
> selector->length = sizeof(struct ipv6hdr);
>
> - if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
> - fs->m_u.usr_ip6_spec.l4_4_bytes)
> + if (num_hdrs == 2 && (fs->h_u.usr_ip6_spec.l4_4_bytes ||
> + fs->m_u.usr_ip6_spec.l4_4_bytes))
> return -EINVAL;
>
> parse_ip6(v6_m, v6_k, fs);
> +
> + if (num_hdrs > 2) {
> + v6_m->nexthdr = 0xff;
> + if (has_tcp(fs->flow_type))
> + v6_k->nexthdr = IPPROTO_TCP;
> + else
> + v6_k->nexthdr = IPPROTO_UDP;
> + }
> } else {
> selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> selector->length = sizeof(struct iphdr);
>
> - if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> - fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
> - fs->m_u.usr_ip4_spec.l4_4_bytes ||
> - fs->m_u.usr_ip4_spec.ip_ver ||
> - fs->m_u.usr_ip4_spec.proto)
> + if (num_hdrs == 2 &&
> + (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> + fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
> + fs->m_u.usr_ip4_spec.l4_4_bytes ||
> + fs->m_u.usr_ip4_spec.ip_ver ||
> + fs->m_u.usr_ip4_spec.proto))
> return -EINVAL;
>
> parse_ip4(v4_m, v4_k, fs);
> +
> + if (num_hdrs > 2) {
> + v4_m->protocol = 0xff;
> + if (has_tcp(fs->flow_type))
> + v4_k->protocol = IPPROTO_TCP;
> + else
> + v4_k->protocol = IPPROTO_UDP;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static int setup_transport_key_mask(struct virtio_net_ff_selector *selector,
> + u8 *key,
> + struct ethtool_rx_flow_spec *fs)
> +{
> + struct tcphdr *tcp_m = (struct tcphdr *)&selector->mask;
> + struct udphdr *udp_m = (struct udphdr *)&selector->mask;
> + const struct ethtool_tcpip6_spec *v6_l4_mask;
> + const struct ethtool_tcpip4_spec *v4_l4_mask;
> + const struct ethtool_tcpip6_spec *v6_l4_key;
> + const struct ethtool_tcpip4_spec *v4_l4_key;
> + struct tcphdr *tcp_k = (struct tcphdr *)key;
> + struct udphdr *udp_k = (struct udphdr *)key;
> +
> + if (has_tcp(fs->flow_type)) {
> + selector->type = VIRTIO_NET_FF_MASK_TYPE_TCP;
> + selector->length = sizeof(struct tcphdr);
> +
> + if (has_ipv6(fs->flow_type)) {
> + v6_l4_mask = &fs->m_u.tcp_ip6_spec;
> + v6_l4_key = &fs->h_u.tcp_ip6_spec;
> +
> + set_tcp(tcp_m, tcp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
> + v6_l4_mask->pdst, v6_l4_key->pdst);
> + } else {
> + v4_l4_mask = &fs->m_u.tcp_ip4_spec;
> + v4_l4_key = &fs->h_u.tcp_ip4_spec;
> +
> + set_tcp(tcp_m, tcp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
> + v4_l4_mask->pdst, v4_l4_key->pdst);
> + }
> +
> + } else if (has_udp(fs->flow_type)) {
> + selector->type = VIRTIO_NET_FF_MASK_TYPE_UDP;
> + selector->length = sizeof(struct udphdr);
> +
> + if (has_ipv6(fs->flow_type)) {
> + v6_l4_mask = &fs->m_u.udp_ip6_spec;
> + v6_l4_key = &fs->h_u.udp_ip6_spec;
> +
> + set_udp(udp_m, udp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
> + v6_l4_mask->pdst, v6_l4_key->pdst);
> + } else {
> + v4_l4_mask = &fs->m_u.udp_ip4_spec;
> + v4_l4_key = &fs->h_u.udp_ip4_spec;
> +
> + set_udp(udp_m, udp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
> + v4_l4_mask->pdst, v4_l4_key->pdst);
> + }
> + } else {
> + return -EOPNOTSUPP;
> }
>
> return 0;
> @@ -6300,6 +6487,7 @@ static int build_and_insert(struct virtnet_ff *ff,
> struct virtio_net_ff_selector *selector;
> struct virtnet_classifier *c;
> size_t classifier_size;
> + size_t key_offset;
> int num_hdrs;
> u8 key_size;
> u8 *key;
> @@ -6332,11 +6520,24 @@ static int build_and_insert(struct virtnet_ff *ff,
> setup_eth_hdr_key_mask(selector, key, fs, num_hdrs);
>
> if (num_hdrs != 1) {
> + key_offset = selector->length;
> selector = next_selector(selector);
>
> - err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
> + err = setup_ip_key_mask(selector, key + key_offset,
> + fs, num_hdrs);
> if (err)
> goto err_classifier;
> +
> + if (num_hdrs >= 2) {
So elsewhere it is num_hdrs > 2, here it's >= 2 ...
all this is confusing.
Can you please add some constants so the reader can understand why
each condition is checked.
For example, is this not invoked on ip-only filters? num_hdrs will be 2,
right?
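
As a sketch of the kind of constants meant here (names invented, not taken
from the series):

        /* how many headers the classifier describes */
        #define VIRTNET_FF_HDRS_ETH             1       /* ethernet only */
        #define VIRTNET_FF_HDRS_ETH_IP          2       /* + IPv4/IPv6 */
        #define VIRTNET_FF_HDRS_ETH_IP_L4       3       /* + TCP/UDP */

so the conditions read e.g. "num_hdrs >= VIRTNET_FF_HDRS_ETH_IP" instead
of bare numbers.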
> + key_offset += selector->length;
> + selector = next_selector(selector);
> +
> + err = setup_transport_key_mask(selector,
> + key + key_offset,
> + fs);
> + if (err)
> + goto err_classifier;
> + }
> }
>
> err = validate_classifier_selectors(ff, classifier, num_hdrs);
> --
> 2.50.1
* Re: [PATCH net-next v12 08/12] virtio_net: Use existing classifier if possible
2025-11-19 19:15 ` [PATCH net-next v12 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
@ 2025-11-24 22:04 ` Michael S. Tsirkin
2025-11-24 22:31 ` Dan Jurgens
0 siblings, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 22:04 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:19PM -0600, Daniel Jurgens wrote:
> Classifiers can be used by more than one rule. If there is an existing
> classifier, use it instead of creating a new one. If duplicate
> classifiers are created it would artificially limit the number of rules
> to the classifier limit, which is likely less than the rules limit.
>
> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> Reviewed-by: Parav Pandit <parav@nvidia.com>
> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> ---
> v4:
> - Fixed typo in commit message
> - for (int -> for (
>
> v8:
> - Removed unused num_classifiers. Jason Wang
>
> v12:
> - Clarified comment about destroy_classifier freeing. MST
> - Renamed the classifier field of virtnet_classifier to obj. MST
> - Explained why in commit message. MST
> ---
> ---
> drivers/net/virtio_net.c | 51 ++++++++++++++++++++++++++--------------
> 1 file changed, 34 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 7600e2383a72..5e49cd78904f 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -32,6 +32,7 @@
> #include <uapi/linux/virtio_pci.h>
> #include <uapi/linux/virtio_net_ff.h>
> #include <linux/xarray.h>
> +#include <linux/refcount.h>
>
> static int napi_weight = NAPI_POLL_WEIGHT;
> module_param(napi_weight, int, 0444);
> @@ -302,7 +303,6 @@ struct virtnet_ff {
> struct virtio_net_ff_cap_mask_data *ff_mask;
> struct virtio_net_ff_actions *ff_actions;
> struct xarray classifiers;
> - int num_classifiers;
> struct virtnet_ethtool_ff ethtool;
> };
>
> @@ -5816,12 +5816,13 @@ struct virtnet_ethtool_rule {
> /* The classifier struct must be the last field in this struct */
> struct virtnet_classifier {
> size_t size;
> + refcount_t refcount;
> u32 id;
> - struct virtio_net_resource_obj_ff_classifier classifier;
> + struct virtio_net_resource_obj_ff_classifier obj;
> };
>
> static_assert(sizeof(struct virtnet_classifier) ==
> - ALIGN(offsetofend(struct virtnet_classifier, classifier),
> + ALIGN(offsetofend(struct virtnet_classifier, obj),
> __alignof__(struct virtnet_classifier)),
> "virtnet_classifier: classifier must be the last member");
>
> @@ -5909,11 +5910,24 @@ static bool validate_mask(const struct virtnet_ff *ff,
> return false;
> }
>
> -static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> +static int setup_classifier(struct virtnet_ff *ff,
> + struct virtnet_classifier **c)
> {
> + struct virtnet_classifier *tmp;
> + unsigned long i;
> int err;
>
> - err = xa_alloc(&ff->classifiers, &c->id, c,
> + xa_for_each(&ff->classifiers, i, tmp) {
> + if ((*c)->size == tmp->size &&
> + !memcmp(&tmp->obj, &(*c)->obj, tmp->size)) {
> + refcount_inc(&tmp->refcount);
> + kfree(*c);
> + *c = tmp;
> + goto out;
> + }
> + }
> +
> + err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
> XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
> GFP_KERNEL);
> if (err)
> @@ -5921,29 +5935,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
>
> err = virtio_admin_obj_create(ff->vdev,
> VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> - c->id,
> + (*c)->id,
> VIRTIO_ADMIN_GROUP_TYPE_SELF,
> 0,
> - &c->classifier,
> - c->size);
> + &(*c)->obj,
> + (*c)->size);
> if (err)
> goto err_xarray;
>
> + refcount_set(&(*c)->refcount, 1);
> +out:
> return 0;
>
> err_xarray:
> - xa_erase(&ff->classifiers, c->id);
> + xa_erase(&ff->classifiers, (*c)->id);
>
> return err;
> }
>
> -static void destroy_classifier(struct virtnet_ff *ff,
> - u32 classifier_id)
> +static void try_destroy_classifier(struct virtnet_ff *ff, u32 classifier_id)
> {
> struct virtnet_classifier *c;
>
> c = xa_load(&ff->classifiers, classifier_id);
> - if (c) {
> + if (c && refcount_dec_and_test(&c->refcount)) {
> virtio_admin_obj_destroy(ff->vdev,
> VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> c->id,
> @@ -5967,7 +5982,7 @@ static void destroy_ethtool_rule(struct virtnet_ff *ff,
> 0);
>
> xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> - destroy_classifier(ff, eth_rule->classifier_id);
> + try_destroy_classifier(ff, eth_rule->classifier_id);
> kfree(eth_rule);
> }
>
> @@ -6139,7 +6154,7 @@ static int build_and_insert(struct virtnet_ff *ff,
> }
>
> c->size = classifier_size;
> - classifier = &c->classifier;
> + classifier = &c->obj;
> classifier->count = num_hdrs;
> selector = (void *)&classifier->selectors[0];
>
> @@ -6149,14 +6164,16 @@ static int build_and_insert(struct virtnet_ff *ff,
> if (err)
> goto err_key;
>
> - err = setup_classifier(ff, c);
> + err = setup_classifier(ff, &c);
> if (err)
> goto err_classifier;
>
> err = insert_rule(ff, eth_rule, c->id, key, key_size);
> if (err) {
> - /* destroy_classifier will free the classifier */
> - destroy_classifier(ff, c->id);
> + /* destroy_classifier release the reference on the classifier
try_destroy_classifier? And I think you mean *will* release and free.
And what is "the reference"?
> + * and free it if needed.
> + */
> + try_destroy_classifier(ff, c->id);
> goto err_key;
> }
>
> --
> 2.50.1
* Re: [PATCH net-next v12 03/12] virtio: Expose generic device capability operations
2025-11-24 20:30 ` Michael S. Tsirkin
@ 2025-11-24 22:24 ` Dan Jurgens
2025-11-24 22:27 ` Michael S. Tsirkin
0 siblings, 1 reply; 40+ messages in thread
From: Dan Jurgens @ 2025-11-24 22:24 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 2:30 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:14PM -0600, Daniel Jurgens wrote:
>> Currently querying and setting capabilities is restricted to a single
>> capability and contained within the virtio PCI driver. However, each
>> device type has generic and device specific capabilities, that may be
>> queried and set. In subsequent patches virtio_net will query and set
>> flow filter capabilities.
>>
>> This changes the size of virtio_admin_cmd_query_cap_id_result. It's safe
>> to do because this data is written by DMA, so a newer controller can't
>> overrun the size on an older kernel.
>>
>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>>
>> ---
>> v4: Moved this logic from virtio_pci_modern to new file
>> virtio_admin_commands.
>>
>> v12:
>> - Removed uapi virtio_pci include in virtio_admin.h. MST
>> - Added virtio_pci uapi include to virtio_admin_commands.c
>> - Put () around cap in macro. MST
>> - Removed nonsense comment above VIRTIO_ADMIN_MAX_CAP. MST
>> - +1 VIRTIO_ADMIN_MAX_CAP when calculating array size. MST
>> - Updated commit message
>> ---
>> drivers/virtio/Makefile | 2 +-
>> drivers/virtio/virtio_admin_commands.c | 91 ++++++++++++++++++++++++++
>> include/linux/virtio_admin.h | 80 ++++++++++++++++++++++
>> include/uapi/linux/virtio_pci.h | 6 +-
>> 4 files changed, 176 insertions(+), 3 deletions(-)
>> create mode 100644 drivers/virtio/virtio_admin_commands.c
>> create mode 100644 include/linux/virtio_admin.h
>>
>> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
>> index eefcfe90d6b8..2b4a204dde33 100644
>> --- a/drivers/virtio/Makefile
>> +++ b/drivers/virtio/Makefile
>> @@ -1,5 +1,5 @@
>> # SPDX-License-Identifier: GPL-2.0
>> -obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
>> +obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o virtio_admin_commands.o
>> obj-$(CONFIG_VIRTIO_ANCHOR) += virtio_anchor.o
>> obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
>> obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
>> diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
>> new file mode 100644
>> index 000000000000..a2254e71e8dc
>> --- /dev/null
>> +++ b/drivers/virtio/virtio_admin_commands.c
>> @@ -0,0 +1,91 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +
>> +#include <linux/virtio.h>
>> +#include <linux/virtio_config.h>
>> +#include <linux/virtio_admin.h>
>> +#include <uapi/linux/virtio_pci.h>
>> +
>> +int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
>> + struct virtio_admin_cmd_query_cap_id_result *data)
>> +{
>> + struct virtio_admin_cmd cmd = {};
>> + struct scatterlist result_sg;
>> +
>> + if (!vdev->config->admin_cmd_exec)
>> + return -EOPNOTSUPP;
>> +
>> + sg_init_one(&result_sg, data, sizeof(*data));
>> + cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_CAP_ID_LIST_QUERY);
>> + cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
>> + cmd.result_sg = &result_sg;
>> +
>> + return vdev->config->admin_cmd_exec(vdev, &cmd);
>> +}
>> +EXPORT_SYMBOL_GPL(virtio_admin_cap_id_list_query);
>> +
>> +int virtio_admin_cap_get(struct virtio_device *vdev,
>> + u16 id,
>> + void *caps,
>> + size_t cap_size)
>
>
> I still don't get why cap_size needs to be as large as size_t.
>
> > If you don't care what its size is, just say "unsigned",
> > or u8 as a hint to users that it's a small value.
The size is small for net flow filters, but this is supposed to be a
generic interface for future uses as well. Why limit it?
>
>> +{
>> + struct virtio_admin_cmd_cap_get_data *data;
>> + struct virtio_admin_cmd cmd = {};
>> + struct scatterlist result_sg;
>> + struct scatterlist data_sg;
>> + int err;
>> +
>> + if (!vdev->config->admin_cmd_exec)
>> + return -EOPNOTSUPP;
>> +
>> + data = kzalloc(sizeof(*data), GFP_KERNEL);
>
> uses kzalloc without including linux/slab.h
>
>
>
>> + if (!data)
>> + return -ENOMEM;
>> +
* Re: [PATCH net-next v12 03/12] virtio: Expose generic device capability operations
2025-11-24 22:24 ` Dan Jurgens
@ 2025-11-24 22:27 ` Michael S. Tsirkin
0 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 22:27 UTC (permalink / raw)
To: Dan Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Mon, Nov 24, 2025 at 04:24:37PM -0600, Dan Jurgens wrote:
> On 11/24/25 2:30 PM, Michael S. Tsirkin wrote:
> > On Wed, Nov 19, 2025 at 01:15:14PM -0600, Daniel Jurgens wrote:
> >> Currently querying and setting capabilities is restricted to a single
> >> capability and contained within the virtio PCI driver. However, each
> >> device type has generic and device specific capabilities, that may be
> >> queried and set. In subsequent patches virtio_net will query and set
> >> flow filter capabilities.
> >>
> >> This changes the size of virtio_admin_cmd_query_cap_id_result. It's safe
> >> to do because this data is written by DMA, so a newer controller can't
> >> overrun the size on an older kernel.
> >>
> >> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> >> Reviewed-by: Parav Pandit <parav@nvidia.com>
> >> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> >>
> >> ---
> >> v4: Moved this logic from virtio_pci_modern to new file
> >> virtio_admin_commands.
> >>
> >> v12:
> >> - Removed uapi virtio_pci include in virtio_admin.h. MST
> >> - Added virtio_pci uapi include to virtio_admin_commands.c
> >> - Put () around cap in macro. MST
> >> - Removed nonsense comment above VIRTIO_ADMIN_MAX_CAP. MST
> >> - +1 VIRTIO_ADMIN_MAX_CAP when calculating array size. MST
> >> - Updated commit message
> >> ---
> >> drivers/virtio/Makefile | 2 +-
> >> drivers/virtio/virtio_admin_commands.c | 91 ++++++++++++++++++++++++++
> >> include/linux/virtio_admin.h | 80 ++++++++++++++++++++++
> >> include/uapi/linux/virtio_pci.h | 6 +-
> >> 4 files changed, 176 insertions(+), 3 deletions(-)
> >> create mode 100644 drivers/virtio/virtio_admin_commands.c
> >> create mode 100644 include/linux/virtio_admin.h
> >>
> >> diff --git a/drivers/virtio/Makefile b/drivers/virtio/Makefile
> >> index eefcfe90d6b8..2b4a204dde33 100644
> >> --- a/drivers/virtio/Makefile
> >> +++ b/drivers/virtio/Makefile
> >> @@ -1,5 +1,5 @@
> >> # SPDX-License-Identifier: GPL-2.0
> >> -obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o
> >> +obj-$(CONFIG_VIRTIO) += virtio.o virtio_ring.o virtio_admin_commands.o
> >> obj-$(CONFIG_VIRTIO_ANCHOR) += virtio_anchor.o
> >> obj-$(CONFIG_VIRTIO_PCI_LIB) += virtio_pci_modern_dev.o
> >> obj-$(CONFIG_VIRTIO_PCI_LIB_LEGACY) += virtio_pci_legacy_dev.o
> >> diff --git a/drivers/virtio/virtio_admin_commands.c b/drivers/virtio/virtio_admin_commands.c
> >> new file mode 100644
> >> index 000000000000..a2254e71e8dc
> >> --- /dev/null
> >> +++ b/drivers/virtio/virtio_admin_commands.c
> >> @@ -0,0 +1,91 @@
> >> +// SPDX-License-Identifier: GPL-2.0-only
> >> +
> >> +#include <linux/virtio.h>
> >> +#include <linux/virtio_config.h>
> >> +#include <linux/virtio_admin.h>
> >> +#include <uapi/linux/virtio_pci.h>
> >> +
> >> +int virtio_admin_cap_id_list_query(struct virtio_device *vdev,
> >> + struct virtio_admin_cmd_query_cap_id_result *data)
> >> +{
> >> + struct virtio_admin_cmd cmd = {};
> >> + struct scatterlist result_sg;
> >> +
> >> + if (!vdev->config->admin_cmd_exec)
> >> + return -EOPNOTSUPP;
> >> +
> >> + sg_init_one(&result_sg, data, sizeof(*data));
> >> + cmd.opcode = cpu_to_le16(VIRTIO_ADMIN_CMD_CAP_ID_LIST_QUERY);
> >> + cmd.group_type = cpu_to_le16(VIRTIO_ADMIN_GROUP_TYPE_SELF);
> >> + cmd.result_sg = &result_sg;
> >> +
> >> + return vdev->config->admin_cmd_exec(vdev, &cmd);
> >> +}
> >> +EXPORT_SYMBOL_GPL(virtio_admin_cap_id_list_query);
> >> +
> >> +int virtio_admin_cap_get(struct virtio_device *vdev,
> >> + u16 id,
> >> + void *caps,
> >> + size_t cap_size)
> >
> >
> > I still don't get why cap_size needs to be as large as size_t.
> >
> > If you don't care what its size is, just say "unsigned",
> > or u8 as a hint to users that it's a small value.
>
> The size is small for net flow filters, but this is supposed to be a
> generic interface for future uses as well. Why limit it?
Because your implementation makes assumptions - if you want it to be truly
generic then you need to handle weird corner cases, such as integer
overflow, when you add these things.
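
To illustrate the kind of corner case - a generic sketch, not code from
this patch; check_add_overflow() is from linux/overflow.h:

        size_t total;

        if (check_add_overflow(sizeof(*data), cap_size, &total))
                return -EINVAL;

        data = kzalloc(total, GFP_KERNEL);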
>
> >
> >> +{
> >> + struct virtio_admin_cmd_cap_get_data *data;
> >> + struct virtio_admin_cmd cmd = {};
> >> + struct scatterlist result_sg;
> >> + struct scatterlist data_sg;
> >> + int err;
> >> +
> >> + if (!vdev->config->admin_cmd_exec)
> >> + return -EOPNOTSUPP;
> >> +
> >> + data = kzalloc(sizeof(*data), GFP_KERNEL);
> >
> > uses kzalloc without including linux/slab.h
> >
> >
> >
> >> + if (!data)
> >> + return -ENOMEM;
> >> +
* Re: [PATCH net-next v12 08/12] virtio_net: Use existing classifier if possible
2025-11-24 22:04 ` Michael S. Tsirkin
@ 2025-11-24 22:31 ` Dan Jurgens
2025-11-24 22:38 ` Michael S. Tsirkin
0 siblings, 1 reply; 40+ messages in thread
From: Dan Jurgens @ 2025-11-24 22:31 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 4:04 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:19PM -0600, Daniel Jurgens wrote:
>> Classifiers can be used by more than one rule. If there is an existing
>> classifier, use it instead of creating a new one. If duplicate
>> classifiers are created it would artificially limit the number of rules
>> to the classifier limit, which is likely less than the rules limit.
>>
>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>> ---
>> v4:
>> - Fixed typo in commit message
>> - for (int -> for (
>>
>> v8:
>> - Removed unused num_classifiers. Jason Wang
>>
>> v12:
>> - Clarified comment about destroy_classifier freeing. MST
>> - Renamed the classifier field of virtnet_classifier to obj. MST
>> - Explained why in commit message. MST
>> ---
>> ---
>> drivers/net/virtio_net.c | 51 ++++++++++++++++++++++++++--------------
>> 1 file changed, 34 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 7600e2383a72..5e49cd78904f 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -32,6 +32,7 @@
>> #include <uapi/linux/virtio_pci.h>
>> #include <uapi/linux/virtio_net_ff.h>
>> #include <linux/xarray.h>
>> +#include <linux/refcount.h>
>>
>> static int napi_weight = NAPI_POLL_WEIGHT;
>> module_param(napi_weight, int, 0444);
>> @@ -302,7 +303,6 @@ struct virtnet_ff {
>> struct virtio_net_ff_cap_mask_data *ff_mask;
>> struct virtio_net_ff_actions *ff_actions;
>> struct xarray classifiers;
>> - int num_classifiers;
>> struct virtnet_ethtool_ff ethtool;
>> };
>>
>> @@ -5816,12 +5816,13 @@ struct virtnet_ethtool_rule {
>> /* The classifier struct must be the last field in this struct */
>> struct virtnet_classifier {
>> size_t size;
>> + refcount_t refcount;
>> u32 id;
>> - struct virtio_net_resource_obj_ff_classifier classifier;
>> + struct virtio_net_resource_obj_ff_classifier obj;
>> };
>>
>> static_assert(sizeof(struct virtnet_classifier) ==
>> - ALIGN(offsetofend(struct virtnet_classifier, classifier),
>> + ALIGN(offsetofend(struct virtnet_classifier, obj),
>> __alignof__(struct virtnet_classifier)),
>> "virtnet_classifier: classifier must be the last member");
>>
>> @@ -5909,11 +5910,24 @@ static bool validate_mask(const struct virtnet_ff *ff,
>> return false;
>> }
>>
>> -static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
>> +static int setup_classifier(struct virtnet_ff *ff,
>> + struct virtnet_classifier **c)
>> {
>> + struct virtnet_classifier *tmp;
>> + unsigned long i;
>> int err;
>>
>> - err = xa_alloc(&ff->classifiers, &c->id, c,
>> + xa_for_each(&ff->classifiers, i, tmp) {
>> + if ((*c)->size == tmp->size &&
>> + !memcmp(&tmp->obj, &(*c)->obj, tmp->size)) {
>> + refcount_inc(&tmp->refcount);
>> + kfree(*c);
>> + *c = tmp;
>> + goto out;
>> + }
>> + }
>> +
>> + err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
>> XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
>> GFP_KERNEL);
>> if (err)
>> @@ -5921,29 +5935,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
>>
>> err = virtio_admin_obj_create(ff->vdev,
>> VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
>> - c->id,
>> + (*c)->id,
>> VIRTIO_ADMIN_GROUP_TYPE_SELF,
>> 0,
>> - &c->classifier,
>> - c->size);
>> + &(*c)->obj,
>> + (*c)->size);
>> if (err)
>> goto err_xarray;
>>
>> + refcount_set(&(*c)->refcount, 1);
>> +out:
>> return 0;
>>
>> err_xarray:
>> - xa_erase(&ff->classifiers, c->id);
>> + xa_erase(&ff->classifiers, (*c)->id);
>>
>> return err;
>> }
>>
>> -static void destroy_classifier(struct virtnet_ff *ff,
>> - u32 classifier_id)
>> +static void try_destroy_classifier(struct virtnet_ff *ff, u32 classifier_id)
>> {
>> struct virtnet_classifier *c;
>>
>> c = xa_load(&ff->classifiers, classifier_id);
>> - if (c) {
>> + if (c && refcount_dec_and_test(&c->refcount)) {
>> virtio_admin_obj_destroy(ff->vdev,
>> VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
>> c->id,
>> @@ -5967,7 +5982,7 @@ static void destroy_ethtool_rule(struct virtnet_ff *ff,
>> 0);
>>
>> xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
>> - destroy_classifier(ff, eth_rule->classifier_id);
>> + try_destroy_classifier(ff, eth_rule->classifier_id);
>> kfree(eth_rule);
>> }
>>
>> @@ -6139,7 +6154,7 @@ static int build_and_insert(struct virtnet_ff *ff,
>> }
>>
>> c->size = classifier_size;
>> - classifier = &c->classifier;
>> + classifier = &c->obj;
>> classifier->count = num_hdrs;
>> selector = (void *)&classifier->selectors[0];
>>
>> @@ -6149,14 +6164,16 @@ static int build_and_insert(struct virtnet_ff *ff,
>> if (err)
>> goto err_key;
>>
>> - err = setup_classifier(ff, c);
>> + err = setup_classifier(ff, &c);
>> if (err)
>> goto err_classifier;
>>
>> err = insert_rule(ff, eth_rule, c->id, key, key_size);
>> if (err) {
>> - /* destroy_classifier will free the classifier */
>> - destroy_classifier(ff, c->id);
>> + /* destroy_classifier release the reference on the classifier
>
>
> try_destroy_classifier ? and I think you mean *will* release and free.
>
> and what is "the reference"
I see the comment is munged. But classifiers are reference counted;
try_destroy_classifier will release the reference and free the classifier
if the refcount is now 0.
See setup_classifier above.
>
>> + * and free it if needed.
>> + */
>> + try_destroy_classifier(ff, c->id);
>> goto err_key;
>> }
>>
>> --
>> 2.50.1
>
* Re: [PATCH net-next v12 08/12] virtio_net: Use existing classifier if possible
2025-11-24 22:31 ` Dan Jurgens
@ 2025-11-24 22:38 ` Michael S. Tsirkin
0 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 22:38 UTC (permalink / raw)
To: Dan Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Mon, Nov 24, 2025 at 04:31:54PM -0600, Dan Jurgens wrote:
> On 11/24/25 4:04 PM, Michael S. Tsirkin wrote:
> > On Wed, Nov 19, 2025 at 01:15:19PM -0600, Daniel Jurgens wrote:
> >> Classifiers can be used by more than one rule. If there is an existing
> >> classifier, use it instead of creating a new one. If duplicate
> >> classifiers are created, it would artificially limit the number of rules
> >> to the classifier limit, which is likely less than the rules limit.
> >>
> >> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
> >> Reviewed-by: Parav Pandit <parav@nvidia.com>
> >> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
> >> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> >> ---
> >> v4:
> >> - Fixed typo in commit message
> >> - for (int -> for (
> >>
> >> v8:
> >> - Removed unused num_classifiers. Jason Wang
> >>
> >> v12:
> >> - Clarified comment about destroy_classifier freeing. MST
> >> - Renamed the classifier field of virtnet_classifier to obj. MST
> >> - Explained why in commit message. MST
> >> ---
> >> ---
> >> drivers/net/virtio_net.c | 51 ++++++++++++++++++++++++++--------------
> >> 1 file changed, 34 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> >> index 7600e2383a72..5e49cd78904f 100644
> >> --- a/drivers/net/virtio_net.c
> >> +++ b/drivers/net/virtio_net.c
> >> @@ -32,6 +32,7 @@
> >> #include <uapi/linux/virtio_pci.h>
> >> #include <uapi/linux/virtio_net_ff.h>
> >> #include <linux/xarray.h>
> >> +#include <linux/refcount.h>
> >>
> >> static int napi_weight = NAPI_POLL_WEIGHT;
> >> module_param(napi_weight, int, 0444);
> >> @@ -302,7 +303,6 @@ struct virtnet_ff {
> >> struct virtio_net_ff_cap_mask_data *ff_mask;
> >> struct virtio_net_ff_actions *ff_actions;
> >> struct xarray classifiers;
> >> - int num_classifiers;
> >> struct virtnet_ethtool_ff ethtool;
> >> };
> >>
> >> @@ -5816,12 +5816,13 @@ struct virtnet_ethtool_rule {
> >> /* The classifier struct must be the last field in this struct */
> >> struct virtnet_classifier {
> >> size_t size;
> >> + refcount_t refcount;
> >> u32 id;
> >> - struct virtio_net_resource_obj_ff_classifier classifier;
> >> + struct virtio_net_resource_obj_ff_classifier obj;
> >> };
> >>
> >> static_assert(sizeof(struct virtnet_classifier) ==
> >> - ALIGN(offsetofend(struct virtnet_classifier, classifier),
> >> + ALIGN(offsetofend(struct virtnet_classifier, obj),
> >> __alignof__(struct virtnet_classifier)),
> >> "virtnet_classifier: classifier must be the last member");
> >>
> >> @@ -5909,11 +5910,24 @@ static bool validate_mask(const struct virtnet_ff *ff,
> >> return false;
> >> }
> >>
> >> -static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> >> +static int setup_classifier(struct virtnet_ff *ff,
> >> + struct virtnet_classifier **c)
> >> {
> >> + struct virtnet_classifier *tmp;
> >> + unsigned long i;
> >> int err;
> >>
> >> - err = xa_alloc(&ff->classifiers, &c->id, c,
> >> + xa_for_each(&ff->classifiers, i, tmp) {
> >> + if ((*c)->size == tmp->size &&
> >> + !memcmp(&tmp->obj, &(*c)->obj, tmp->size)) {
> >> + refcount_inc(&tmp->refcount);
> >> + kfree(*c);
> >> + *c = tmp;
> >> + goto out;
> >> + }
> >> + }
> >> +
> >> + err = xa_alloc(&ff->classifiers, &(*c)->id, *c,
> >> XA_LIMIT(0, le32_to_cpu(ff->ff_caps->classifiers_limit) - 1),
> >> GFP_KERNEL);
> >> if (err)
> >> @@ -5921,29 +5935,30 @@ static int setup_classifier(struct virtnet_ff *ff, struct virtnet_classifier *c)
> >>
> >> err = virtio_admin_obj_create(ff->vdev,
> >> VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> >> - c->id,
> >> + (*c)->id,
> >> VIRTIO_ADMIN_GROUP_TYPE_SELF,
> >> 0,
> >> - &c->classifier,
> >> - c->size);
> >> + &(*c)->obj,
> >> + (*c)->size);
> >> if (err)
> >> goto err_xarray;
> >>
> >> + refcount_set(&(*c)->refcount, 1);
> >> +out:
> >> return 0;
> >>
> >> err_xarray:
> >> - xa_erase(&ff->classifiers, c->id);
> >> + xa_erase(&ff->classifiers, (*c)->id);
> >>
> >> return err;
> >> }
> >>
> >> -static void destroy_classifier(struct virtnet_ff *ff,
> >> - u32 classifier_id)
> >> +static void try_destroy_classifier(struct virtnet_ff *ff, u32 classifier_id)
> >> {
> >> struct virtnet_classifier *c;
> >>
> >> c = xa_load(&ff->classifiers, classifier_id);
> >> - if (c) {
> >> + if (c && refcount_dec_and_test(&c->refcount)) {
> >> virtio_admin_obj_destroy(ff->vdev,
> >> VIRTIO_NET_RESOURCE_OBJ_FF_CLASSIFIER,
> >> c->id,
> >> @@ -5967,7 +5982,7 @@ static void destroy_ethtool_rule(struct virtnet_ff *ff,
> >> 0);
> >>
> >> xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> >> - destroy_classifier(ff, eth_rule->classifier_id);
> >> + try_destroy_classifier(ff, eth_rule->classifier_id);
> >> kfree(eth_rule);
> >> }
> >>
> >> @@ -6139,7 +6154,7 @@ static int build_and_insert(struct virtnet_ff *ff,
> >> }
> >>
> >> c->size = classifier_size;
> >> - classifier = &c->classifier;
> >> + classifier = &c->obj;
> >> classifier->count = num_hdrs;
> >> selector = (void *)&classifier->selectors[0];
> >>
> >> @@ -6149,14 +6164,16 @@ static int build_and_insert(struct virtnet_ff *ff,
> >> if (err)
> >> goto err_key;
> >>
> >> - err = setup_classifier(ff, c);
> >> + err = setup_classifier(ff, &c);
> >> if (err)
> >> goto err_classifier;
> >>
> >> err = insert_rule(ff, eth_rule, c->id, key, key_size);
> >> if (err) {
> >> - /* destroy_classifier will free the classifier */
> >> - destroy_classifier(ff, c->id);
> >> + /* destroy_classifier release the reference on the classifier
> >
> >
> > try_destroy_classifier ? and I think you mean *will* release and free.
> >
> > and what is "the reference"
>
> I see the comment is munged. But classifiers are reference counted;
> try_destroy_classifier will release the reference and free the classifier
> if the refcount is now 0.
>
> See setup_classifier above.
ah I got it.
you mean refcount - the reference count - not "the reference".
And in the context of refcount_t there's decrement and
increment, not "release". acquire/release in fact refer to memory ordering.
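Just as an illustration of wording I'd find clearer (a sketch only, not a
demand for this exact text):

	/*
	 * try_destroy_classifier() decrements the classifier refcount and,
	 * if it drops to zero, destroys the device object and frees the
	 * classifier.
	 */
	try_destroy_classifier(ff, c->id);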
> >
> >> + * and free it if needed.
> >> + */
> >> + try_destroy_classifier(ff, c->id);
> >> goto err_key;
> >> }
> >>
> >> --
> >> 2.50.1
> >
* Re: [PATCH net-next v12 09/12] virtio_net: Implement IPv4 ethtool flow rules
2025-11-24 21:51 ` Michael S. Tsirkin
@ 2025-11-24 22:41 ` Dan Jurgens
2025-11-26 5:48 ` Dan Jurgens
1 sibling, 0 replies; 40+ messages in thread
From: Dan Jurgens @ 2025-11-24 22:41 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 3:51 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:20PM -0600, Daniel Jurgens wrote:
>> Add support for IP_USER type rules from ethtool.
>>
>> Example:
>> $ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action -1
>> Added rule with ID 1
>>
>> The example rule will drop packets with the source IP specified.
>>
>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>> ---
>> v4:
>> - Fixed bug in protocol check of parse_ip4
>> - (u8 *) to (void *) casting.
>> - Alignment issues.
>>
>> v12
>> - refactor calculate_flow_sizes to remove goto. MST
>> - refactor build_and_insert to remove goto validate. MST
>> - Move parse_ip4 l3_mask check to TCP/UDP patch. MST
>> - Check saddr/daddr mask before copying in parse_ip4. MST
>> - Remove tos check in setup_ip_key_mask.
>
> So if user attempts to set a filter by tos now, what blocks it?
> because parse_ip4 seems to ignore it ...
>
>> - check l4_4_bytes mask is 0 in setup_ip_key_mask. MST
>> - changed return of setup_ip_key_mask to -EINVAL.
>> - BUG_ON if key overflows u8 size in calculate_flow_sizes. MST
>> ---
>> ---
>> drivers/net/virtio_net.c | 119 +++++++++++++++++++++++++++++++++++++--
>> 1 file changed, 113 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index 5e49cd78904f..b0b9972fe624 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -5894,6 +5894,34 @@ static bool validate_eth_mask(const struct virtnet_ff *ff,
>> return true;
>> }
>>
>> +static bool validate_ip4_mask(const struct virtnet_ff *ff,
>> + const struct virtio_net_ff_selector *sel,
>> + const struct virtio_net_ff_selector *sel_cap)
>> +{
>> + bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
>> + struct iphdr *cap, *mask;
>> +
>> + cap = (struct iphdr *)&sel_cap->mask;
>> + mask = (struct iphdr *)&sel->mask;
>> +
>> + if (mask->saddr &&
>> + !check_mask_vs_cap(&mask->saddr, &cap->saddr,
>> + sizeof(__be32), partial_mask))
>> + return false;
>> +
>> + if (mask->daddr &&
>> + !check_mask_vs_cap(&mask->daddr, &cap->daddr,
>> + sizeof(__be32), partial_mask))
>> + return false;
>> +
>> + if (mask->protocol &&
>> + !check_mask_vs_cap(&mask->protocol, &cap->protocol,
>> + sizeof(u8), partial_mask))
>> + return false;
>> +
>> + return true;
>> +}
>> +
>> static bool validate_mask(const struct virtnet_ff *ff,
>> const struct virtio_net_ff_selector *sel)
>> {
>> @@ -5905,11 +5933,36 @@ static bool validate_mask(const struct virtnet_ff *ff,
>> switch (sel->type) {
>> case VIRTIO_NET_FF_MASK_TYPE_ETH:
>> return validate_eth_mask(ff, sel, sel_cap);
>> +
>> + case VIRTIO_NET_FF_MASK_TYPE_IPV4:
>> + return validate_ip4_mask(ff, sel, sel_cap);
>> }
>>
>> return false;
>> }
>>
>> +static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>> + const struct ethtool_rx_flow_spec *fs)
>> +{
>> + const struct ethtool_usrip4_spec *l3_mask = &fs->m_u.usr_ip4_spec;
>> + const struct ethtool_usrip4_spec *l3_val = &fs->h_u.usr_ip4_spec;
>> +
>> + if (mask->saddr) {
>> + mask->saddr = l3_mask->ip4src;
>> + key->saddr = l3_val->ip4src;
>> + }
>
> So if mask->saddr is already set, you overwrite it?
>
> But what sets it? Don't you really mean l3_mask->ip4src maybe?
Yes, you're right. My abbreviated test was only checking filtering by port
number on ipv4. I will fix that as well.
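For reference, the fix I have in mind tests the ethtool mask fields instead
of the not-yet-populated output mask, roughly like this (sketch only, not
the final patch):

	if (l3_mask->ip4src) {
		mask->saddr = l3_mask->ip4src;
		key->saddr = l3_val->ip4src;
	}

	if (l3_mask->ip4dst) {
		mask->daddr = l3_mask->ip4dst;
		key->daddr = l3_val->ip4dst;
	}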
>
>
>
>> +
>> + if (mask->daddr) {
>> + mask->daddr = l3_mask->ip4dst;
>> + key->daddr = l3_val->ip4dst;
>> + }
>> +}
>
>
> Same question.
>
>
>
* Re: [PATCH net-next v12 11/12] virtio_net: Add support for TCP and UDP ethtool rules
2025-11-24 22:02 ` Michael S. Tsirkin
@ 2025-11-24 22:47 ` Dan Jurgens
0 siblings, 0 replies; 40+ messages in thread
From: Dan Jurgens @ 2025-11-24 22:47 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 4:02 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:22PM -0600, Daniel Jurgens wrote:
>> Implement TCP and UDP V4/V6 ethtool flow types.
>>
>> Examples:
>> $ ethtool -U ens9 flow-type udp4 dst-ip 192.168.5.2 dst-port\
>> 4321 action 20
>> Added rule with ID 4
>>
>> This example directs IPv4 UDP traffic with the specified address and
>> port to queue 20.
>>
>> $ ethtool -U ens9 flow-type tcp6 src-ip 2001:db8::1 src-port 1234 dst-ip\
>> 2001:db8::2 dst-port 4321 action 12
>> Added rule with ID 5
>>
>> This example directs IPv6 TCP traffic with the specified address and
>> port to queue 12.
>>
>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>> ---
>> v4: (*num_hdrs)++ to ++(*num_hdrs)
>>
>> v12:
>> - Refactor calculate_flow_sizes. MST
>> - Refactor build_and_insert to remove goto validate. MST
>> - Move parse_ip4/6 l3_mask check here. MST
>> ---
>> ---
>> drivers/net/virtio_net.c | 223 +++++++++++++++++++++++++++++++++++++--
>> 1 file changed, 212 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index bb8ec4265da5..e6c7e8cd4ab4 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -5950,6 +5950,52 @@ static bool validate_ip6_mask(const struct virtnet_ff *ff,
>> return true;
>> }
>>
>> +static bool validate_tcp_mask(const struct virtnet_ff *ff,
>> + const struct virtio_net_ff_selector *sel,
>> + const struct virtio_net_ff_selector *sel_cap)
>> +{
>> + bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
>> + struct tcphdr *cap, *mask;
>> +
>> + cap = (struct tcphdr *)&sel_cap->mask;
>> + mask = (struct tcphdr *)&sel->mask;
>> +
>> + if (mask->source &&
>> + !check_mask_vs_cap(&mask->source, &cap->source,
>> + sizeof(cap->source), partial_mask))
>> + return false;
>> +
>> + if (mask->dest &&
>> + !check_mask_vs_cap(&mask->dest, &cap->dest,
>> + sizeof(cap->dest), partial_mask))
>> + return false;
>> +
>> + return true;
>> +}
>> +
>> +static bool validate_udp_mask(const struct virtnet_ff *ff,
>> + const struct virtio_net_ff_selector *sel,
>> + const struct virtio_net_ff_selector *sel_cap)
>> +{
>> + bool partial_mask = !!(sel_cap->flags & VIRTIO_NET_FF_MASK_F_PARTIAL_MASK);
>> + struct udphdr *cap, *mask;
>> +
>> + cap = (struct udphdr *)&sel_cap->mask;
>> + mask = (struct udphdr *)&sel->mask;
>> +
>> + if (mask->source &&
>> + !check_mask_vs_cap(&mask->source, &cap->source,
>> + sizeof(cap->source), partial_mask))
>> + return false;
>> +
>> + if (mask->dest &&
>> + !check_mask_vs_cap(&mask->dest, &cap->dest,
>> + sizeof(cap->dest), partial_mask))
>> + return false;
>> +
>> + return true;
>> +}
>> +
>> static bool validate_mask(const struct virtnet_ff *ff,
>> const struct virtio_net_ff_selector *sel)
>> {
>> @@ -5967,11 +6013,45 @@ static bool validate_mask(const struct virtnet_ff *ff,
>>
>> case VIRTIO_NET_FF_MASK_TYPE_IPV6:
>> return validate_ip6_mask(ff, sel, sel_cap);
>> +
>> + case VIRTIO_NET_FF_MASK_TYPE_TCP:
>> + return validate_tcp_mask(ff, sel, sel_cap);
>> +
>> + case VIRTIO_NET_FF_MASK_TYPE_UDP:
>> + return validate_udp_mask(ff, sel, sel_cap);
>> }
>>
>> return false;
>> }
>>
>> +static void set_tcp(struct tcphdr *mask, struct tcphdr *key,
>> + __be16 psrc_m, __be16 psrc_k,
>> + __be16 pdst_m, __be16 pdst_k)
>> +{
>> + if (psrc_m) {
>> + mask->source = psrc_m;
>> + key->source = psrc_k;
>> + }
>> + if (pdst_m) {
>> + mask->dest = pdst_m;
>> + key->dest = pdst_k;
>> + }
>> +}
>> +
>> +static void set_udp(struct udphdr *mask, struct udphdr *key,
>> + __be16 psrc_m, __be16 psrc_k,
>> + __be16 pdst_m, __be16 pdst_k)
>> +{
>> + if (psrc_m) {
>> + mask->source = psrc_m;
>> + key->source = psrc_k;
>> + }
>> + if (pdst_m) {
>> + mask->dest = pdst_m;
>> + key->dest = pdst_k;
>> + }
>> +}
>> +
>> static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>> const struct ethtool_rx_flow_spec *fs)
>> {
>> @@ -5987,6 +6067,11 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>> mask->daddr = l3_mask->ip4dst;
>> key->daddr = l3_val->ip4dst;
>> }
>> +
>> + if (l3_mask->proto) {
>> + mask->protocol = l3_mask->proto;
>> + key->protocol = l3_val->proto;
>> + }
>> }
>>
>> static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
>> @@ -6004,16 +6089,35 @@ static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
>> memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
>> memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
>> }
>> +
>> + if (l3_mask->l4_proto) {
>> + mask->nexthdr = l3_mask->l4_proto;
>> + key->nexthdr = l3_val->l4_proto;
>> + }
>> }
>>
>> static bool has_ipv4(u32 flow_type)
>> {
>> - return flow_type == IP_USER_FLOW;
>> + return flow_type == TCP_V4_FLOW ||
>> + flow_type == UDP_V4_FLOW ||
>> + flow_type == IP_USER_FLOW;
>> }
>>
>> static bool has_ipv6(u32 flow_type)
>> {
>> - return flow_type == IPV6_USER_FLOW;
>> + return flow_type == TCP_V6_FLOW ||
>> + flow_type == UDP_V6_FLOW ||
>> + flow_type == IPV6_USER_FLOW;
>> +}
>> +
>> +static bool has_tcp(u32 flow_type)
>> +{
>> + return flow_type == TCP_V4_FLOW || flow_type == TCP_V6_FLOW;
>> +}
>> +
>> +static bool has_udp(u32 flow_type)
>> +{
>> + return flow_type == UDP_V4_FLOW || flow_type == UDP_V6_FLOW;
>> }
>>
>> static int setup_classifier(struct virtnet_ff *ff,
>> @@ -6153,6 +6257,10 @@ static bool supported_flow_type(const struct ethtool_rx_flow_spec *fs)
>> case ETHER_FLOW:
>> case IP_USER_FLOW:
>> case IPV6_USER_FLOW:
>> + case TCP_V4_FLOW:
>> + case TCP_V6_FLOW:
>> + case UDP_V4_FLOW:
>> + case UDP_V6_FLOW:
>> return true;
>> }
>>
>> @@ -6194,6 +6302,12 @@ static void calculate_flow_sizes(struct ethtool_rx_flow_spec *fs,
>> size += sizeof(struct iphdr);
>> else if (has_ipv6(fs->flow_type))
>> size += sizeof(struct ipv6hdr);
>> +
>> + if (has_tcp(fs->flow_type) || has_udp(fs->flow_type)) {
>> + ++(*num_hdrs);
>> + size += has_tcp(fs->flow_type) ? sizeof(struct tcphdr) :
>> + sizeof(struct udphdr);
>> + }
>> }
>>
>> BUG_ON(size > 0xff);
>> @@ -6233,7 +6347,8 @@ static void setup_eth_hdr_key_mask(struct virtio_net_ff_selector *selector,
>>
>> static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
>> u8 *key,
>> - const struct ethtool_rx_flow_spec *fs)
>> + const struct ethtool_rx_flow_spec *fs,
>> + int num_hdrs)
>> {
>> struct ipv6hdr *v6_m = (struct ipv6hdr *)&selector->mask;
>> struct iphdr *v4_m = (struct iphdr *)&selector->mask;
>> @@ -6244,23 +6359,95 @@ static int setup_ip_key_mask(struct virtio_net_ff_selector *selector,
>> selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV6;
>> selector->length = sizeof(struct ipv6hdr);
>>
>> - if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
>> - fs->m_u.usr_ip6_spec.l4_4_bytes)
>> + if (num_hdrs == 2 && (fs->h_u.usr_ip6_spec.l4_4_bytes ||
>> + fs->m_u.usr_ip6_spec.l4_4_bytes))
>> return -EINVAL;
>>
>> parse_ip6(v6_m, v6_k, fs);
>> +
>> + if (num_hdrs > 2) {
>> + v6_m->nexthdr = 0xff;
>> + if (has_tcp(fs->flow_type))
>> + v6_k->nexthdr = IPPROTO_TCP;
>> + else
>> + v6_k->nexthdr = IPPROTO_UDP;
>> + }
>> } else {
>> selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
>> selector->length = sizeof(struct iphdr);
>>
>> - if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
>> - fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
>> - fs->m_u.usr_ip4_spec.l4_4_bytes ||
>> - fs->m_u.usr_ip4_spec.ip_ver ||
>> - fs->m_u.usr_ip4_spec.proto)
>> + if (num_hdrs == 2 &&
>> + (fs->h_u.usr_ip4_spec.l4_4_bytes ||
>> + fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
>> + fs->m_u.usr_ip4_spec.l4_4_bytes ||
>> + fs->m_u.usr_ip4_spec.ip_ver ||
>> + fs->m_u.usr_ip4_spec.proto))
>> return -EINVAL;
>>
>> parse_ip4(v4_m, v4_k, fs);
>> +
>> + if (num_hdrs > 2) {
>> + v4_m->protocol = 0xff;
>> + if (has_tcp(fs->flow_type))
>> + v4_k->protocol = IPPROTO_TCP;
>> + else
>> + v4_k->protocol = IPPROTO_UDP;
>> + }
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int setup_transport_key_mask(struct virtio_net_ff_selector *selector,
>> + u8 *key,
>> + struct ethtool_rx_flow_spec *fs)
>> +{
>> + struct tcphdr *tcp_m = (struct tcphdr *)&selector->mask;
>> + struct udphdr *udp_m = (struct udphdr *)&selector->mask;
>> + const struct ethtool_tcpip6_spec *v6_l4_mask;
>> + const struct ethtool_tcpip4_spec *v4_l4_mask;
>> + const struct ethtool_tcpip6_spec *v6_l4_key;
>> + const struct ethtool_tcpip4_spec *v4_l4_key;
>> + struct tcphdr *tcp_k = (struct tcphdr *)key;
>> + struct udphdr *udp_k = (struct udphdr *)key;
>> +
>> + if (has_tcp(fs->flow_type)) {
>> + selector->type = VIRTIO_NET_FF_MASK_TYPE_TCP;
>> + selector->length = sizeof(struct tcphdr);
>> +
>> + if (has_ipv6(fs->flow_type)) {
>> + v6_l4_mask = &fs->m_u.tcp_ip6_spec;
>> + v6_l4_key = &fs->h_u.tcp_ip6_spec;
>> +
>> + set_tcp(tcp_m, tcp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
>> + v6_l4_mask->pdst, v6_l4_key->pdst);
>> + } else {
>> + v4_l4_mask = &fs->m_u.tcp_ip4_spec;
>> + v4_l4_key = &fs->h_u.tcp_ip4_spec;
>> +
>> + set_tcp(tcp_m, tcp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
>> + v4_l4_mask->pdst, v4_l4_key->pdst);
>> + }
>> +
>> + } else if (has_udp(fs->flow_type)) {
>> + selector->type = VIRTIO_NET_FF_MASK_TYPE_UDP;
>> + selector->length = sizeof(struct udphdr);
>> +
>> + if (has_ipv6(fs->flow_type)) {
>> + v6_l4_mask = &fs->m_u.udp_ip6_spec;
>> + v6_l4_key = &fs->h_u.udp_ip6_spec;
>> +
>> + set_udp(udp_m, udp_k, v6_l4_mask->psrc, v6_l4_key->psrc,
>> + v6_l4_mask->pdst, v6_l4_key->pdst);
>> + } else {
>> + v4_l4_mask = &fs->m_u.udp_ip4_spec;
>> + v4_l4_key = &fs->h_u.udp_ip4_spec;
>> +
>> + set_udp(udp_m, udp_k, v4_l4_mask->psrc, v4_l4_key->psrc,
>> + v4_l4_mask->pdst, v4_l4_key->pdst);
>> + }
>> + } else {
>> + return -EOPNOTSUPP;
>> }
>>
>> return 0;
>> @@ -6300,6 +6487,7 @@ static int build_and_insert(struct virtnet_ff *ff,
>> struct virtio_net_ff_selector *selector;
>> struct virtnet_classifier *c;
>> size_t classifier_size;
>> + size_t key_offset;
>> int num_hdrs;
>> u8 key_size;
>> u8 *key;
>> @@ -6332,11 +6520,24 @@ static int build_and_insert(struct virtnet_ff *ff,
>> setup_eth_hdr_key_mask(selector, key, fs, num_hdrs);
>>
>> if (num_hdrs != 1) {
>> + key_offset = selector->length;
>> selector = next_selector(selector);
>>
>> - err = setup_ip_key_mask(selector, key + sizeof(struct ethhdr), fs);
>> + err = setup_ip_key_mask(selector, key + key_offset,
>> + fs, num_hdrs);
>> if (err)
>> goto err_classifier;
>> +
>> + if (num_hdrs >= 2) {
>
>
> So elsewhere it is num_hdrs > 2 here it's >= 2 ...
>
> all this is confusing.
>
>
>
> Can you please add some constants so the reader can understand why
> each condition is checked.
>
>
>
> For example, is this not invoked on ip only filters? num_hdrs will be 2,
> right?
>
It is invoked, incorrectly. But ethtool is well behaved. I'll just
check flow_type vs num_hdrs.
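Roughly like this (sketch, assuming the transport setup is simply gated on
the flow type rather than on num_hdrs):

	if (has_tcp(fs->flow_type) || has_udp(fs->flow_type)) {
		key_offset += selector->length;
		selector = next_selector(selector);

		err = setup_transport_key_mask(selector, key + key_offset, fs);
		if (err)
			goto err_classifier;
	}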
>> + key_offset += selector->length;
>> + selector = next_selector(selector);
>> +
>> + err = setup_transport_key_mask(selector,
>> + key + key_offset,
>> + fs);
>> + if (err)
>> + goto err_classifier;
>> + }
>> }
>>
>> err = validate_classifier_selectors(ff, classifier, num_hdrs);
>> --
>> 2.50.1
>
* Re: [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps
2025-11-19 19:15 ` [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
2025-11-20 1:51 ` Jakub Kicinski
2025-11-24 21:01 ` Michael S. Tsirkin
@ 2025-11-24 22:54 ` Michael S. Tsirkin
2025-11-26 6:11 ` Dan Jurgens
2 siblings, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 22:54 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:16PM -0600, Daniel Jurgens wrote:
> index 4738ffe3b5c6..e84a305d2b2a 100644
> --- a/drivers/virtio/virtio_admin_commands.c
> +++ b/drivers/virtio/virtio_admin_commands.c
> @@ -161,6 +161,8 @@ int virtio_admin_obj_destroy(struct virtio_device *vdev,
> err = vdev->config->admin_cmd_exec(vdev, &cmd);
> kfree(data);
>
> + WARN_ON_ONCE(err);
> +
> return err;
> }
The reason I suggested WARN_ON_ONCE is because callers generally can not
handle errors. if you return int you assume callers will do that so then
warning does not make sense.
Bottom line - make this return void.
* Re: [PATCH net-next v12 10/12] virtio_net: Add support for IPv6 ethtool steering
2025-11-24 21:59 ` Michael S. Tsirkin
@ 2025-11-24 23:04 ` Dan Jurgens
2025-11-24 23:12 ` Michael S. Tsirkin
0 siblings, 1 reply; 40+ messages in thread
From: Dan Jurgens @ 2025-11-24 23:04 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 3:59 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:21PM -0600, Daniel Jurgens wrote:
>> Implement support for IPV6_USER_FLOW type rules.
>>
>> return false;
>> @@ -5958,11 +5989,33 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>> }
>> }
>>
>> +static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
>> + const struct ethtool_rx_flow_spec *fs)
>> +{
>
> I note logic wise it is different from ipv4, it is looking at the fs.
I'm not following you here. They both get the l3_mask and l3_val from
the flow spec.
>
>> + const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
>> + const struct ethtool_usrip6_spec *l3_val = &fs->h_u.usr_ip6_spec;
>> +
>> + if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
>> + memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
>> + memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
>> + }
>> +
>> + if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
>> + memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
>> + memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
>> + }
>
> Is this enough?
> For example, what if user tries to set up a filter by l4_proto ?
>
That's in the next patch.
>
>> +}
>> +
>> static bool has_ipv4(u32 flow_type)
>> {
>> return flow_type == IP_USER_FLOW;
>> }
>>
>> +static bool has_ipv6(u32 flow_type)
>> +{
>> + return flow_type == IPV6_USER_FLOW;
>> +}
>> +
dr);
>>
>> - if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
>> - fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
>> - fs->m_u.usr_ip4_spec.l4_4_bytes ||
>> - fs->m_u.usr_ip4_spec.ip_ver ||
>> - fs->m_u.usr_ip4_spec.proto)
>> - return -EINVAL;
>> + if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
>> + fs->m_u.usr_ip6_spec.l4_4_bytes)
>> + return -EINVAL;
>>
>> - parse_ip4(v4_m, v4_k, fs);
>> + parse_ip6(v6_m, v6_k, fs);
>
>
> why does ipv6 not check unsupported fields unlike ipv4?
The UAPI for user_ip6 doesn't make the same assertions:
/**
* struct ethtool_usrip6_spec - general flow specification for IPv6
* @ip6src: Source host
* @ip6dst: Destination host
* @l4_4_bytes: First 4 bytes of transport (layer 4) header
* @tclass: Traffic Class
* @l4_proto: Transport protocol number (nexthdr after any Extension
Headers) ]
*/
/**
* struct ethtool_usrip4_spec - general flow specification for IPv4
* @ip4src: Source host
* @ip4dst: Destination host
* @l4_4_bytes: First 4 bytes of transport (layer 4) header
* @tos: Type-of-service
* @ip_ver: Value must be %ETH_RX_NFC_IP4; mask must be 0
* @proto: Transport protocol number; mask must be 0
*/
A check of l4_proto is probably reasonable though, since this is adding a
filter by IP only, so l4_proto should be unset.
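Something like this in the IPv6 branch of setup_ip_key_mask, for example
(sketch only):

	if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
	    fs->m_u.usr_ip6_spec.l4_4_bytes ||
	    fs->m_u.usr_ip6_spec.l4_proto)
		return -EINVAL;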
>
>> + } else {
>> + selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
>> + selector->length = sizeof(struct iphdr);
>> +
>> + if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
>> + fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
>> + fs->m_u.usr_ip4_spec.l4_4_bytes ||
>> + fs->m_u.usr_ip4_spec.ip_ver ||
>> + fs->m_u.usr_ip4_spec.proto)
>> + return -EINVAL;
>> +
>> + parse_ip4(v4_m, v4_k, fs);
>> + }
>>
>> return 0;
>> }
>> --
>> 2.50.1
>
* Re: [PATCH net-next v12 10/12] virtio_net: Add support for IPv6 ethtool steering
2025-11-24 23:04 ` Dan Jurgens
@ 2025-11-24 23:12 ` Michael S. Tsirkin
2025-11-25 0:10 ` Dan Jurgens
0 siblings, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-24 23:12 UTC (permalink / raw)
To: Dan Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Mon, Nov 24, 2025 at 05:04:30PM -0600, Dan Jurgens wrote:
> On 11/24/25 3:59 PM, Michael S. Tsirkin wrote:
> > On Wed, Nov 19, 2025 at 01:15:21PM -0600, Daniel Jurgens wrote:
> >> Implement support for IPV6_USER_FLOW type rules.
> >>
>
> >> return false;
> >> @@ -5958,11 +5989,33 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
> >> }
> >> }
> >>
> >> +static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
> >> + const struct ethtool_rx_flow_spec *fs)
> >> +{
> >
> > I note logic wise it is different from ipv4, it is looking at the fs.
>
> I'm not following you here. They both get the l3_mask and l3_val from
> the flow spec.
yes but ipv4 is buggy in your patch.
> >
> >> + const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
> >> + const struct ethtool_usrip6_spec *l3_val = &fs->h_u.usr_ip6_spec;
> >> +
> >> + if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
> >> + memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
> >> + memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
> >> + }
> >> +
> >> + if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
> >> + memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
> >> + memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
> >> + }
> >
> > Is this enough?
> > For example, what if user tries to set up a filter by l4_proto ?
> >
>
> That's in the next patch.
yes but if just this one is applied (e.g. by bisect)?
> >
> >> +}
> >> +
> >> static bool has_ipv4(u32 flow_type)
> >> {
> >> return flow_type == IP_USER_FLOW;
> >> }
> >>
> >> +static bool has_ipv6(u32 flow_type)
> >> +{
> >> + return flow_type == IPV6_USER_FLOW;
> >> +}
> >> +
> dr);
> >>
> >> - if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> >> - fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
> >> - fs->m_u.usr_ip4_spec.l4_4_bytes ||
> >> - fs->m_u.usr_ip4_spec.ip_ver ||
> >> - fs->m_u.usr_ip4_spec.proto)
> >> - return -EINVAL;
> >> + if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
> >> + fs->m_u.usr_ip6_spec.l4_4_bytes)
> >> + return -EINVAL;
> >>
> >> - parse_ip4(v4_m, v4_k, fs);
> >> + parse_ip6(v6_m, v6_k, fs);
> >
> >
> > why does ipv6 not check unsupported fields unlike ipv4?
>
> The UAPI for user_ip6 doesn't make the same assertions:
>
> /**
>
> * struct ethtool_usrip6_spec - general flow specification for IPv6
>
> * @ip6src: Source host
>
> * @ip6dst: Destination host
>
> * @l4_4_bytes: First 4 bytes of transport (layer 4) header
>
> * @tclass: Traffic Class
>
> * @l4_proto: Transport protocol number (nexthdr after any Extension
> Headers) ]
> */
>
> /**
> * struct ethtool_usrip4_spec - general flow specification for IPv4
> * @ip4src: Source host
> * @ip4dst: Destination host
> * @l4_4_bytes: First 4 bytes of transport (layer 4) header
> * @tos: Type-of-service
> * @ip_ver: Value must be %ETH_RX_NFC_IP4; mask must be 0
> * @proto: Transport protocol number; mask must be 0
> */
>
> A check of l4_proto is probably reasonable though, since this is adding
> filter by IP only, so l4_proto should be unset.
maybe run this by relevant maintainers.
>
> >
> >> + } else {
> >> + selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
> >> + selector->length = sizeof(struct iphdr);
> >> +
> >> + if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
> >> + fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
> >> + fs->m_u.usr_ip4_spec.l4_4_bytes ||
> >> + fs->m_u.usr_ip4_spec.ip_ver ||
> >> + fs->m_u.usr_ip4_spec.proto)
> >> + return -EINVAL;
> >> +
> >> + parse_ip4(v4_m, v4_k, fs);
> >> + }
> >>
> >> return 0;
> >> }
> >> --
> >> 2.50.1
> >
* Re: [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps
2025-11-24 21:01 ` Michael S. Tsirkin
@ 2025-11-25 0:05 ` Dan Jurgens
0 siblings, 0 replies; 40+ messages in thread
From: Dan Jurgens @ 2025-11-25 0:05 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 3:01 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:16PM -0600, Daniel Jurgens wrote:
>> When probing a virtnet device, attempt to read the flow filter
>> + for (i = 0; i < ff->ff_mask->count; i++) {
>> + if (sel->length > MAX_SEL_LEN) {
>> + WARN_ON_ONCE(true);
>> + err = -EINVAL;
>> + goto err_ff_action;
>> + }
>> + real_ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
>> + if (real_ff_mask_size > ff_mask_size) {
>> + WARN_ON_ONCE(true);
>> + err = -EINVAL;
>> + goto err_ff_action;
>> + }
>> + sel = (void *)sel + sizeof(*sel) + sel->length;
>> + }
>
>
> I am trying to figure out whether this is safe with
> a buggy/malicious device which passes count > VIRTIO_NET_FF_MASK_TYPE_MAX
It should be safe. The count is u8, so it's bounded at a low number of
iterations. We shouldn't overrun the allocated memory with the existing
checks.
>
>
> In fact, what if a future device supports more types?
> There does not need to be a negotiation about what driver
> needs, right?
>
I think I should add a check of the type: check that each type is only set
once, and break if I hit a type >= VIRTIO_NET_FF_MASK_TYPE_MAX.
I think that should be sufficient. If the spec is ever expanded to
include more selector types, it would have to insist they come after the
existing ones. The MAX_SEL_LEN check will come after the break on unknown
type.
Then it should be able to maintain compatibility with newer controllers.
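Roughly like this (sketch; the seen_types bitmap is hypothetical, and I'm
using VIRTIO_NET_FF_MASK_TYPE_MAX as you referenced it above):

	unsigned long seen_types = 0;

	for (i = 0; i < ff->ff_mask->count; i++) {
		/* Unknown future selector type: stop parsing, stay compatible. */
		if (sel->type >= VIRTIO_NET_FF_MASK_TYPE_MAX)
			break;

		/* Each selector type may appear at most once. */
		if (test_and_set_bit(sel->type, &seen_types)) {
			err = -EINVAL;
			goto err_ff_action;
		}

		if (sel->length > MAX_SEL_LEN) {
			err = -EINVAL;
			goto err_ff_action;
		}

		real_ff_mask_size += sizeof(struct virtio_net_ff_selector) + sel->length;
		if (real_ff_mask_size > ff_mask_size) {
			err = -EINVAL;
			goto err_ff_action;
		}

		sel = (void *)sel + sizeof(*sel) + sel->length;
	}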
>
>> +
* Re: [PATCH net-next v12 10/12] virtio_net: Add support for IPv6 ethtool steering
2025-11-24 23:12 ` Michael S. Tsirkin
@ 2025-11-25 0:10 ` Dan Jurgens
0 siblings, 0 replies; 40+ messages in thread
From: Dan Jurgens @ 2025-11-25 0:10 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 5:12 PM, Michael S. Tsirkin wrote:
> On Mon, Nov 24, 2025 at 05:04:30PM -0600, Dan Jurgens wrote:
>> On 11/24/25 3:59 PM, Michael S. Tsirkin wrote:
>>> On Wed, Nov 19, 2025 at 01:15:21PM -0600, Daniel Jurgens wrote:
>>>> Implement support for IPV6_USER_FLOW type rules.
>>>>
>>
>>>> return false;
>>>> @@ -5958,11 +5989,33 @@ static void parse_ip4(struct iphdr *mask, struct iphdr *key,
>>>> }
>>>> }
>>>>
>>>> +static void parse_ip6(struct ipv6hdr *mask, struct ipv6hdr *key,
>>>> + const struct ethtool_rx_flow_spec *fs)
>>>> +{
>>>
>>> I note logic wise it is different from ipv4, it is looking at the fs.
>>
>> I'm not following you here. They both get the l3_mask and l3_val from
>> the flow spec.
>
> yes but ipv4 is buggy in your patch.
>
Agreed, will fix that.
>>>
>>>> + const struct ethtool_usrip6_spec *l3_mask = &fs->m_u.usr_ip6_spec;
>>>> + const struct ethtool_usrip6_spec *l3_val = &fs->h_u.usr_ip6_spec;
>>>> +
>>>> + if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6src)) {
>>>> + memcpy(&mask->saddr, l3_mask->ip6src, sizeof(mask->saddr));
>>>> + memcpy(&key->saddr, l3_val->ip6src, sizeof(key->saddr));
>>>> + }
>>>> +
>>>> + if (!ipv6_addr_any((struct in6_addr *)l3_mask->ip6dst)) {
>>>> + memcpy(&mask->daddr, l3_mask->ip6dst, sizeof(mask->daddr));
>>>> + memcpy(&key->daddr, l3_val->ip6dst, sizeof(key->daddr));
>>>> + }
>>>
>>> Is this enough?
>>> For example, what if user tries to set up a filter by l4_proto ?
>>>
>>
>> That's in the next patch.
>
> yes but if just this one is applied (e.g. by bisect)?
>
1. You told me to move it to the TCP patch last review.
2. None of this code is really reachable until the get ops are added in
the last patch. ethtool needs to do gets to know if/how it can set.
Bisecting would be a strange way to try to debug this series, since
functionality is added by flow type.
>
>>>
>>>> +}
>>>> +
>>>> static bool has_ipv4(u32 flow_type)
>>>> {
>>>> return flow_type == IP_USER_FLOW;
>>>> }
>>>>
>>>> +static bool has_ipv6(u32 flow_type)
>>>> +{
>>>> + return flow_type == IPV6_USER_FLOW;
>>>> +}
>>>> +
>> dr);
>>>>
>>>> - if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
>>>> - fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
>>>> - fs->m_u.usr_ip4_spec.l4_4_bytes ||
>>>> - fs->m_u.usr_ip4_spec.ip_ver ||
>>>> - fs->m_u.usr_ip4_spec.proto)
>>>> - return -EINVAL;
>>>> + if (fs->h_u.usr_ip6_spec.l4_4_bytes ||
>>>> + fs->m_u.usr_ip6_spec.l4_4_bytes)
>>>> + return -EINVAL;
>>>>
>>>> - parse_ip4(v4_m, v4_k, fs);
>>>> + parse_ip6(v6_m, v6_k, fs);
>>>
>>>
>>> why does ipv6 not check unsupported fields unlike ipv4?
>>
>> The UAPI for user_ip6 doesn't make the same assertions:
>>
>> /**
>>
>> * struct ethtool_usrip6_spec - general flow specification for IPv6
>>
>> * @ip6src: Source host
>>
>> * @ip6dst: Destination host
>>
>> * @l4_4_bytes: First 4 bytes of transport (layer 4) header
>>
>> * @tclass: Traffic Class
>>
>> * @l4_proto: Transport protocol number (nexthdr after any Extension
>> Headers) ]
>> */
>>
>> /**
>> * struct ethtool_usrip4_spec - general flow specification for IPv4
>> * @ip4src: Source host
>> * @ip4dst: Destination host
>> * @l4_4_bytes: First 4 bytes of transport (layer 4) header
>> * @tos: Type-of-service
>> * @ip_ver: Value must be %ETH_RX_NFC_IP4; mask must be 0
>> * @proto: Transport protocol number; mask must be 0
>> */
>>
>> A check of l4_proto is probably reasonable though, since this is adding a
>> filter by IP only, so l4_proto should be unset.
>
>
> maybe run this by relevant maintainers.
>>
>>>
>>>> + } else {
>>>> + selector->type = VIRTIO_NET_FF_MASK_TYPE_IPV4;
>>>> + selector->length = sizeof(struct iphdr);
>>>> +
>>>> + if (fs->h_u.usr_ip4_spec.l4_4_bytes ||
>>>> + fs->h_u.usr_ip4_spec.ip_ver != ETH_RX_NFC_IP4 ||
>>>> + fs->m_u.usr_ip4_spec.l4_4_bytes ||
>>>> + fs->m_u.usr_ip4_spec.ip_ver ||
>>>> + fs->m_u.usr_ip4_spec.proto)
>>>> + return -EINVAL;
>>>> +
>>>> + parse_ip4(v4_m, v4_k, fs);
>>>> + }
>>>>
>>>> return 0;
>>>> }
>>>> --
>>>> 2.50.1
>>>
>
* Re: [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules
2025-11-19 19:15 ` [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
2025-11-24 21:05 ` Michael S. Tsirkin
@ 2025-11-25 14:25 ` Michael S. Tsirkin
2025-11-25 15:39 ` Dan Jurgens
1 sibling, 1 reply; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-25 14:25 UTC (permalink / raw)
To: Daniel Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 19, 2025 at 01:15:18PM -0600, Daniel Jurgens wrote:
> +static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
> + struct ethtool_rx_flow_spec *fs,
> + u16 curr_queue_pairs)
> +{
> + struct virtnet_ethtool_rule *eth_rule;
> + int err;
> +
> + if (!ff->ff_supported)
> + return -EOPNOTSUPP;
> +
> + err = validate_flow_input(ff, fs, curr_queue_pairs);
> + if (err)
> + return err;
> +
> + eth_rule = kzalloc(sizeof(*eth_rule), GFP_KERNEL);
> + if (!eth_rule)
> + return -ENOMEM;
> +
> + err = xa_alloc(&ff->ethtool.rules, &fs->location, eth_rule,
> + XA_LIMIT(0, le32_to_cpu(ff->ff_caps->rules_limit) - 1),
> + GFP_KERNEL);
> + if (err)
> + goto err_rule;
> +
> + eth_rule->flow_spec = *fs;
> +
> + err = build_and_insert(ff, eth_rule);
> + if (err)
> + goto err_xa;
btw kind of inelegant that we change fs->location if build_and_insert fails.
restore it?
> + return err;
> +
> +err_xa:
> + xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
> +
> +err_rule:
> + fs->location = RX_CLS_LOC_ANY;
> + kfree(eth_rule);
> +
> + return err;
> +}
* Re: [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules
2025-11-25 14:25 ` Michael S. Tsirkin
@ 2025-11-25 15:39 ` Dan Jurgens
0 siblings, 0 replies; 40+ messages in thread
From: Dan Jurgens @ 2025-11-25 15:39 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/25/25 8:25 AM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:18PM -0600, Daniel Jurgens wrote:
>> +static int virtnet_ethtool_flow_insert(struct virtnet_ff *ff,
>> + struct ethtool_rx_flow_spec *fs,
>> + u16 curr_queue_pairs)
>> + err = build_and_insert(ff, eth_rule);
>> + if (err)
>> + goto err_xa;
>
>
> btw kind of inelegant that we change fs->location if build_and_insert fails.
> restore it?
>
It's not needed based on the current implementation of ethtool; it won't
use the location field if the return is an error. Parav suggested I add
it during our internal review, to leave the input unchanged if we fail.
>> + return err;
>> +
>> +err_xa:
>> + xa_erase(&ff->ethtool.rules, eth_rule->flow_spec.location);
>> +
>> +err_rule:
>> + fs->location = RX_CLS_LOC_ANY;
>> + kfree(eth_rule);
>> +
>> + return err;
>> +}
>
* Re: [PATCH net-next v12 09/12] virtio_net: Implement IPv4 ethtool flow rules
2025-11-24 21:51 ` Michael S. Tsirkin
2025-11-24 22:41 ` Dan Jurgens
@ 2025-11-26 5:48 ` Dan Jurgens
1 sibling, 0 replies; 40+ messages in thread
From: Dan Jurgens @ 2025-11-26 5:48 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 3:51 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:20PM -0600, Daniel Jurgens wrote:
>> Add support for IP_USER type rules from ethtool.
>>
>> Example:
>> $ ethtool -U ens9 flow-type ip4 src-ip 192.168.51.101 action -1
>> Added rule with ID 1
>>
>> The example rule will drop packets with the source IP specified.
>>
>> Signed-off-by: Daniel Jurgens <danielj@nvidia.com>
>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>> Reviewed-by: Shahar Shitrit <shshitrit@nvidia.com>
>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
>> ---
>> v4:
>> - Fixed bug in protocol check of parse_ip4
>> - (u8 *) to (void *) casting.
>> - Alignment issues.
>>
>> v12
>> - refactor calculate_flow_sizes to remove goto. MST
>> - refactor build_and_insert to remove goto validate. MST
>> - Move parse_ip4 l3_mask check to TCP/UDP patch. MST
>> - Check saddr/daddr mask before copying in parse_ip4. MST
>> - Remove tos check in setup_ip_key_mask.
>
> So if user attempts to set a filter by tos now, what blocks it?
> because parse_ip4 seems to ignore it ...
>
Added it to validate_ip4_mask.
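i.e. in validate_ip4_mask, mirroring the existing field checks (sketch):

	if (mask->tos &&
	    !check_mask_vs_cap(&mask->tos, &cap->tos,
			       sizeof(u8), partial_mask))
		return false;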
* Re: [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps
2025-11-24 22:54 ` Michael S. Tsirkin
@ 2025-11-26 6:11 ` Dan Jurgens
0 siblings, 0 replies; 40+ messages in thread
From: Dan Jurgens @ 2025-11-26 6:11 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 4:54 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:16PM -0600, Daniel Jurgens wrote:
>> index 4738ffe3b5c6..e84a305d2b2a 100644
>> --- a/drivers/virtio/virtio_admin_commands.c
>> +++ b/drivers/virtio/virtio_admin_commands.c
>> @@ -161,6 +161,8 @@ int virtio_admin_obj_destroy(struct virtio_device *vdev,
>> err = vdev->config->admin_cmd_exec(vdev, &cmd);
>> kfree(data);
>>
>> + WARN_ON_ONCE(err);
>> +
>> return err;
>> }
>
>
> The reason I suggested WARN_ON_ONCE is because callers generally can not
> handle errors. if you return int you assume callers will do that so then
> warning does not make sense.
>
> Bottom line - make this return void.
>
>
Done. Also, this hunk was misplaced; I moved it to the patch where this
function was introduced.
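For reference, the end result on this function is roughly (sketch, body and
parameter list abbreviated):

-int virtio_admin_obj_destroy(struct virtio_device *vdev,
+void virtio_admin_obj_destroy(struct virtio_device *vdev,
 ...
 	err = vdev->config->admin_cmd_exec(vdev, &cmd);
 	kfree(data);
 
+	/* Callers cannot meaningfully handle a destroy failure. */
 	WARN_ON_ONCE(err);
-
-	return err;
 }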
* Re: [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules
2025-11-24 21:05 ` Michael S. Tsirkin
@ 2025-11-26 16:25 ` Dan Jurgens
2025-11-26 18:00 ` Michael S. Tsirkin
0 siblings, 1 reply; 40+ messages in thread
From: Dan Jurgens @ 2025-11-26 16:25 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On 11/24/25 3:05 PM, Michael S. Tsirkin wrote:
> On Wed, Nov 19, 2025 at 01:15:18PM -0600, Daniel Jurgens wrote:
>> @@ -5681,6 +5710,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
>> .get_rxfh_fields = virtnet_get_hashflow,
>> .set_rxfh_fields = virtnet_set_hashflow,
>> .get_rx_ring_count = virtnet_get_rx_ring_count,
>> + .set_rxnfc = virtnet_set_rxnfc,
>> };
>>
>> static void virtnet_get_queue_stats_rx(struct net_device *dev, int i,
>
> should we not wire up get_rxnfc too? weird to be able to set but
> not get.
>
I prefer to do that as the last patch. That's what really turns the
feature on. ethtool needs to do gets before it can set. Also, this patch
is already quite large.
* Re: [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules
2025-11-26 16:25 ` Dan Jurgens
@ 2025-11-26 18:00 ` Michael S. Tsirkin
0 siblings, 0 replies; 40+ messages in thread
From: Michael S. Tsirkin @ 2025-11-26 18:00 UTC (permalink / raw)
To: Dan Jurgens
Cc: netdev, jasowang, pabeni, virtualization, parav, shshitrit,
yohadt, xuanzhuo, eperezma, jgg, kevin.tian, kuba, andrew+netdev,
edumazet
On Wed, Nov 26, 2025 at 10:25:44AM -0600, Dan Jurgens wrote:
> On 11/24/25 3:05 PM, Michael S. Tsirkin wrote:
> > On Wed, Nov 19, 2025 at 01:15:18PM -0600, Daniel Jurgens wrote:
> >> @@ -5681,6 +5710,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
> >> .get_rxfh_fields = virtnet_get_hashflow,
> >> .set_rxfh_fields = virtnet_set_hashflow,
> >> .get_rx_ring_count = virtnet_get_rx_ring_count,
> >> + .set_rxnfc = virtnet_set_rxnfc,
> >> };
> >>
> >> static void virtnet_get_queue_stats_rx(struct net_device *dev, int i,
> >
> > should we not wire up get_rxnfc too? weird to be able to set but
> > not get.
> >
>
> I prefer to do that as the last patch. That's what really turns the
> feature on. ethtool needs to do gets before it can set. Also, this patch
> is already quite large.
ok
Thread overview: 40+ messages
2025-11-19 19:15 [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 01/12] virtio_pci: Remove supported_cap size build assert Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 02/12] virtio: Add config_op for admin commands Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 03/12] virtio: Expose generic device capability operations Daniel Jurgens
2025-11-24 20:30 ` Michael S. Tsirkin
2025-11-24 22:24 ` Dan Jurgens
2025-11-24 22:27 ` Michael S. Tsirkin
2025-11-19 19:15 ` [PATCH net-next v12 04/12] virtio: Expose object create and destroy API Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 05/12] virtio_net: Query and set flow filter caps Daniel Jurgens
2025-11-20 1:51 ` Jakub Kicinski
2025-11-20 15:39 ` Dan Jurgens
2025-11-24 21:01 ` Michael S. Tsirkin
2025-11-25 0:05 ` Dan Jurgens
2025-11-24 22:54 ` Michael S. Tsirkin
2025-11-26 6:11 ` Dan Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 06/12] virtio_net: Create a FF group for ethtool steering Daniel Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 07/12] virtio_net: Implement layer 2 ethtool flow rules Daniel Jurgens
2025-11-24 21:05 ` Michael S. Tsirkin
2025-11-26 16:25 ` Dan Jurgens
2025-11-26 18:00 ` Michael S. Tsirkin
2025-11-25 14:25 ` Michael S. Tsirkin
2025-11-25 15:39 ` Dan Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 08/12] virtio_net: Use existing classifier if possible Daniel Jurgens
2025-11-24 22:04 ` Michael S. Tsirkin
2025-11-24 22:31 ` Dan Jurgens
2025-11-24 22:38 ` Michael S. Tsirkin
2025-11-19 19:15 ` [PATCH net-next v12 09/12] virtio_net: Implement IPv4 ethtool flow rules Daniel Jurgens
2025-11-24 21:51 ` Michael S. Tsirkin
2025-11-24 22:41 ` Dan Jurgens
2025-11-26 5:48 ` Dan Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 10/12] virtio_net: Add support for IPv6 ethtool steering Daniel Jurgens
2025-11-24 21:59 ` Michael S. Tsirkin
2025-11-24 23:04 ` Dan Jurgens
2025-11-24 23:12 ` Michael S. Tsirkin
2025-11-25 0:10 ` Dan Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 11/12] virtio_net: Add support for TCP and UDP ethtool rules Daniel Jurgens
2025-11-24 22:02 ` Michael S. Tsirkin
2025-11-24 22:47 ` Dan Jurgens
2025-11-19 19:15 ` [PATCH net-next v12 12/12] virtio_net: Add get ethtool flow rules ops Daniel Jurgens
2025-11-19 20:22 ` [PATCH net-next v12 00/12] virtio_net: Add ethtool flow rules support Michael S. Tsirkin