public inbox for dev@dpdk.org
 help / color / mirror / Atom feed
* [PATCH 0/4] net/gve: add flow steering support
@ 2026-02-27 19:51 Jasper Tran O'Leary
  2026-02-27 19:51 ` [PATCH 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
                   ` (5 more replies)
  0 siblings, 6 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-02-27 19:51 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary

This patch series adds flow steering support to the Google Virtual
Ethernet (gve) driver. This functionality allows traffic to be directed
to specific receive queues based on user-specified flow patterns.

The series includes foundational support for extended admin queue
commands needed to handle flow rules, the specific admin queue commands
for flow rule management, and integration with the DPDK rte_flow API.
The series adds support for flow matching on the following protocols:
IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.

Patch Overview:

1. "net/gve: add flow steering device option" checks for and enables
   the flow steering capability in the device options during
   initialization.
2. "net/gve: introduce extended adminq command" adds infrastructure
   for sending extended admin queue commands. These commands use a
   flexible buffer descriptor format required for flow rule management.
3. "net/gve: add adminq commands for flow steering" implements the
   specific admin queue commands to add and remove flow rules on the
   device, including handling of rule IDs and parameters.
4. "net/gve: add rte flow API integration" exposes the flow steering
   functionality via the DPDK rte_flow API. This includes strict
   pattern validation, rule parsing, and lifecycle management (create,
   destroy, flush). It ensures thread-safe access to the flow subsystem
   and proper resource cleanup during device reset.
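
As an illustration of what the series enables, a matching rule could be
installed from testpmd along these lines (the port id, addresses, ports,
and queue index below are placeholders):

```
testpmd> flow create 0 ingress pattern eth / ipv4 src is 192.168.0.1 dst is 192.168.0.2 / tcp src is 80 dst is 8080 / end actions queue index 2 / end
```

This is subject to the limitations documented in patch 4 (full-match
masks only, queue action only).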

Jasper Tran O'Leary (2):
  net/gve: add adminq commands for flow steering
  net/gve: add rte flow API integration

Vee Agarwal (2):
  net/gve: add flow steering device option
  net/gve: introduce extended adminq command

 doc/guides/nics/features/gve.ini       |  12 +
 doc/guides/nics/gve.rst                |  20 +
 doc/guides/rel_notes/release_26_03.rst |   1 +
 drivers/net/gve/base/gve.h             |   3 +-
 drivers/net/gve/base/gve_adminq.c      | 118 ++++-
 drivers/net/gve/base/gve_adminq.h      |  57 +++
 drivers/net/gve/gve_ethdev.c           |  87 +++-
 drivers/net/gve/gve_ethdev.h           |  46 ++
 drivers/net/gve/gve_flow_rule.c        | 645 +++++++++++++++++++++++++
 drivers/net/gve/gve_flow_rule.h        |  64 +++
 drivers/net/gve/meson.build            |   1 +
 11 files changed, 1049 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/gve/gve_flow_rule.c
 create mode 100644 drivers/net/gve/gve_flow_rule.h

-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/4] net/gve: add flow steering device option
  2026-02-27 19:51 [PATCH 0/4] net/gve: add flow steering support Jasper Tran O'Leary
@ 2026-02-27 19:51 ` Jasper Tran O'Leary
  2026-02-27 19:51 ` [PATCH 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-02-27 19:51 UTC (permalink / raw)
  To: stephen; +Cc: dev, Vee Agarwal, Jasper Tran O'Leary

From: Vee Agarwal <veethebee@google.com>

Add a new device option to signal to the driver that the device supports
flow steering. This device option also carries the maximum number of
flow steering rules that the device can store.

Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 36 ++++++++++++++++++++++++++++---
 drivers/net/gve/base/gve_adminq.h | 11 ++++++++++
 drivers/net/gve/gve_ethdev.h      |  2 ++
 3 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 6bd98d5..64b9468 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -36,6 +36,7 @@ void gve_parse_device_option(struct gve_priv *priv,
 			     struct gve_device_option_gqi_rda **dev_op_gqi_rda,
 			     struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
 			     struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+			     struct gve_device_option_flow_steering **dev_op_flow_steering,
 			     struct gve_device_option_modify_ring **dev_op_modify_ring,
 			     struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
 {
@@ -109,6 +110,22 @@ void gve_parse_device_option(struct gve_priv *priv,
 		}
 		*dev_op_dqo_rda = RTE_PTR_ADD(option, sizeof(*option));
 		break;
+	case GVE_DEV_OPT_ID_FLOW_STEERING:
+		if (option_length < sizeof(**dev_op_flow_steering) ||
+		    req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING) {
+			PMD_DRV_LOG(WARNING, GVE_DEVICE_OPTION_ERROR_FMT,
+				    "Flow Steering", (int)sizeof(**dev_op_flow_steering),
+				    GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING,
+				    option_length, req_feat_mask);
+			break;
+		}
+
+		if (option_length > sizeof(**dev_op_flow_steering)) {
+			PMD_DRV_LOG(WARNING,
+				    GVE_DEVICE_OPTION_TOO_BIG_FMT, "Flow Steering");
+		}
+		*dev_op_flow_steering = RTE_PTR_ADD(option, sizeof(*option));
+		break;
 	case GVE_DEV_OPT_ID_MODIFY_RING:
 		/* Min ring size bound is optional. */
 		if (option_length < (sizeof(**dev_op_modify_ring) -
@@ -167,6 +184,7 @@ gve_process_device_options(struct gve_priv *priv,
 			   struct gve_device_option_gqi_rda **dev_op_gqi_rda,
 			   struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
 			   struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+			   struct gve_device_option_flow_steering **dev_op_flow_steering,
 			   struct gve_device_option_modify_ring **dev_op_modify_ring,
 			   struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
 {
@@ -188,8 +206,8 @@ gve_process_device_options(struct gve_priv *priv,
 
 		gve_parse_device_option(priv, dev_opt,
 					dev_op_gqi_rda, dev_op_gqi_qpl,
-					dev_op_dqo_rda, dev_op_modify_ring,
-					dev_op_jumbo_frames);
+					dev_op_dqo_rda, dev_op_flow_steering,
+					dev_op_modify_ring, dev_op_jumbo_frames);
 		dev_opt = next_opt;
 	}
 
@@ -777,9 +795,19 @@ gve_set_max_desc_cnt(struct gve_priv *priv,
 
 static void gve_enable_supported_features(struct gve_priv *priv,
 	u32 supported_features_mask,
+	const struct gve_device_option_flow_steering *dev_op_flow_steering,
 	const struct gve_device_option_modify_ring *dev_op_modify_ring,
 	const struct gve_device_option_jumbo_frames *dev_op_jumbo_frames)
 {
+	if (dev_op_flow_steering &&
+	    (supported_features_mask & GVE_SUP_FLOW_STEERING_MASK) &&
+	    dev_op_flow_steering->max_flow_rules) {
+		priv->max_flow_rules =
+			be32_to_cpu(dev_op_flow_steering->max_flow_rules);
+		PMD_DRV_LOG(INFO,
+			    "FLOW STEERING device option enabled with max rule limit of %u.",
+			    priv->max_flow_rules);
+	}
 	if (dev_op_modify_ring &&
 	    (supported_features_mask & GVE_SUP_MODIFY_RING_MASK)) {
 		PMD_DRV_LOG(INFO, "MODIFY RING device option enabled.");
@@ -802,6 +830,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 {
 	struct gve_device_option_jumbo_frames *dev_op_jumbo_frames = NULL;
 	struct gve_device_option_modify_ring *dev_op_modify_ring = NULL;
+	struct gve_device_option_flow_steering *dev_op_flow_steering = NULL;
 	struct gve_device_option_gqi_rda *dev_op_gqi_rda = NULL;
 	struct gve_device_option_gqi_qpl *dev_op_gqi_qpl = NULL;
 	struct gve_device_option_dqo_rda *dev_op_dqo_rda = NULL;
@@ -829,6 +858,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 
 	err = gve_process_device_options(priv, descriptor, &dev_op_gqi_rda,
 					 &dev_op_gqi_qpl, &dev_op_dqo_rda,
+					 &dev_op_flow_steering,
 					 &dev_op_modify_ring,
 					 &dev_op_jumbo_frames);
 	if (err)
@@ -884,7 +914,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 	priv->default_num_queues = be16_to_cpu(descriptor->default_num_queues);
 
 	gve_enable_supported_features(priv, supported_features_mask,
-				      dev_op_modify_ring,
+				      dev_op_flow_steering, dev_op_modify_ring,
 				      dev_op_jumbo_frames);
 
 free_device_descriptor:
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index 6a3d469..e237353 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -117,6 +117,14 @@ struct gve_ring_size_bound {
 
 GVE_CHECK_STRUCT_LEN(4, gve_ring_size_bound);
 
+struct gve_device_option_flow_steering {
+	__be32 supported_features_mask;
+	__be32 reserved;
+	__be32 max_flow_rules;
+};
+
+GVE_CHECK_STRUCT_LEN(12, gve_device_option_flow_steering);
+
 struct gve_device_option_modify_ring {
 	__be32 supported_features_mask;
 	struct gve_ring_size_bound max_ring_size;
@@ -148,6 +156,7 @@ enum gve_dev_opt_id {
 	GVE_DEV_OPT_ID_DQO_RDA = 0x4,
 	GVE_DEV_OPT_ID_MODIFY_RING = 0x6,
 	GVE_DEV_OPT_ID_JUMBO_FRAMES = 0x8,
+	GVE_DEV_OPT_ID_FLOW_STEERING = 0xb,
 };
 
 enum gve_dev_opt_req_feat_mask {
@@ -155,6 +164,7 @@ enum gve_dev_opt_req_feat_mask {
 	GVE_DEV_OPT_REQ_FEAT_MASK_GQI_RDA = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_GQI_QPL = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_DQO_RDA = 0x0,
+	GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_MODIFY_RING = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_JUMBO_FRAMES = 0x0,
 };
@@ -162,6 +172,7 @@ enum gve_dev_opt_req_feat_mask {
 enum gve_sup_feature_mask {
 	GVE_SUP_MODIFY_RING_MASK = 1 << 0,
 	GVE_SUP_JUMBO_FRAMES_MASK = 1 << 2,
+	GVE_SUP_FLOW_STEERING_MASK = 1 << 5,
 };
 
 #define GVE_DEV_OPT_LEN_GQI_RAW_ADDRESSING 0x0
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index f7cc781..3a810b6 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -332,6 +332,8 @@ struct gve_priv {
 
 	struct gve_rss_config rss_config;
 	struct gve_ptype_lut *ptype_lut_dqo;
+
+	uint32_t max_flow_rules;
 };
 
 static inline bool
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/4] net/gve: introduce extended adminq command
  2026-02-27 19:51 [PATCH 0/4] net/gve: add flow steering support Jasper Tran O'Leary
  2026-02-27 19:51 ` [PATCH 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
@ 2026-02-27 19:51 ` Jasper Tran O'Leary
  2026-02-27 19:51 ` [PATCH 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-02-27 19:51 UTC (permalink / raw)
  To: stephen; +Cc: dev, Vee Agarwal, Jasper Tran O'Leary

From: Vee Agarwal <veethebee@google.com>

Flow steering adminq commands are too large to fit into a normal adminq
command buffer, which accepts at most 56 bytes. As a result, introduce
extended adminq commands, which permit larger command buffers using
indirection: an extended command operation points to an inner command
buffer allocated at a specified DMA address. As specified by the device
interface, all extended commands use inner opcodes larger than 0xFF.
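
The 16-byte outer descriptor described above can be sketched in plain C
as follows (an illustration mirroring the patch's field names, not the
driver's actual code; the outer adminq entry carries opcode 0xFF plus
this descriptor, while the real command lives in a separate DMA buffer
at inner_command_addr):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of the extended-command descriptor embedded in the 64-byte
 * adminq entry. On the wire all fields are big endian. */
struct ext_cmd_sketch {
	uint32_t inner_opcode;       /* > 0xFF, e.g. 0x101 for flow rules */
	uint32_t inner_length;       /* byte length of the inner command */
	uint64_t inner_command_addr; /* DMA address of the inner buffer */
};
```

The 8-byte DMA address is naturally aligned at offset 8, which is what
keeps the descriptor at exactly 16 bytes.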

Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 30 ++++++++++++++++++++++++++++++
 drivers/net/gve/base/gve_adminq.h | 16 ++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 64b9468..0cc6d44 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -438,6 +438,8 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv,
 
 	memcpy(cmd, cmd_orig, sizeof(*cmd_orig));
 	opcode = be32_to_cpu(READ_ONCE32(cmd->opcode));
+	if (opcode == GVE_ADMINQ_EXTENDED_COMMAND)
+		opcode = be32_to_cpu(READ_ONCE32(cmd->extended_command.inner_opcode));
 
 	switch (opcode) {
 	case GVE_ADMINQ_DESCRIBE_DEVICE:
@@ -516,6 +518,34 @@ static int gve_adminq_execute_cmd(struct gve_priv *priv,
 	return gve_adminq_kick_and_wait(priv);
 }
 
+static int gve_adminq_execute_extended_cmd(struct gve_priv *priv, u32 opcode,
+					   size_t cmd_size, void *cmd_orig)
+{
+	union gve_adminq_command cmd;
+	struct gve_dma_mem inner_cmd_dma_mem;
+	void *inner_cmd;
+	int err;
+
+	inner_cmd = gve_alloc_dma_mem(&inner_cmd_dma_mem, cmd_size);
+	if (!inner_cmd)
+		return -ENOMEM;
+
+	memcpy(inner_cmd, cmd_orig, cmd_size);
+
+	memset(&cmd, 0, sizeof(cmd));
+	cmd.opcode = cpu_to_be32(GVE_ADMINQ_EXTENDED_COMMAND);
+	cmd.extended_command = (struct gve_adminq_extended_command) {
+		.inner_opcode = cpu_to_be32(opcode),
+		.inner_length = cpu_to_be32(cmd_size),
+		.inner_command_addr = cpu_to_be64(inner_cmd_dma_mem.pa),
+	};
+
+	err = gve_adminq_execute_cmd(priv, &cmd);
+
+	gve_free_dma_mem(&inner_cmd_dma_mem);
+	return err;
+}
+
 /* The device specifies that the management vector can either be the first irq
  * or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
  * the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index e237353..f52658e 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -25,8 +25,15 @@ enum gve_adminq_opcodes {
 	GVE_ADMINQ_REPORT_LINK_SPEED		= 0xD,
 	GVE_ADMINQ_GET_PTYPE_MAP		= 0xE,
 	GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY	= 0xF,
+	/* For commands that are larger than 56 bytes */
+	GVE_ADMINQ_EXTENDED_COMMAND		= 0xFF,
 };
 
+/* The normal adminq command is restricted to be 56 bytes at maximum. For the
+ * longer adminq command, it is wrapped by GVE_ADMINQ_EXTENDED_COMMAND with
+ * inner opcode of gve_adminq_extended_cmd_opcodes specified. The inner command
+ * is written in the dma memory allocated by GVE_ADMINQ_EXTENDED_COMMAND.
+ */
 /* Admin queue status codes */
 enum gve_adminq_statuses {
 	GVE_ADMINQ_COMMAND_UNSET			= 0x0,
@@ -194,6 +201,14 @@ enum gve_driver_capbility {
 #define GVE_DRIVER_CAPABILITY_FLAGS3 0x0
 #define GVE_DRIVER_CAPABILITY_FLAGS4 0x0
 
+struct gve_adminq_extended_command {
+	__be32 inner_opcode;
+	__be32 inner_length;
+	__be64 inner_command_addr;
+};
+
+GVE_CHECK_STRUCT_LEN(16, gve_adminq_extended_command);
+
 struct gve_driver_info {
 	u8 os_type;	/* 0x05 = DPDK */
 	u8 driver_major;
@@ -440,6 +455,7 @@ union gve_adminq_command {
 			struct gve_adminq_get_ptype_map get_ptype_map;
 			struct gve_adminq_verify_driver_compatibility
 				verify_driver_compatibility;
+			struct gve_adminq_extended_command extended_command;
 		};
 	};
 	u8 reserved[64];
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/4] net/gve: add adminq commands for flow steering
  2026-02-27 19:51 [PATCH 0/4] net/gve: add flow steering support Jasper Tran O'Leary
  2026-02-27 19:51 ` [PATCH 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
  2026-02-27 19:51 ` [PATCH 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
@ 2026-02-27 19:51 ` Jasper Tran O'Leary
  2026-02-27 19:51 ` [PATCH 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-02-27 19:51 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Vee Agarwal

Add new adminq commands for the driver to configure flow rules stored
on the device. Three sub-commands are supported for configuring flow
rules:
- create: creates a new flow rule with a specific rule_id.
- destroy: deletes an existing flow rule with the specified rule_id.
- flush: clears and deletes all currently active flow rules.
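
The wire format introduced below is size-checked by the patch with
GVE_CHECK_STRUCT_LEN(92, gve_adminq_configure_flow_rule). A plain-C size
sketch of that layout (field names mirror the patch; illustration only,
not driver code):

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors gve_flow_spec: 16 + 16 + 4 + 1 = 37 bytes of fields, padded
 * to 40 by the 4-byte alignment of the IP address arrays. */
struct spec_sketch {
	uint32_t src_ip[4];
	uint32_t dst_ip[4];
	union {
		struct {
			uint16_t src_port;
			uint16_t dst_port;
		};
		uint32_t spi;
	};
	union {
		uint8_t tos;
		uint8_t tclass;
	};
};

/* Mirrors gve_adminq_configure_flow_rule: 2 + 2 + (2 + 2 + 40 + 40) + 4
 * = 92 bytes, matching the patch's GVE_CHECK_STRUCT_LEN. */
struct cfg_sketch {
	uint16_t opcode;     /* add = 0, del = 1, reset = 2 */
	uint8_t padding[2];
	struct {
		uint16_t flow_type;
		uint16_t action; /* RX queue id */
		struct spec_sketch key;
		struct spec_sketch mask;
	} rule;
	uint32_t location;   /* rule slot on the device */
};
```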

Co-developed-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 52 +++++++++++++++++++++++++++
 drivers/net/gve/base/gve_adminq.h | 30 ++++++++++++++++
 drivers/net/gve/gve_ethdev.h      |  1 +
 drivers/net/gve/gve_flow_rule.h   | 59 +++++++++++++++++++++++++++++++
 4 files changed, 142 insertions(+)
 create mode 100644 drivers/net/gve/gve_flow_rule.h

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 0cc6d44..9a94591 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -239,6 +239,7 @@ int gve_adminq_alloc(struct gve_priv *priv)
 	priv->adminq_report_stats_cnt = 0;
 	priv->adminq_report_link_speed_cnt = 0;
 	priv->adminq_get_ptype_map_cnt = 0;
+	priv->adminq_cfg_flow_rule_cnt = 0;
 
 	/* Setup Admin queue with the device */
 	rte_pci_read_config(priv->pci_dev, &pci_rev_id, sizeof(pci_rev_id),
@@ -487,6 +488,9 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv,
 	case GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY:
 		priv->adminq_verify_driver_compatibility_cnt++;
 		break;
+	case GVE_ADMINQ_CONFIGURE_FLOW_RULE:
+		priv->adminq_cfg_flow_rule_cnt++;
+		break;
 	default:
 		PMD_DRV_LOG(ERR, "unknown AQ command opcode %d", opcode);
 	}
@@ -546,6 +550,54 @@ static int gve_adminq_execute_extended_cmd(struct gve_priv *priv, u32 opcode,
 	return err;
 }
 
+static int
+gve_adminq_configure_flow_rule(struct gve_priv *priv,
+			       struct gve_adminq_configure_flow_rule *flow_rule_cmd)
+{
+	int err = gve_adminq_execute_extended_cmd(priv,
+			GVE_ADMINQ_CONFIGURE_FLOW_RULE,
+			sizeof(struct gve_adminq_configure_flow_rule),
+			flow_rule_cmd);
+
+	return err;
+}
+
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+			     struct gve_flow_rule_params *rule, u32 loc)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_ADD),
+		.location = cpu_to_be32(loc),
+		.rule = {
+			.flow_type = cpu_to_be16(rule->flow_type),
+			.action = cpu_to_be16(rule->action),
+			.key = rule->key,
+			.mask = rule->mask,
+		},
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_del_flow_rule(struct gve_priv *priv, u32 loc)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_DEL),
+		.location = cpu_to_be32(loc),
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_reset_flow_rules(struct gve_priv *priv)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_RESET),
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
 /* The device specifies that the management vector can either be the first irq
  * or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
  * the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index f52658e..d8e5e6a 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -7,6 +7,7 @@
 #define _GVE_ADMINQ_H
 
 #include "gve_osdep.h"
+#include "../gve_flow_rule.h"
 
 /* Admin queue opcodes */
 enum gve_adminq_opcodes {
@@ -34,6 +35,10 @@ enum gve_adminq_opcodes {
  * inner opcode of gve_adminq_extended_cmd_opcodes specified. The inner command
  * is written in the dma memory allocated by GVE_ADMINQ_EXTENDED_COMMAND.
  */
+enum gve_adminq_extended_cmd_opcodes {
+	GVE_ADMINQ_CONFIGURE_FLOW_RULE	= 0x101,
+};
+
 /* Admin queue status codes */
 enum gve_adminq_statuses {
 	GVE_ADMINQ_COMMAND_UNSET			= 0x0,
@@ -434,6 +439,26 @@ struct gve_adminq_configure_rss {
 	__be64 indir_addr;
 };
 
+/* Flow rule definition for the admin queue using network byte order (big
+ * endian). This struct represents the hardware wire format and should not be
+ * used outside of admin queue contexts.
+ */
+struct gve_adminq_flow_rule {
+	__be16 flow_type;
+	__be16 action; /* RX queue id */
+	struct gve_flow_spec key;
+	struct gve_flow_spec mask;
+};
+
+struct gve_adminq_configure_flow_rule {
+	__be16 opcode;
+	u8 padding[2];
+	struct gve_adminq_flow_rule rule;
+	__be32 location;
+};
+
+GVE_CHECK_STRUCT_LEN(92, gve_adminq_configure_flow_rule);
+
 union gve_adminq_command {
 	struct {
 		__be32 opcode;
@@ -499,4 +524,9 @@ int gve_adminq_verify_driver_compatibility(struct gve_priv *priv,
 int gve_adminq_configure_rss(struct gve_priv *priv,
 			     struct gve_rss_config *rss_config);
 
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+			     struct gve_flow_rule_params *rule, u32 loc);
+int gve_adminq_del_flow_rule(struct gve_priv *priv, u32 loc);
+int gve_adminq_reset_flow_rules(struct gve_priv *priv);
+
 #endif /* _GVE_ADMINQ_H */
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index 3a810b6..4e07ca8 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -314,6 +314,7 @@ struct gve_priv {
 	uint32_t adminq_report_link_speed_cnt;
 	uint32_t adminq_get_ptype_map_cnt;
 	uint32_t adminq_verify_driver_compatibility_cnt;
+	uint32_t adminq_cfg_flow_rule_cnt;
 	volatile uint32_t state_flags;
 
 	/* Gvnic device link speed from hypervisor. */
diff --git a/drivers/net/gve/gve_flow_rule.h b/drivers/net/gve/gve_flow_rule.h
new file mode 100644
index 0000000..d1a2622
--- /dev/null
+++ b/drivers/net/gve/gve_flow_rule.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2022 Intel Corporation
+ */
+
+#ifndef _GVE_FLOW_RULE_H_
+#define _GVE_FLOW_RULE_H_
+
+#include "base/gve_osdep.h"
+
+enum gve_adminq_flow_rule_cfg_opcode {
+	GVE_FLOW_RULE_CFG_ADD	= 0,
+	GVE_FLOW_RULE_CFG_DEL	= 1,
+	GVE_FLOW_RULE_CFG_RESET	= 2,
+};
+
+enum gve_adminq_flow_type {
+	GVE_FLOW_TYPE_TCPV4,
+	GVE_FLOW_TYPE_UDPV4,
+	GVE_FLOW_TYPE_SCTPV4,
+	GVE_FLOW_TYPE_AHV4,
+	GVE_FLOW_TYPE_ESPV4,
+	GVE_FLOW_TYPE_TCPV6,
+	GVE_FLOW_TYPE_UDPV6,
+	GVE_FLOW_TYPE_SCTPV6,
+	GVE_FLOW_TYPE_AHV6,
+	GVE_FLOW_TYPE_ESPV6,
+};
+
+struct gve_flow_spec {
+	__be32 src_ip[4];
+	__be32 dst_ip[4];
+	union {
+		struct {
+			__be16 src_port;
+			__be16 dst_port;
+		};
+		__be32 spi;
+	};
+	union {
+		u8 tos;
+		u8 tclass;
+	};
+};
+
+/* Flow rule parameters using mixed endianness.
+ * - flow_type and action are guest endian.
+ * - key and mask are in network byte order (big endian), matching rte_flow.
+ * This struct is used by the driver when validating and creating flow rules;
+ * guest endian fields are only converted to network byte order within admin
+ * queue functions.
+ */
+struct gve_flow_rule_params {
+	u16 flow_type;
+	u16 action; /* RX queue id */
+	struct gve_flow_spec key;
+	struct gve_flow_spec mask;
+};
+
+#endif /* _GVE_FLOW_RULE_H_ */
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 4/4] net/gve: add rte flow API integration
  2026-02-27 19:51 [PATCH 0/4] net/gve: add flow steering support Jasper Tran O'Leary
                   ` (2 preceding siblings ...)
  2026-02-27 19:51 ` [PATCH 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
@ 2026-02-27 19:51 ` Jasper Tran O'Leary
  2026-02-27 22:52 ` [PATCH 0/4] net/gve: add flow steering support Stephen Hemminger
  2026-03-03  0:58 ` [PATCH v2 " Jasper Tran O'Leary
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-02-27 19:51 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Vee Agarwal

Implement driver callbacks for the following rte_flow operations:
create, destroy, and flush. This change enables receive flow steering
(RFS) for n-tuple flow rules in the gve driver.

The implementation supports matching ingress IPv4/IPv6 traffic combined
with the TCP, UDP, SCTP, ESP, or AH protocols. Supported match fields
include IP source/destination addresses, L4 source/destination ports
(for TCP/UDP/SCTP), and the SPI (for ESP/AH). The only supported action
is RTE_FLOW_ACTION_TYPE_QUEUE, which steers matching packets to a
specified Rx queue.
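
The gve.rst limitation added in this patch says masking is limited to
full matches (all zeros or all ones). A minimal sketch of how such a
per-field check might look (a hypothetical helper for illustration, not
the driver's actual validation code):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A field mask is acceptable only if every byte is 0x00 (field is a
 * wildcard) or every byte is 0xFF (field must match exactly). */
bool gve_mask_is_full_or_empty(const uint8_t *mask, size_t len)
{
	bool all_zero = true;
	bool all_ones = true;
	size_t i;

	for (i = 0; i < len; i++) {
		if (mask[i] != 0x00)
			all_zero = false;
		if (mask[i] != 0xFF)
			all_ones = false;
	}
	return all_zero || all_ones;
}
```

Partial masks such as a /24 IPv4 prefix would fail this check and the
rule would be rejected at validation time.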

Co-developed-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
---
 doc/guides/nics/features/gve.ini       |  12 +
 doc/guides/nics/gve.rst                |  20 +
 doc/guides/rel_notes/release_26_03.rst |   1 +
 drivers/net/gve/base/gve.h             |   3 +-
 drivers/net/gve/gve_ethdev.c           |  87 +++-
 drivers/net/gve/gve_ethdev.h           |  43 ++
 drivers/net/gve/gve_flow_rule.c        | 645 +++++++++++++++++++++++++
 drivers/net/gve/gve_flow_rule.h        |   5 +
 drivers/net/gve/meson.build            |   1 +
 9 files changed, 815 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/gve/gve_flow_rule.c

diff --git a/doc/guides/nics/features/gve.ini b/doc/guides/nics/features/gve.ini
index ed040a0..89c97fd 100644
--- a/doc/guides/nics/features/gve.ini
+++ b/doc/guides/nics/features/gve.ini
@@ -19,3 +19,15 @@ Linux                = Y
 x86-32               = Y
 x86-64               = Y
 Usage doc            = Y
+
+[rte_flow items]
+ah                   = Y
+esp                  = Y
+ipv4                 = Y
+ipv6                 = Y
+sctp                 = Y
+tcp                  = Y
+udp                  = Y
+
+[rte_flow actions]
+queue                = Y
diff --git a/doc/guides/nics/gve.rst b/doc/guides/nics/gve.rst
index 6b4d1f7..59e0066 100644
--- a/doc/guides/nics/gve.rst
+++ b/doc/guides/nics/gve.rst
@@ -103,6 +103,26 @@ the redirection table will be available for querying upon initial hash configura
 When performing redirection table updates,
 it is possible to update individual table entries.
 
+Flow Steering
+^^^^^^^^^^^^^
+
+The driver supports receive flow steering (RFS) via the standard ``rte_flow``
+API. This allows applications to steer traffic to specific queues based on
+5-tuple matching. 3-tuple matching may be supported in future releases.
+
+Supported Patterns:
+  - IPv4/IPv6 source and destination addresses.
+  - TCP/UDP/SCTP source and destination ports.
+  - ESP/AH SPI.
+
+Supported Actions:
+  - ``RTE_FLOW_ACTION_TYPE_QUEUE``: Steer packets to a specific Rx queue.
+
+Limitations:
+  - Only ingress flow rules are supported.
+  - Flow priorities are not supported (must be 0).
+  - Masking is limited to full matches, i.e. all zeros (0x00...0) or
+    all ones (0xFF...F).
+
 Application-Initiated Reset
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The driver allows an application to reset the gVNIC device.
diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst
index 1855d90..e45ed27 100644
--- a/doc/guides/rel_notes/release_26_03.rst
+++ b/doc/guides/rel_notes/release_26_03.rst
@@ -78,6 +78,7 @@ New Features
 * **Updated Google Virtual Ethernet (gve) driver.**
 
   * Added application-initiated device reset.
+  * Added support for receive flow steering.
 
 * **Updated Intel iavf driver.**
 
diff --git a/drivers/net/gve/base/gve.h b/drivers/net/gve/base/gve.h
index 99514cb..18363fa 100644
--- a/drivers/net/gve/base/gve.h
+++ b/drivers/net/gve/base/gve.h
@@ -50,7 +50,8 @@ enum gve_state_flags_bit {
 	GVE_PRIV_FLAGS_ADMIN_QUEUE_OK		= 1,
 	GVE_PRIV_FLAGS_DEVICE_RESOURCES_OK	= 2,
 	GVE_PRIV_FLAGS_DEVICE_RINGS_OK		= 3,
-	GVE_PRIV_FLAGS_NAPI_ENABLED		= 4,
+	GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK	= 4,
+	GVE_PRIV_FLAGS_NAPI_ENABLED		= 5,
 };
 
 enum gve_rss_hash_algorithm {
diff --git a/drivers/net/gve/gve_ethdev.c b/drivers/net/gve/gve_ethdev.c
index 5912fec..0d4caab 100644
--- a/drivers/net/gve/gve_ethdev.c
+++ b/drivers/net/gve/gve_ethdev.c
@@ -510,6 +510,57 @@ gve_free_ptype_lut_dqo(struct gve_priv *priv)
 	}
 }
 
+static void
+gve_flow_free_bmp(struct gve_priv *priv)
+{
+	rte_free(priv->avail_flow_rule_bmp_mem);
+	priv->avail_flow_rule_bmp_mem = NULL;
+	priv->avail_flow_rule_bmp = NULL;
+}
+
+static int
+gve_setup_flow_subsystem(struct gve_priv *priv)
+{
+	int err;
+
+	priv->flow_rule_bmp_size =
+			rte_bitmap_get_memory_footprint(priv->max_flow_rules);
+	priv->avail_flow_rule_bmp_mem = rte_zmalloc("gve_flow_rule_bmp",
+			priv->flow_rule_bmp_size, 0);
+	if (!priv->avail_flow_rule_bmp_mem) {
+		PMD_DRV_LOG(ERR, "Failed to alloc bitmap for flow rules.");
+		err = -ENOMEM;
+		goto free_flow_rule_bmp;
+	}
+
+	err = gve_flow_init_bmp(priv);
+	if (err) {
+		PMD_DRV_LOG(ERR, "Failed to initialize flow rule bitmap.");
+		goto free_flow_rule_bmp;
+	}
+
+	TAILQ_INIT(&priv->active_flows);
+	gve_set_flow_subsystem_ok(priv);
+
+	return 0;
+
+free_flow_rule_bmp:
+	gve_flow_free_bmp(priv);
+	return err;
+}
+
+static void
+gve_teardown_flow_subsystem(struct gve_priv *priv)
+{
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	gve_clear_flow_subsystem_ok(priv);
+	gve_flow_free_bmp(priv);
+	gve_free_flow_rules(priv);
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+}
+
 static void
 gve_teardown_device_resources(struct gve_priv *priv)
 {
@@ -519,7 +570,9 @@ gve_teardown_device_resources(struct gve_priv *priv)
 	if (gve_get_device_resources_ok(priv)) {
 		err = gve_adminq_deconfigure_device_resources(priv);
 		if (err)
-			PMD_DRV_LOG(ERR, "Could not deconfigure device resources: err=%d", err);
+			PMD_DRV_LOG(ERR,
+				"Could not deconfigure device resources: err=%d",
+				err);
 	}
 
 	gve_free_ptype_lut_dqo(priv);
@@ -543,6 +596,11 @@ gve_dev_close(struct rte_eth_dev *dev)
 			PMD_DRV_LOG(ERR, "Failed to stop dev.");
 	}
 
+	if (gve_get_flow_subsystem_ok(priv))
+		gve_teardown_flow_subsystem(priv);
+
+	pthread_mutex_destroy(&priv->flow_rule_lock);
+
 	gve_free_queues(dev);
 	gve_teardown_device_resources(priv);
 	gve_adminq_free(priv);
@@ -566,6 +624,9 @@ gve_dev_reset(struct rte_eth_dev *dev)
 	}
 
 	/* Tear down all device resources before re-initializing. */
+	if (gve_get_flow_subsystem_ok(priv))
+		gve_teardown_flow_subsystem(priv);
+
 	gve_free_queues(dev);
 	gve_teardown_device_resources(priv);
 	gve_adminq_free(priv);
@@ -1094,6 +1155,18 @@ gve_rss_reta_query(struct rte_eth_dev *dev,
 	return 0;
 }
 
+static int
+gve_flow_ops_get(struct rte_eth_dev *dev, const struct rte_flow_ops **ops)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+
+	if (!gve_get_flow_subsystem_ok(priv))
+		return -ENOTSUP;
+
+	*ops = &gve_flow_ops;
+	return 0;
+}
+
 static const struct eth_dev_ops gve_eth_dev_ops = {
 	.dev_configure        = gve_dev_configure,
 	.dev_start            = gve_dev_start,
@@ -1109,6 +1182,7 @@ static const struct eth_dev_ops gve_eth_dev_ops = {
 	.tx_queue_start       = gve_tx_queue_start,
 	.rx_queue_stop        = gve_rx_queue_stop,
 	.tx_queue_stop        = gve_tx_queue_stop,
+	.flow_ops_get         = gve_flow_ops_get,
 	.link_update          = gve_link_update,
 	.stats_get            = gve_dev_stats_get,
 	.stats_reset          = gve_dev_stats_reset,
@@ -1136,6 +1210,7 @@ static const struct eth_dev_ops gve_eth_dev_ops_dqo = {
 	.tx_queue_start       = gve_tx_queue_start_dqo,
 	.rx_queue_stop        = gve_rx_queue_stop_dqo,
 	.tx_queue_stop        = gve_tx_queue_stop_dqo,
+	.flow_ops_get         = gve_flow_ops_get,
 	.link_update          = gve_link_update,
 	.stats_get            = gve_dev_stats_get,
 	.stats_reset          = gve_dev_stats_reset,
@@ -1303,6 +1378,14 @@ gve_init_priv(struct gve_priv *priv, bool skip_describe_device)
 		    priv->max_nb_txq, priv->max_nb_rxq);
 
 setup_device:
+	if (priv->max_flow_rules) {
+		err = gve_setup_flow_subsystem(priv);
+		if (err)
+			PMD_DRV_LOG(WARNING,
+				    "Failed to set up flow subsystem: err=%d, flow steering will be disabled.",
+				    err);
+	}
+
 	err = gve_setup_device_resources(priv);
 	if (!err)
 		return 0;
@@ -1377,6 +1460,8 @@ gve_dev_init(struct rte_eth_dev *eth_dev)
 
 	eth_dev->data->mac_addrs = &priv->dev_addr;
 
+	pthread_mutex_init(&priv->flow_rule_lock, NULL);
+
 	return 0;
 }
 
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index 4e07ca8..2d570d0 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -9,6 +9,8 @@
 #include <ethdev_pci.h>
 #include <rte_ether.h>
 #include <rte_pci.h>
+#include <pthread.h>
+#include <rte_bitmap.h>
 
 #include "base/gve.h"
 
@@ -252,6 +254,13 @@ struct gve_rx_queue {
 	uint8_t is_gqi_qpl;
 };
 
+struct gve_flow {
+	uint32_t rule_id;
+	TAILQ_ENTRY(gve_flow) list_handle;
+};
+
+extern const struct rte_flow_ops gve_flow_ops;
+
 struct gve_priv {
 	struct gve_irq_db *irq_dbs; /* array of num_ntfy_blks */
 	const struct rte_memzone *irq_dbs_mz;
@@ -334,7 +343,13 @@ struct gve_priv {
 	struct gve_rss_config rss_config;
 	struct gve_ptype_lut *ptype_lut_dqo;
 
+	/* Flow rule management */
 	uint32_t max_flow_rules;
+	uint32_t flow_rule_bmp_size;
+	struct rte_bitmap *avail_flow_rule_bmp; /* Tracks available rule IDs (1 = available) */
+	void *avail_flow_rule_bmp_mem; /* Backing memory for the bitmap */
+	pthread_mutex_t flow_rule_lock; /* Lock for bitmap and tailq access */
+	TAILQ_HEAD(, gve_flow) active_flows;
 };
 
 static inline bool
@@ -407,6 +422,34 @@ gve_clear_device_rings_ok(struct gve_priv *priv)
 				&priv->state_flags);
 }
 
+static inline bool
+gve_get_flow_subsystem_ok(struct gve_priv *priv)
+{
+	bool ret;
+
+	ret = !!rte_bit_relaxed_get32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+				      &priv->state_flags);
+	rte_atomic_thread_fence(rte_memory_order_acquire);
+
+	return ret;
+}
+
+static inline void
+gve_set_flow_subsystem_ok(struct gve_priv *priv)
+{
+	rte_atomic_thread_fence(rte_memory_order_release);
+	rte_bit_relaxed_set32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+			      &priv->state_flags);
+}
+
+static inline void
+gve_clear_flow_subsystem_ok(struct gve_priv *priv)
+{
+	rte_atomic_thread_fence(rte_memory_order_release);
+	rte_bit_relaxed_clear32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+				&priv->state_flags);
+}
+
 int
 gve_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_id, uint16_t nb_desc,
 		   unsigned int socket_id, const struct rte_eth_rxconf *conf,
diff --git a/drivers/net/gve/gve_flow_rule.c b/drivers/net/gve/gve_flow_rule.c
new file mode 100644
index 0000000..fae5edf
--- /dev/null
+++ b/drivers/net/gve/gve_flow_rule.c
@@ -0,0 +1,645 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2023 Google LLC
+ */
+
+#include <rte_flow.h>
+#include <rte_flow_driver.h>
+#include "base/gve_adminq.h"
+#include "gve_ethdev.h"
+
+static int
+gve_validate_flow_attr(const struct rte_flow_attr *attr,
+		       struct rte_flow_error *error)
+{
+	if (!attr) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR, NULL,
+				"Invalid flow attribute");
+		return -EINVAL;
+	}
+	if (attr->egress || attr->transfer) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR, attr,
+				"Only ingress is supported");
+		return -EINVAL;
+	}
+	if (!attr->ingress) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, attr,
+				"Ingress attribute must be set");
+		return -EINVAL;
+	}
+	if (attr->priority != 0) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, attr,
+				"Priority levels are not supported");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void
+gve_parse_ipv4(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ipv4 *spec = item->spec;
+		const struct rte_flow_item_ipv4 *mask =
+			item->mask ? item->mask : &rte_flow_item_ipv4_mask;
+
+		rule->key.src_ip[0] = spec->hdr.src_addr;
+		rule->key.dst_ip[0] = spec->hdr.dst_addr;
+		rule->mask.src_ip[0] = mask->hdr.src_addr;
+		rule->mask.dst_ip[0] = mask->hdr.dst_addr;
+	}
+}
+
+static void
+gve_parse_ipv6(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ipv6 *spec = item->spec;
+		const struct rte_flow_item_ipv6 *mask =
+			item->mask ? item->mask : &rte_flow_item_ipv6_mask;
+		const __be32 *src_ip = (const __be32 *)&spec->hdr.src_addr;
+		const __be32 *src_mask = (const __be32 *)&mask->hdr.src_addr;
+		const __be32 *dst_ip = (const __be32 *)&spec->hdr.dst_addr;
+		const __be32 *dst_mask = (const __be32 *)&mask->hdr.dst_addr;
+		int i;
+
+		/*
+		 * The device expects IPv6 addresses as an array of 4 32-bit words
+		 * in reverse word order (the MSB word at index 3 and the LSB word
+		 * at index 0). We must reverse the DPDK network byte order array.
+		 */
+		for (i = 0; i < 4; i++) {
+			rule->key.src_ip[3 - i] = src_ip[i];
+			rule->key.dst_ip[3 - i] = dst_ip[i];
+			rule->mask.src_ip[3 - i] = src_mask[i];
+			rule->mask.dst_ip[3 - i] = dst_mask[i];
+		}
+	}
+}
+
+static void
+gve_parse_udp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_udp *spec = item->spec;
+		const struct rte_flow_item_udp *mask =
+			item->mask ? item->mask : &rte_flow_item_udp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_tcp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_tcp *spec = item->spec;
+		const struct rte_flow_item_tcp *mask =
+			item->mask ? item->mask : &rte_flow_item_tcp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_sctp(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_sctp *spec = item->spec;
+		const struct rte_flow_item_sctp *mask =
+			item->mask ? item->mask : &rte_flow_item_sctp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_esp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_esp *spec = item->spec;
+		const struct rte_flow_item_esp *mask =
+			item->mask ? item->mask : &rte_flow_item_esp_mask;
+
+		rule->key.spi = spec->hdr.spi;
+		rule->mask.spi = mask->hdr.spi;
+	}
+}
+
+static void
+gve_parse_ah(const struct rte_flow_item *item, struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ah *spec = item->spec;
+		const struct rte_flow_item_ah *mask =
+			item->mask ? item->mask : &rte_flow_item_ah_mask;
+
+		rule->key.spi = spec->spi;
+		rule->mask.spi = mask->spi;
+	}
+}
+
+static int
+gve_validate_and_parse_flow_pattern(const struct rte_flow_item pattern[],
+				    struct rte_flow_error *error,
+				    struct gve_flow_rule_params *rule)
+{
+	const struct rte_flow_item *item = pattern;
+	enum rte_flow_item_type l3_type = RTE_FLOW_ITEM_TYPE_VOID;
+	enum rte_flow_item_type l4_type = RTE_FLOW_ITEM_TYPE_VOID;
+
+	if (!pattern) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ITEM_NUM, NULL,
+				"Invalid flow pattern");
+		return -EINVAL;
+	}
+
+	for (; item->type != RTE_FLOW_ITEM_TYPE_END; item++) {
+		if (item->last) {
+			/* Last and range are not supported as match criteria. */
+			rte_flow_error_set(error, EINVAL,
+					   RTE_FLOW_ERROR_TYPE_ITEM,
+					   item,
+					   "No support for range");
+			return -EINVAL;
+		}
+		switch (item->type) {
+		case RTE_FLOW_ITEM_TYPE_VOID:
+			break;
+		case RTE_FLOW_ITEM_TYPE_IPV4:
+			if (l3_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L3 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ipv4(item, rule);
+			l3_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_IPV6:
+			if (l3_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L3 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ipv6(item, rule);
+			l3_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_udp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_tcp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_sctp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_esp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ah(item, rule);
+			l4_type = item->type;
+			break;
+		default:
+			rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ITEM, item,
+				   "Unsupported flow pattern item type");
+			return -EINVAL;
+		}
+	}
+
+	switch (l3_type) {
+	case RTE_FLOW_ITEM_TYPE_IPV4:
+		switch (l4_type) {
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			rule->flow_type = GVE_FLOW_TYPE_TCPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			rule->flow_type = GVE_FLOW_TYPE_UDPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			rule->flow_type = GVE_FLOW_TYPE_SCTPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			rule->flow_type = GVE_FLOW_TYPE_AHV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			rule->flow_type = GVE_FLOW_TYPE_ESPV4;
+			break;
+		default:
+			goto unsupported_flow;
+		}
+		break;
+	case RTE_FLOW_ITEM_TYPE_IPV6:
+		switch (l4_type) {
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			rule->flow_type = GVE_FLOW_TYPE_TCPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			rule->flow_type = GVE_FLOW_TYPE_UDPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			rule->flow_type = GVE_FLOW_TYPE_SCTPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			rule->flow_type = GVE_FLOW_TYPE_AHV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			rule->flow_type = GVE_FLOW_TYPE_ESPV6;
+			break;
+		default:
+			goto unsupported_flow;
+		}
+		break;
+	default:
+		goto unsupported_flow;
+	}
+
+	return 0;
+
+unsupported_flow:
+	rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM,
+			   NULL, "Unsupported L3/L4 combination");
+	return -EINVAL;
+}
+
+static int
+gve_validate_and_parse_flow_actions(struct rte_eth_dev *dev,
+				    const struct rte_flow_action actions[],
+				    struct rte_flow_error *error,
+				    struct gve_flow_rule_params *rule)
+{
+	const struct rte_flow_action_queue *action_queue;
+	const struct rte_flow_action *action = actions;
+	int num_queue_actions = 0;
+
+	if (!actions) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM, NULL,
+				   "Invalid flow actions");
+		return -EINVAL;
+	}
+
+	while (action->type != RTE_FLOW_ACTION_TYPE_END) {
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_VOID:
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			if (!action->conf) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+						   action,
+						   "QUEUE action config cannot be NULL.");
+				return -EINVAL;
+			}
+
+			action_queue = action->conf;
+			if (action_queue->index >= dev->data->nb_rx_queues) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+						   action, "Invalid Queue ID");
+				return -EINVAL;
+			}
+
+			rule->action = action_queue->index;
+			num_queue_actions++;
+			break;
+		default:
+			rte_flow_error_set(error, ENOTSUP,
+					   RTE_FLOW_ERROR_TYPE_ACTION,
+					   action,
+					   "Unsupported action. Only QUEUE is permitted.");
+			return -ENOTSUP;
+		}
+		action++;
+	}
+
+	if (num_queue_actions == 0) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM,
+				   NULL, "A QUEUE action is required.");
+		return -EINVAL;
+	}
+
+	if (num_queue_actions > 1) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM,
+				   NULL, "Only a single QUEUE action is allowed.");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+gve_validate_and_parse_flow(struct rte_eth_dev *dev,
+			    const struct rte_flow_attr *attr,
+			    const struct rte_flow_item pattern[],
+			    const struct rte_flow_action actions[],
+			    struct rte_flow_error *error,
+			    struct gve_flow_rule_params *rule)
+{
+	int err;
+
+	err = gve_validate_flow_attr(attr, error);
+	if (err)
+		return err;
+
+	err = gve_validate_and_parse_flow_pattern(pattern, error, rule);
+	if (err)
+		return err;
+
+	err = gve_validate_and_parse_flow_actions(dev, actions, error, rule);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+int
+gve_flow_init_bmp(struct gve_priv *priv)
+{
+	priv->avail_flow_rule_bmp = rte_bitmap_init_with_all_set(priv->max_flow_rules,
+			priv->avail_flow_rule_bmp_mem, priv->flow_rule_bmp_size);
+	if (!priv->avail_flow_rule_bmp) {
+		PMD_DRV_LOG(ERR, "Flow subsystem failed: cannot init bitmap.");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+/*
+ * The caller must acquire the flow rule lock before calling this function.
+ */
+int
+gve_free_flow_rules(struct gve_priv *priv)
+{
+	struct gve_flow *flow;
+	int err = 0;
+
+	if (!TAILQ_EMPTY(&priv->active_flows)) {
+		err = gve_adminq_reset_flow_rules(priv);
+		if (err) {
+			PMD_DRV_LOG(ERR,
+				"Failed to reset flow rules, internal device err=%d",
+				err);
+		}
+
+		/* Free flows even if AQ fails to avoid leaking memory. */
+		while (!TAILQ_EMPTY(&priv->active_flows)) {
+			flow = TAILQ_FIRST(&priv->active_flows);
+			TAILQ_REMOVE(&priv->active_flows, flow, list_handle);
+			rte_free(flow);
+		}
+	}
+
+	return err;
+}
+
+static struct rte_flow *
+gve_create_flow_rule(struct rte_eth_dev *dev,
+		     const struct rte_flow_attr *attr,
+		     const struct rte_flow_item pattern[],
+		     const struct rte_flow_action actions[],
+		     struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	struct gve_flow_rule_params rule = {0};
+	uint64_t bmp_slab __rte_unused;
+	struct gve_flow *flow;
+	int err;
+
+	err = gve_validate_and_parse_flow(dev, attr, pattern, actions, error,
+					  &rule);
+	if (err)
+		return NULL;
+
+	flow = rte_zmalloc("gve_flow", sizeof(struct gve_flow), 0);
+	if (!flow) {
+		rte_flow_error_set(error, ENOMEM,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to allocate memory for flow rule.");
+		return NULL;
+	}
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to create flow, flow subsystem not initialized.");
+		goto free_flow_and_unlock;
+	}
+
+	/* Try to allocate a new rule ID from the bitmap. */
+	if (rte_bitmap_scan(priv->avail_flow_rule_bmp, &flow->rule_id,
+			&bmp_slab) == 1) {
+		rte_bitmap_clear(priv->avail_flow_rule_bmp, flow->rule_id);
+	} else {
+		rte_flow_error_set(error, ENOMEM,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to create flow, could not allocate a new rule ID.");
+		goto free_flow_and_unlock;
+	}
+
+	err = gve_adminq_add_flow_rule(priv, &rule, flow->rule_id);
+	if (err) {
+		rte_bitmap_set(priv->avail_flow_rule_bmp, flow->rule_id);
+		rte_flow_error_set(error, -err,
+				   RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				   "Failed to create flow rule, internal device error.");
+		goto free_flow_and_unlock;
+	}
+
+	TAILQ_INSERT_TAIL(&priv->active_flows, flow, list_handle);
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+
+	return (struct rte_flow *)flow;
+
+free_flow_and_unlock:
+	rte_free(flow);
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return NULL;
+}
+
+static int
+gve_destroy_flow_rule(struct rte_eth_dev *dev, struct rte_flow *flow_handle,
+		      struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	struct gve_flow *flow;
+	bool flow_rule_active;
+	int err;
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, flow subsystem not initialized.");
+		err = -ENOTSUP;
+		goto unlock_and_return;
+	}
+
+	flow = (struct gve_flow *)flow_handle;
+
+	if (!flow) {
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, invalid flow provided.");
+		err = -EINVAL;
+		goto unlock_and_return;
+	}
+
+	if (flow->rule_id >= priv->max_flow_rules) {
+		PMD_DRV_LOG(ERR,
+			"Cannot destroy flow rule with invalid ID %u.",
+			flow->rule_id);
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, rule ID is invalid.");
+		err = -EINVAL;
+		goto unlock_and_return;
+	}
+
+	flow_rule_active = !rte_bitmap_get(priv->avail_flow_rule_bmp,
+					   flow->rule_id);
+
+	if (!flow_rule_active) {
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, handle not found in active list.");
+		err = -EINVAL;
+		goto unlock_and_return;
+	}
+
+	err = gve_adminq_del_flow_rule(priv, flow->rule_id);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, internal device error.");
+		goto unlock_and_return;
+	}
+
+	rte_bitmap_set(priv->avail_flow_rule_bmp, flow->rule_id);
+	TAILQ_REMOVE(&priv->active_flows, flow, list_handle);
+	rte_free(flow);
+
+	err = 0;
+
+unlock_and_return:
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return err;
+}
+
+static int
+gve_flush_flow_rules(struct rte_eth_dev *dev, struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	int err;
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to flush rules, flow subsystem not initialized.");
+		err = -ENOTSUP;
+		goto unlock_and_return;
+	}
+
+	err = gve_free_flow_rules(priv);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to flush rules due to internal device error, disabling flow subsystem.");
+		gve_clear_flow_subsystem_ok(priv);
+		goto unlock_and_return;
+	}
+
+	err = gve_flow_init_bmp(priv);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to re-initialize rule ID bitmap, disabling flow subsystem.");
+		gve_clear_flow_subsystem_ok(priv);
+		goto unlock_and_return;
+	}
+
+	err = 0;
+
+unlock_and_return:
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return err;
+}
+
+const struct rte_flow_ops gve_flow_ops = {
+	.create = gve_create_flow_rule,
+	.destroy = gve_destroy_flow_rule,
+	.flush = gve_flush_flow_rules,
+};
diff --git a/drivers/net/gve/gve_flow_rule.h b/drivers/net/gve/gve_flow_rule.h
index d1a2622..d483914 100644
--- a/drivers/net/gve/gve_flow_rule.h
+++ b/drivers/net/gve/gve_flow_rule.h
@@ -56,4 +56,9 @@ struct gve_flow_rule_params {
 	struct gve_flow_spec mask;
 };
 
+struct gve_priv;
+
+int gve_flow_init_bmp(struct gve_priv *priv);
+int gve_free_flow_rules(struct gve_priv *priv);
+
 #endif /* _GVE_FLOW_RULE_H_ */
diff --git a/drivers/net/gve/meson.build b/drivers/net/gve/meson.build
index c6a9f36..7074988 100644
--- a/drivers/net/gve/meson.build
+++ b/drivers/net/gve/meson.build
@@ -16,5 +16,6 @@ sources = files(
         'gve_ethdev.c',
         'gve_version.c',
         'gve_rss.c',
+        'gve_flow_rule.c',
 )
 includes += include_directories('base')
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 0/4] net/gve: add flow steering support
  2026-02-27 19:51 [PATCH 0/4] net/gve: add flow steering support Jasper Tran O'Leary
                   ` (3 preceding siblings ...)
  2026-02-27 19:51 ` [PATCH 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
@ 2026-02-27 22:52 ` Stephen Hemminger
  2026-03-03  1:00   ` Jasper Tran O'Leary
  2026-03-03  0:58 ` [PATCH v2 " Jasper Tran O'Leary
  5 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2026-02-27 22:52 UTC (permalink / raw)
  To: Jasper Tran O'Leary; +Cc: dev

On Fri, 27 Feb 2026 19:51:22 +0000
"Jasper Tran O'Leary" <jtranoleary@google.com> wrote:

> This patch series adds flow steering support to the Google Virtual
> Ethernet (gve) driver. This functionality allows traffic to be directed
> to specific receive queues based on user-specified flow patterns.
> 
> The series includes foundational support for extended admin queue
> commands needed to handle flow rules, the specific adminqueue commands
> for flow rule management, and the integration with the DPDK rte_flow
> API. The series adds support flow matching on the following protocols:
> IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.
> 
> Patch Overview:
> 
> 1. "net/gve: add flow steering device option" checks for and enables
>    the flow steering capability in the device options during
>    initialization.
> 2. "net/gve: introduce extended adminq command" adds infrastructure
>    for sending extended admin queue commands. These commands use a
>    flexible buffer descriptor format required for flow rule management.
> 3. "net/gve: add adminq commands for flow steering" implements the
>    specific admin queue commands to add and remove flow rules on the
>    device, including handling of rule IDs and parameters.
> 4. "net/gve: add rte flow API integration" exposes the flow steering
>    functionality via the DPDK rte_flow API. This includes strict
>    pattern validation, rule parsing, and lifecycle management (create,
>    destroy, flush). It ensures thread-safe access to the flow subsystem
>    and proper resource cleanup during device reset.
> 
> Jasper Tran O'Leary (2):
>   net/gve: add adminq commands for flow steering
>   net/gve: add rte flow API integration
> 
> Vee Agarwal (2):
>   net/gve: add flow steering device option
>   net/gve: introduce extended adminq command
> 
>  doc/guides/nics/features/gve.ini       |  12 +
>  doc/guides/nics/gve.rst                |  20 +
>  doc/guides/rel_notes/release_26_03.rst |   1 +
>  drivers/net/gve/base/gve.h             |   3 +-
>  drivers/net/gve/base/gve_adminq.c      | 118 ++++-
>  drivers/net/gve/base/gve_adminq.h      |  57 +++
>  drivers/net/gve/gve_ethdev.c           |  87 +++-
>  drivers/net/gve/gve_ethdev.h           |  46 ++
>  drivers/net/gve/gve_flow_rule.c        | 645 +++++++++++++++++++++++++
>  drivers/net/gve/gve_flow_rule.h        |  64 +++
>  drivers/net/gve/meson.build            |   1 +
>  11 files changed, 1049 insertions(+), 5 deletions(-)
>  create mode 100644 dpdk/drivers/net/gve/gve_flow_rule.c
>  create mode 100644 dpdk/drivers/net/gve/gve_flow_rule.h
> 

There is a lot here, so I sent an AI to take a look.

Summary:
Error 1 — Resource leak: If gve_flush_flow_rules fails and disables the
flow subsystem via gve_clear_flow_subsystem_ok(), the bitmap memory
(avail_flow_rule_bmp_mem) is never freed because both gve_dev_close and
gve_dev_reset gate their teardown on gve_get_flow_subsystem_ok()
returning true. The guard needs to check for allocated memory rather
than subsystem state.

Error 2 — pthread_mutex_init with NULL attributes in shared memory: The
flow_rule_lock lives in dev_private (DPDK shared memory) but is
initialized without PTHREAD_PROCESS_SHARED. This will cause undefined
behavior with secondary processes. Switching to rte_spinlock_t would be
the simplest fix since the critical sections are short (bitmap scan +
TAILQ operations).

Patches 1–3 are clean. The overall structure and error handling in patch 4 are solid — the allocation-before-lock pattern, bitmap rollback on adminq failure, and defense-in-depth validation in destroy are all well done.

Long form:
# Code Review: GVE Flow Steering Patch Series (4 patches)

**Series**: `[PATCH 1/4]` through `[PATCH 4/4]`
**Author**: Jasper Tran O'Leary / Vee Agarwal (Google)
**Subject**: Add receive flow steering (RFS) support to the GVE driver

---

## Summary

This series adds n-tuple flow steering to the GVE (Google Virtual Ethernet) driver via the `rte_flow` API. The implementation is cleanly structured across four patches: device option discovery (1/4), extended adminq infrastructure (2/4), flow rule adminq commands (3/4), and full rte_flow integration (4/4). The code quality is generally good with thorough validation, proper locking, and well-structured error handling.

Two correctness issues were identified: a resource leak when the flow subsystem is disabled on error, and a missing `PTHREAD_PROCESS_SHARED` attribute on a mutex in shared memory.

---

## Patch 1/4: `net/gve: add flow steering device option`

### Errors

None.

### Warnings

None.

This patch is clean. The device option parsing follows the established pattern for existing options (modify_ring, jumbo_frames), the byte-swap of `max_flow_rules` is correct, and the zero-check on the big-endian value before byte-swap is valid (non-zero in any byte order is still non-zero).
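
A tiny standalone sketch of that last point (the names are illustrative, not the driver's): byte-swapping permutes bytes without discarding any, so a value is zero after the swap if and only if it was zero before, which is why testing the raw big-endian field is safe.

```c
#include <stdint.h>

/* Illustrative only: a manual 32-bit byte swap. No byte is lost, so the
 * result is non-zero exactly when the input is non-zero. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00U) |
	       ((v << 8) & 0x00ff0000U) | (v << 24);
}
```

So a non-zero check on the raw big-endian `max_flow_rules` field gives the same answer as checking the byte-swapped value.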

---

## Patch 2/4: `net/gve: introduce extended adminq command`

### Errors

None.

### Warnings

None.

The extended command mechanism is straightforward: allocate DMA memory for the inner command, copy the command in, set up the outer wrapper with the DMA address, execute, and free. The error path is correct — `gve_free_dma_mem` is called after `gve_adminq_execute_cmd` regardless of success or failure.
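
The shape of that path can be sketched in a few lines. Everything below is a stand-in (plain `malloc` for the DMA allocator, a stub executor), not the driver's actual API; the property it demonstrates is that the buffer is freed on both the success and error paths.

```c
#include <stdlib.h>
#include <string.h>

/* Stub standing in for the adminq execute step; always succeeds here. */
static int execute_stub(void *outer_cmd)
{
	(void)outer_cmd;
	return 0;
}

/* Allocate a buffer for the inner command, copy it in, execute via the
 * outer wrapper, and free the buffer regardless of the result. */
static int send_extended_cmd(const void *inner, size_t len)
{
	void *buf = malloc(len);    /* stands in for DMA allocation */
	int err;

	if (buf == NULL)
		return -1;
	memcpy(buf, inner, len);    /* copy inner command into the buffer */
	err = execute_stub(buf);    /* outer wrapper points at the buffer */
	free(buf);                  /* freed on success and on failure */
	return err;
}
```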

---

## Patch 3/4: `net/gve: add adminq commands for flow steering`

### Errors

None.

### Warnings

**1. `gve_flow_rule.h` copyright appears to be copied from an Intel driver file.**

```c
/* SPDX-License-Identifier: BSD-3-Clause
 * Copyright(C) 2022 Intel Corporation
 */
```

This is a new file in the Google GVE driver. The Intel copyright and 2022 date look like they were carried over from whichever file was used as a template. This should be updated to reflect the actual author.

---

## Patch 4/4: `net/gve: add rte flow API integration`

This is the largest patch (815 lines added) and where the significant findings are.

### Errors

**1. Resource leak: bitmap memory leaked when flow subsystem is disabled on error.**

In `gve_flush_flow_rules`, if either `gve_free_flow_rules()` or `gve_flow_init_bmp()` fails, the code disables the subsystem:

```c
gve_clear_flow_subsystem_ok(priv);
```

However, in both `gve_dev_close` and `gve_dev_reset`, teardown is gated on the flag:

```c
if (gve_get_flow_subsystem_ok(priv))
    gve_teardown_flow_subsystem(priv);
```

If the subsystem was disabled by a failed flush, `gve_teardown_flow_subsystem` is never called, and `priv->avail_flow_rule_bmp_mem` is never freed. This is a memory leak.

**Suggested fix**: Either (a) always call `gve_flow_free_bmp(priv)` in close/reset regardless of the flag, or (b) have the flush error path free the bitmap memory itself, or (c) unconditionally call teardown in close/reset:

```c
/* In gve_dev_close / gve_dev_reset: */
if (priv->avail_flow_rule_bmp_mem)
    gve_teardown_flow_subsystem(priv);
```

**2. `pthread_mutex_init` without `PTHREAD_PROCESS_SHARED` on a mutex in shared memory.**

In `gve_dev_init`:

```c
pthread_mutex_init(&priv->flow_rule_lock, NULL);
```

`priv` is `dev->data->dev_private`, which is allocated in DPDK shared memory accessible by both primary and secondary processes. A pthread mutex in shared memory initialized with `NULL` attributes has undefined behavior when used across processes. It may appear to work in testing but fail in production with secondary processes.

**Suggested fix**: Either use `PTHREAD_PROCESS_SHARED`:

```c
pthread_mutexattr_t attr;
pthread_mutexattr_init(&attr);
pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
pthread_mutex_init(&priv->flow_rule_lock, &attr);
pthread_mutexattr_destroy(&attr);
```

Or switch to `rte_spinlock_t` which works correctly in shared memory without special initialization (appropriate here since the critical sections are short — bitmap scan, TAILQ insert/remove):

```c
rte_spinlock_t flow_rule_lock;
/* ... */
rte_spinlock_init(&priv->flow_rule_lock);
```

### Warnings

**3. Implicit pointer comparisons throughout `gve_flow_rule.c`.**

Several places use `!pointer` instead of `pointer == NULL`:

```c
if (!flow) {          /* line ~1641, ~1710 */
if (!action->conf) {  /* line ~1510 */
if (!attr) {          /* line ~1173 */
if (!actions) {       /* line ~1498 */
if (!pattern) {       /* line ~1329 */
```

DPDK coding style requires explicit comparison with NULL for pointers. The idiomatic form is `if (flow == NULL)`.

**4. `rte_zmalloc` used for flow rule metadata that does not require hugepage memory.**

The `gve_flow` structs (a `uint32_t` rule ID plus TAILQ linkage pointers) are allocated via `rte_zmalloc`. These are small control-plane structures not accessed by DMA and not requiring NUMA placement. Standard `malloc` would be more appropriate per DPDK guidelines and would not consume limited hugepage resources.

Similarly, `priv->avail_flow_rule_bmp_mem` is allocated with `rte_zmalloc`. Since `rte_bitmap` may require specific alignment, this one is more defensible, but worth considering whether standard allocation would suffice.

**5. Standalone `rte_atomic_thread_fence()` in flow subsystem flag accessors.**

The `gve_get_flow_subsystem_ok` / `gve_set_flow_subsystem_ok` / `gve_clear_flow_subsystem_ok` functions use standalone fences with `rte_bit_relaxed_*` operations:

```c
static inline bool
gve_get_flow_subsystem_ok(struct gve_priv *priv)
{
    bool ret;
    ret = !!rte_bit_relaxed_get32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
                                  &priv->state_flags);
    rte_atomic_thread_fence(rte_memory_order_acquire);
    return ret;
}
```

These follow the existing pattern for other state flags in the driver, so this is consistent. However, it's worth noting that in every call site in this patch, the flag is checked while holding `flow_rule_lock`, which already provides the necessary memory ordering. The fences are redundant in those paths (but harmless).

**6. RST documentation uses simple bullet lists where definition lists would be cleaner.**

In `doc/guides/nics/gve.rst`:

```rst
Supported Patterns:
  - IPv4/IPv6 source and destination addresses.
  - TCP/UDP/SCTP source and destination ports.
  - ESP/AH SPI.
```

These term+list groupings would read better as RST definition lists. This is minor given the lists are short.

### Design Notes (Info)

**Well-structured error handling in `gve_create_flow_rule`**: The allocation-before-lock pattern avoids holding the mutex during memory allocation, and the `free_flow_and_unlock` goto label correctly handles all error paths. The bitmap-set-on-error rollback in the adminq failure case is also correct.
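
A minimal sketch of that pattern, with a pthread mutex and a counter standing in for the driver's lock and rule-ID bitmap (all names here are illustrative):

```c
#include <pthread.h>
#include <stdlib.h>

struct demo_flow {
	unsigned int id;
};

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned int demo_next_id; /* stand-in for the rule-ID bitmap */

static struct demo_flow *demo_create_flow(void)
{
	/* Allocate before taking the lock, so the mutex is never held
	 * across a potentially slow allocation. */
	struct demo_flow *f = calloc(1, sizeof(*f));

	if (f == NULL)
		return NULL;

	pthread_mutex_lock(&demo_lock);
	f->id = demo_next_id++; /* stand-in for bitmap scan + clear */
	pthread_mutex_unlock(&demo_lock);
	return f;
}
```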

**`gve_destroy_flow_rule` double-validates flow state**: The function checks both the flow pointer for NULL and verifies `rule_id < max_flow_rules` before checking the bitmap. This defense-in-depth is good practice.

**IPv6 word-reversal in `gve_parse_ipv6`**: The comment explaining the device's expected word order is clear and the implementation correctly reverses the 32-bit words. This is the kind of hardware-specific detail that benefits from the inline comment.
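
The reversal itself is small enough to show in isolation (plain `uint32_t` arrays here, not the rte_flow structures):

```c
#include <stdint.h>

/* Copy four 32-bit words, reversing the word order: the input's word 0
 * (the most-significant word in network order) lands at output index 3.
 * Byte order within each word is untouched. */
static void reverse_ip6_words(const uint32_t in[4], uint32_t out[4])
{
	int i;

	for (i = 0; i < 4; i++)
		out[3 - i] = in[i];
}
```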

---

## Cross-Patch Observations

The series is well-ordered: each patch builds incrementally and the dependencies flow naturally (device option → extended command infrastructure → flow rule commands → rte_flow integration). The documentation, feature matrix, and release notes are all updated in patch 4/4 together with the code, which is correct per DPDK guidelines.

The `Co-developed-by` / `Signed-off-by` tag sequences are correctly formatted per Linux kernel convention.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 0/4] net/gve: add flow steering support
  2026-02-27 19:51 [PATCH 0/4] net/gve: add flow steering support Jasper Tran O'Leary
                   ` (4 preceding siblings ...)
  2026-02-27 22:52 ` [PATCH 0/4] net/gve: add flow steering support Stephen Hemminger
@ 2026-03-03  0:58 ` Jasper Tran O'Leary
  2026-03-03  0:58   ` [PATCH v2 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
                     ` (5 more replies)
  5 siblings, 6 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-03  0:58 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary

This patch series adds flow steering support to the Google Virtual
Ethernet (gve) driver. This functionality allows traffic to be directed
to specific receive queues based on user-specified flow patterns.

The series includes foundational support for extended admin queue
commands needed to handle flow rules, the specific admin queue commands
for flow rule management, and integration with the DPDK rte_flow API.
The series adds support for flow matching on the following protocols:
IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.

Patch Overview:

1. "net/gve: add flow steering device option" checks for and enables
   the flow steering capability in the device options during
   initialization.
2. "net/gve: introduce extended adminq command" adds infrastructure
   for sending extended admin queue commands. These commands use a
   flexible buffer descriptor format required for flow rule management.
3. "net/gve: add adminq commands for flow steering" implements the
   specific admin queue commands to add and remove flow rules on the
   device, including handling of rule IDs and parameters.
4. "net/gve: add rte flow API integration" exposes the flow steering
   functionality via the DPDK rte_flow API. This includes strict
   pattern validation, rule parsing, and lifecycle management (create,
   destroy, flush). It ensures thread-safe access to the flow subsystem
   and proper resource cleanup during device reset.

Jasper Tran O'Leary (2):
  net/gve: add adminq commands for flow steering
  net/gve: add rte flow API integration

Vee Agarwal (2):
  net/gve: add flow steering device option
  net/gve: introduce extended adminq command

 doc/guides/nics/features/gve.ini       |  12 +
 doc/guides/nics/gve.rst                |  26 +
 doc/guides/rel_notes/release_26_03.rst |   1 +
 drivers/net/gve/base/gve.h             |   3 +-
 drivers/net/gve/base/gve_adminq.c      | 118 ++++-
 drivers/net/gve/base/gve_adminq.h      |  57 +++
 drivers/net/gve/gve_ethdev.c           |  83 +++-
 drivers/net/gve/gve_ethdev.h           |  46 ++
 drivers/net/gve/gve_flow_rule.c        | 656 +++++++++++++++++++++++++
 drivers/net/gve/gve_flow_rule.h        |  65 +++
 drivers/net/gve/meson.build            |   1 +
 11 files changed, 1063 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/gve/gve_flow_rule.c
 create mode 100644 drivers/net/gve/gve_flow_rule.h

-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v2 1/4] net/gve: add flow steering device option
  2026-03-03  0:58 ` [PATCH v2 " Jasper Tran O'Leary
@ 2026-03-03  0:58   ` Jasper Tran O'Leary
  2026-03-03  0:58   ` [PATCH v2 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-03  0:58 UTC (permalink / raw)
  To: stephen; +Cc: dev, Vee Agarwal, Jasper Tran O'Leary, Joshua Washington

From: Vee Agarwal <veethebee@google.com>

Add a new device option to signal to the driver that the device supports
flow steering. This device option also carries the maximum number of
flow steering rules that the device can store.

Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 36 ++++++++++++++++++++++++++++---
 drivers/net/gve/base/gve_adminq.h | 11 ++++++++++
 drivers/net/gve/gve_ethdev.h      |  2 ++
 3 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 6bd98d5..64b9468 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -36,6 +36,7 @@ void gve_parse_device_option(struct gve_priv *priv,
 			     struct gve_device_option_gqi_rda **dev_op_gqi_rda,
 			     struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
 			     struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+			     struct gve_device_option_flow_steering **dev_op_flow_steering,
 			     struct gve_device_option_modify_ring **dev_op_modify_ring,
 			     struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
 {
@@ -109,6 +110,22 @@ void gve_parse_device_option(struct gve_priv *priv,
 		}
 		*dev_op_dqo_rda = RTE_PTR_ADD(option, sizeof(*option));
 		break;
+	case GVE_DEV_OPT_ID_FLOW_STEERING:
+		if (option_length < sizeof(**dev_op_flow_steering) ||
+		    req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING) {
+			PMD_DRV_LOG(WARNING, GVE_DEVICE_OPTION_ERROR_FMT,
+				    "Flow Steering", (int)sizeof(**dev_op_flow_steering),
+				    GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING,
+				    option_length, req_feat_mask);
+			break;
+		}
+
+		if (option_length > sizeof(**dev_op_flow_steering)) {
+			PMD_DRV_LOG(WARNING,
+				    GVE_DEVICE_OPTION_TOO_BIG_FMT, "Flow Steering");
+		}
+		*dev_op_flow_steering = RTE_PTR_ADD(option, sizeof(*option));
+		break;
 	case GVE_DEV_OPT_ID_MODIFY_RING:
 		/* Min ring size bound is optional. */
 		if (option_length < (sizeof(**dev_op_modify_ring) -
@@ -167,6 +184,7 @@ gve_process_device_options(struct gve_priv *priv,
 			   struct gve_device_option_gqi_rda **dev_op_gqi_rda,
 			   struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
 			   struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+			   struct gve_device_option_flow_steering **dev_op_flow_steering,
 			   struct gve_device_option_modify_ring **dev_op_modify_ring,
 			   struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
 {
@@ -188,8 +206,8 @@ gve_process_device_options(struct gve_priv *priv,
 
 		gve_parse_device_option(priv, dev_opt,
 					dev_op_gqi_rda, dev_op_gqi_qpl,
-					dev_op_dqo_rda, dev_op_modify_ring,
-					dev_op_jumbo_frames);
+					dev_op_dqo_rda, dev_op_flow_steering,
+					dev_op_modify_ring, dev_op_jumbo_frames);
 		dev_opt = next_opt;
 	}
 
@@ -777,9 +795,19 @@ gve_set_max_desc_cnt(struct gve_priv *priv,
 
 static void gve_enable_supported_features(struct gve_priv *priv,
 	u32 supported_features_mask,
+	const struct gve_device_option_flow_steering *dev_op_flow_steering,
 	const struct gve_device_option_modify_ring *dev_op_modify_ring,
 	const struct gve_device_option_jumbo_frames *dev_op_jumbo_frames)
 {
+	if (dev_op_flow_steering &&
+	    (supported_features_mask & GVE_SUP_FLOW_STEERING_MASK) &&
+	    dev_op_flow_steering->max_flow_rules) {
+		priv->max_flow_rules =
+			be32_to_cpu(dev_op_flow_steering->max_flow_rules);
+		PMD_DRV_LOG(INFO,
+			    "FLOW STEERING device option enabled with max rule limit of %u.",
+			    priv->max_flow_rules);
+	}
 	if (dev_op_modify_ring &&
 	    (supported_features_mask & GVE_SUP_MODIFY_RING_MASK)) {
 		PMD_DRV_LOG(INFO, "MODIFY RING device option enabled.");
@@ -802,6 +830,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 {
 	struct gve_device_option_jumbo_frames *dev_op_jumbo_frames = NULL;
 	struct gve_device_option_modify_ring *dev_op_modify_ring = NULL;
+	struct gve_device_option_flow_steering *dev_op_flow_steering = NULL;
 	struct gve_device_option_gqi_rda *dev_op_gqi_rda = NULL;
 	struct gve_device_option_gqi_qpl *dev_op_gqi_qpl = NULL;
 	struct gve_device_option_dqo_rda *dev_op_dqo_rda = NULL;
@@ -829,6 +858,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 
 	err = gve_process_device_options(priv, descriptor, &dev_op_gqi_rda,
 					 &dev_op_gqi_qpl, &dev_op_dqo_rda,
+					 &dev_op_flow_steering,
 					 &dev_op_modify_ring,
 					 &dev_op_jumbo_frames);
 	if (err)
@@ -884,7 +914,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 	priv->default_num_queues = be16_to_cpu(descriptor->default_num_queues);
 
 	gve_enable_supported_features(priv, supported_features_mask,
-				      dev_op_modify_ring,
+				      dev_op_flow_steering, dev_op_modify_ring,
 				      dev_op_jumbo_frames);
 
 free_device_descriptor:
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index 6a3d469..e237353 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -117,6 +117,14 @@ struct gve_ring_size_bound {
 
 GVE_CHECK_STRUCT_LEN(4, gve_ring_size_bound);
 
+struct gve_device_option_flow_steering {
+	__be32 supported_features_mask;
+	__be32 reserved;
+	__be32 max_flow_rules;
+};
+
+GVE_CHECK_STRUCT_LEN(12, gve_device_option_flow_steering);
+
 struct gve_device_option_modify_ring {
 	__be32 supported_features_mask;
 	struct gve_ring_size_bound max_ring_size;
@@ -148,6 +156,7 @@ enum gve_dev_opt_id {
 	GVE_DEV_OPT_ID_DQO_RDA = 0x4,
 	GVE_DEV_OPT_ID_MODIFY_RING = 0x6,
 	GVE_DEV_OPT_ID_JUMBO_FRAMES = 0x8,
+	GVE_DEV_OPT_ID_FLOW_STEERING = 0xb,
 };
 
 enum gve_dev_opt_req_feat_mask {
@@ -155,6 +164,7 @@ enum gve_dev_opt_req_feat_mask {
 	GVE_DEV_OPT_REQ_FEAT_MASK_GQI_RDA = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_GQI_QPL = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_DQO_RDA = 0x0,
+	GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_MODIFY_RING = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_JUMBO_FRAMES = 0x0,
 };
@@ -162,6 +172,7 @@ enum gve_dev_opt_req_feat_mask {
 enum gve_sup_feature_mask {
 	GVE_SUP_MODIFY_RING_MASK = 1 << 0,
 	GVE_SUP_JUMBO_FRAMES_MASK = 1 << 2,
+	GVE_SUP_FLOW_STEERING_MASK = 1 << 5,
 };
 
 #define GVE_DEV_OPT_LEN_GQI_RAW_ADDRESSING 0x0
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index f7cc781..3a810b6 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -332,6 +332,8 @@ struct gve_priv {
 
 	struct gve_rss_config rss_config;
 	struct gve_ptype_lut *ptype_lut_dqo;
+
+	uint32_t max_flow_rules;
 };
 
 static inline bool
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 2/4] net/gve: introduce extended adminq command
  2026-03-03  0:58 ` [PATCH v2 " Jasper Tran O'Leary
  2026-03-03  0:58   ` [PATCH v2 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
@ 2026-03-03  0:58   ` Jasper Tran O'Leary
  2026-03-03  0:58   ` [PATCH v2 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-03  0:58 UTC (permalink / raw)
  To: stephen; +Cc: dev, Vee Agarwal, Jasper Tran O'Leary, Joshua Washington

From: Vee Agarwal <veethebee@google.com>

Flow steering adminq commands are too large to fit into a normal adminq
command buffer, which accepts at most 56 bytes. As a result, introduce
extended adminq commands, which permit larger command buffers using
indirection: extended command operations point to inner command
buffers allocated at a specified DMA address. Per the device
specification, all extended commands use inner opcodes larger than
0xFF.

Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 30 ++++++++++++++++++++++++++++++
 drivers/net/gve/base/gve_adminq.h | 16 ++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 64b9468..0cc6d44 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -438,6 +438,8 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv,
 
 	memcpy(cmd, cmd_orig, sizeof(*cmd_orig));
 	opcode = be32_to_cpu(READ_ONCE32(cmd->opcode));
+	if (opcode == GVE_ADMINQ_EXTENDED_COMMAND)
+		opcode = be32_to_cpu(READ_ONCE32(cmd->extended_command.inner_opcode));
 
 	switch (opcode) {
 	case GVE_ADMINQ_DESCRIBE_DEVICE:
@@ -516,6 +518,34 @@ static int gve_adminq_execute_cmd(struct gve_priv *priv,
 	return gve_adminq_kick_and_wait(priv);
 }
 
+static int gve_adminq_execute_extended_cmd(struct gve_priv *priv, u32 opcode,
+					   size_t cmd_size, void *cmd_orig)
+{
+	union gve_adminq_command cmd;
+	struct gve_dma_mem inner_cmd_dma_mem;
+	void *inner_cmd;
+	int err;
+
+	inner_cmd = gve_alloc_dma_mem(&inner_cmd_dma_mem, cmd_size);
+	if (!inner_cmd)
+		return -ENOMEM;
+
+	memcpy(inner_cmd, cmd_orig, cmd_size);
+
+	memset(&cmd, 0, sizeof(cmd));
+	cmd.opcode = cpu_to_be32(GVE_ADMINQ_EXTENDED_COMMAND);
+	cmd.extended_command = (struct gve_adminq_extended_command) {
+		.inner_opcode = cpu_to_be32(opcode),
+		.inner_length = cpu_to_be32(cmd_size),
+		.inner_command_addr = cpu_to_be64(inner_cmd_dma_mem.pa),
+	};
+
+	err = gve_adminq_execute_cmd(priv, &cmd);
+
+	gve_free_dma_mem(&inner_cmd_dma_mem);
+	return err;
+}
+
 /* The device specifies that the management vector can either be the first irq
  * or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
  * the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index e237353..f52658e 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -25,8 +25,15 @@ enum gve_adminq_opcodes {
 	GVE_ADMINQ_REPORT_LINK_SPEED		= 0xD,
 	GVE_ADMINQ_GET_PTYPE_MAP		= 0xE,
 	GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY	= 0xF,
+	/* For commands that are larger than 56 bytes */
+	GVE_ADMINQ_EXTENDED_COMMAND		= 0xFF,
 };
 
+/* The normal adminq command is restricted to be 56 bytes at maximum. For the
+ * longer adminq command, it is wrapped by GVE_ADMINQ_EXTENDED_COMMAND with
+ * inner opcode of gve_adminq_extended_cmd_opcodes specified. The inner command
+ * is written in the dma memory allocated by GVE_ADMINQ_EXTENDED_COMMAND.
+ */
 /* Admin queue status codes */
 enum gve_adminq_statuses {
 	GVE_ADMINQ_COMMAND_UNSET			= 0x0,
@@ -194,6 +201,14 @@ enum gve_driver_capbility {
 #define GVE_DRIVER_CAPABILITY_FLAGS3 0x0
 #define GVE_DRIVER_CAPABILITY_FLAGS4 0x0
 
+struct gve_adminq_extended_command {
+	__be32 inner_opcode;
+	__be32 inner_length;
+	__be64 inner_command_addr;
+};
+
+GVE_CHECK_STRUCT_LEN(16, gve_adminq_extended_command);
+
 struct gve_driver_info {
 	u8 os_type;	/* 0x05 = DPDK */
 	u8 driver_major;
@@ -440,6 +455,7 @@ union gve_adminq_command {
 			struct gve_adminq_get_ptype_map get_ptype_map;
 			struct gve_adminq_verify_driver_compatibility
 				verify_driver_compatibility;
+			struct gve_adminq_extended_command extended_command;
 		};
 	};
 	u8 reserved[64];
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 3/4] net/gve: add adminq commands for flow steering
  2026-03-03  0:58 ` [PATCH v2 " Jasper Tran O'Leary
  2026-03-03  0:58   ` [PATCH v2 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
  2026-03-03  0:58   ` [PATCH v2 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
@ 2026-03-03  0:58   ` Jasper Tran O'Leary
  2026-03-03  0:58   ` [PATCH v2 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-03  0:58 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Vee Agarwal, Joshua Washington

Add new adminq commands for the driver to configure flow rules that are
stored on the device. Three subcommands are supported for configuring
flow rules:
- create: creates a new flow rule with a specific rule_id.
- destroy: deletes an existing flow rule with the specified rule_id.
- flush: clears and deletes all currently active flow rules.

Co-developed-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 52 +++++++++++++++++++++++++++
 drivers/net/gve/base/gve_adminq.h | 30 ++++++++++++++++
 drivers/net/gve/gve_ethdev.h      |  1 +
 drivers/net/gve/gve_flow_rule.h   | 59 +++++++++++++++++++++++++++++++
 4 files changed, 142 insertions(+)
 create mode 100644 drivers/net/gve/gve_flow_rule.h

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 0cc6d44..9a94591 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -239,6 +239,7 @@ int gve_adminq_alloc(struct gve_priv *priv)
 	priv->adminq_report_stats_cnt = 0;
 	priv->adminq_report_link_speed_cnt = 0;
 	priv->adminq_get_ptype_map_cnt = 0;
+	priv->adminq_cfg_flow_rule_cnt = 0;
 
 	/* Setup Admin queue with the device */
 	rte_pci_read_config(priv->pci_dev, &pci_rev_id, sizeof(pci_rev_id),
@@ -487,6 +488,9 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv,
 	case GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY:
 		priv->adminq_verify_driver_compatibility_cnt++;
 		break;
+	case GVE_ADMINQ_CONFIGURE_FLOW_RULE:
+		priv->adminq_cfg_flow_rule_cnt++;
+		break;
 	default:
 		PMD_DRV_LOG(ERR, "unknown AQ command opcode %d", opcode);
 	}
@@ -546,6 +550,54 @@ static int gve_adminq_execute_extended_cmd(struct gve_priv *priv, u32 opcode,
 	return err;
 }
 
+static int
+gve_adminq_configure_flow_rule(struct gve_priv *priv,
+			       struct gve_adminq_configure_flow_rule *flow_rule_cmd)
+{
+	int err = gve_adminq_execute_extended_cmd(priv,
+			GVE_ADMINQ_CONFIGURE_FLOW_RULE,
+			sizeof(struct gve_adminq_configure_flow_rule),
+			flow_rule_cmd);
+
+	return err;
+}
+
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+			     struct gve_flow_rule_params *rule, u32 loc)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_ADD),
+		.location = cpu_to_be32(loc),
+		.rule = {
+			.flow_type = cpu_to_be16(rule->flow_type),
+			.action = cpu_to_be16(rule->action),
+			.key = rule->key,
+			.mask = rule->mask,
+		},
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_del_flow_rule(struct gve_priv *priv, u32 loc)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_DEL),
+		.location = cpu_to_be32(loc),
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_reset_flow_rules(struct gve_priv *priv)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_RESET),
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
 /* The device specifies that the management vector can either be the first irq
  * or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
  * the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index f52658e..d8e5e6a 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -7,6 +7,7 @@
 #define _GVE_ADMINQ_H
 
 #include "gve_osdep.h"
+#include "../gve_flow_rule.h"
 
 /* Admin queue opcodes */
 enum gve_adminq_opcodes {
@@ -34,6 +35,10 @@ enum gve_adminq_opcodes {
  * inner opcode of gve_adminq_extended_cmd_opcodes specified. The inner command
  * is written in the dma memory allocated by GVE_ADMINQ_EXTENDED_COMMAND.
  */
+enum gve_adminq_extended_cmd_opcodes {
+	GVE_ADMINQ_CONFIGURE_FLOW_RULE	= 0x101,
+};
+
 /* Admin queue status codes */
 enum gve_adminq_statuses {
 	GVE_ADMINQ_COMMAND_UNSET			= 0x0,
@@ -434,6 +439,26 @@ struct gve_adminq_configure_rss {
 	__be64 indir_addr;
 };
 
+/* Flow rule definition for the admin queue using network byte order (big
+ * endian). This struct represents the hardware wire format and should not be
+ * used outside of admin queue contexts.
+ */
+struct gve_adminq_flow_rule {
+	__be16 flow_type;
+	__be16 action; /* RX queue id */
+	struct gve_flow_spec key;
+	struct gve_flow_spec mask;
+};
+
+struct gve_adminq_configure_flow_rule {
+	__be16 opcode;
+	u8 padding[2];
+	struct gve_adminq_flow_rule rule;
+	__be32 location;
+};
+
+GVE_CHECK_STRUCT_LEN(92, gve_adminq_configure_flow_rule);
+
 union gve_adminq_command {
 	struct {
 		__be32 opcode;
@@ -499,4 +524,9 @@ int gve_adminq_verify_driver_compatibility(struct gve_priv *priv,
 int gve_adminq_configure_rss(struct gve_priv *priv,
 			     struct gve_rss_config *rss_config);
 
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+			     struct gve_flow_rule_params *rule, u32 loc);
+int gve_adminq_del_flow_rule(struct gve_priv *priv, u32 loc);
+int gve_adminq_reset_flow_rules(struct gve_priv *priv);
+
 #endif /* _GVE_ADMINQ_H */
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index 3a810b6..4e07ca8 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -314,6 +314,7 @@ struct gve_priv {
 	uint32_t adminq_report_link_speed_cnt;
 	uint32_t adminq_get_ptype_map_cnt;
 	uint32_t adminq_verify_driver_compatibility_cnt;
+	uint32_t adminq_cfg_flow_rule_cnt;
 	volatile uint32_t state_flags;
 
 	/* Gvnic device link speed from hypervisor. */
diff --git a/drivers/net/gve/gve_flow_rule.h b/drivers/net/gve/gve_flow_rule.h
new file mode 100644
index 0000000..8c17ddd
--- /dev/null
+++ b/drivers/net/gve/gve_flow_rule.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2026 Google LLC
+ */
+
+#ifndef _GVE_FLOW_RULE_H_
+#define _GVE_FLOW_RULE_H_
+
+#include "base/gve_osdep.h"
+
+enum gve_adminq_flow_rule_cfg_opcode {
+	GVE_FLOW_RULE_CFG_ADD	= 0,
+	GVE_FLOW_RULE_CFG_DEL	= 1,
+	GVE_FLOW_RULE_CFG_RESET	= 2,
+};
+
+enum gve_adminq_flow_type {
+	GVE_FLOW_TYPE_TCPV4,
+	GVE_FLOW_TYPE_UDPV4,
+	GVE_FLOW_TYPE_SCTPV4,
+	GVE_FLOW_TYPE_AHV4,
+	GVE_FLOW_TYPE_ESPV4,
+	GVE_FLOW_TYPE_TCPV6,
+	GVE_FLOW_TYPE_UDPV6,
+	GVE_FLOW_TYPE_SCTPV6,
+	GVE_FLOW_TYPE_AHV6,
+	GVE_FLOW_TYPE_ESPV6,
+};
+
+struct gve_flow_spec {
+	__be32 src_ip[4];
+	__be32 dst_ip[4];
+	union {
+		struct {
+			__be16 src_port;
+			__be16 dst_port;
+		};
+		__be32 spi;
+	};
+	union {
+		u8 tos;
+		u8 tclass;
+	};
+};
+
+/* Flow rule parameters using mixed endianness.
+ * - flow_type and action are guest endian.
+ * - key and mask are in network byte order (big endian), matching rte_flow.
+ * This struct is used by the driver when validating and creating flow rules;
+ * guest endian fields are only converted to network byte order within admin
+ * queue functions.
+ */
+struct gve_flow_rule_params {
+	u16 flow_type;
+	u16 action; /* RX queue id */
+	struct gve_flow_spec key;
+	struct gve_flow_spec mask;
+};
+
+#endif /* _GVE_FLOW_RULE_H_ */
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v2 4/4] net/gve: add rte flow API integration
  2026-03-03  0:58 ` [PATCH v2 " Jasper Tran O'Leary
                     ` (2 preceding siblings ...)
  2026-03-03  0:58   ` [PATCH v2 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
@ 2026-03-03  0:58   ` Jasper Tran O'Leary
  2026-03-03 15:21   ` [PATCH v2 0/4] net/gve: add flow steering support Stephen Hemminger
  2026-03-04  1:46   ` [PATCH v3 " Jasper Tran O'Leary
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-03  0:58 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Vee Agarwal, Joshua Washington

Implement driver callbacks for the following rte_flow operations:
create, destroy, and flush. This change enables receive flow steering
(RFS) for n-tuple based flow rules for the gve driver.

The implementation supports matching ingress IPv4/IPv6 traffic combined
with TCP, UDP, SCTP, ESP, or AH protocols. Supported fields for
matching include IP source/destination addresses, L4 source/destination
ports (for TCP/UDP/SCTP), and SPI (for ESP/AH). The only supported
action is RTE_FLOW_ACTION_TYPE_QUEUE, which steers matching packets to
a specified Rx queue.

Co-developed-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 doc/guides/nics/features/gve.ini       |  12 +
 doc/guides/nics/gve.rst                |  26 +
 doc/guides/rel_notes/release_26_03.rst |   1 +
 drivers/net/gve/base/gve.h             |   3 +-
 drivers/net/gve/gve_ethdev.c           |  83 +++-
 drivers/net/gve/gve_ethdev.h           |  43 ++
 drivers/net/gve/gve_flow_rule.c        | 656 +++++++++++++++++++++++++
 drivers/net/gve/gve_flow_rule.h        |   6 +
 drivers/net/gve/meson.build            |   1 +
 9 files changed, 829 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/gve/gve_flow_rule.c

diff --git a/doc/guides/nics/features/gve.ini b/doc/guides/nics/features/gve.ini
index ed040a0..89c97fd 100644
--- a/doc/guides/nics/features/gve.ini
+++ b/doc/guides/nics/features/gve.ini
@@ -19,3 +19,15 @@ Linux                = Y
 x86-32               = Y
 x86-64               = Y
 Usage doc            = Y
+
+[rte_flow items]
+ah                   = Y
+esp                  = Y
+ipv4                 = Y
+ipv6                 = Y
+sctp                 = Y
+tcp                  = Y
+udp                  = Y
+
+[rte_flow actions]
+queue                = Y
diff --git a/doc/guides/nics/gve.rst b/doc/guides/nics/gve.rst
index 6b4d1f7..64cd931 100644
--- a/doc/guides/nics/gve.rst
+++ b/doc/guides/nics/gve.rst
@@ -103,6 +103,32 @@ the redirection table will be available for querying upon initial hash configura
 When performing redirection table updates,
 it is possible to update individual table entries.
 
+Flow Steering
+^^^^^^^^^^^^^
+
+The driver supports receive flow steering (RFS) via the standard ``rte_flow``
+API. This allows applications to steer traffic to specific queues based on
+5-tuple matching. 3-tuple matching may be supported in future releases.
+
+**Supported Patterns**
+
+L3 Protocols
+  IPv4/IPv6 source and destination addresses.
+L4 Protocols
+  TCP/UDP/SCTP source and destination ports.
+Security Protocols
+  ESP/AH SPI.
+
+**Supported Actions**
+
+- ``RTE_FLOW_ACTION_TYPE_QUEUE``: Steer packets to a specific Rx queue.
+
+**Limitations**
+
+- Only ingress flow rules are supported.
+- Flow priorities are not supported (must be 0).
+- Masking is limited to full matches i.e. 0x00...0 or 0xFF...F.
+
 Application-Initiated Reset
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The driver allows an application to reset the gVNIC device.
diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst
index 1855d90..e45ed27 100644
--- a/doc/guides/rel_notes/release_26_03.rst
+++ b/doc/guides/rel_notes/release_26_03.rst
@@ -78,6 +78,7 @@ New Features
 * **Updated Google Virtual Ethernet (gve) driver.**
 
   * Added application-initiated device reset.
+  * Add support for receive flow steering.
 
 * **Updated Intel iavf driver.**
 
diff --git a/drivers/net/gve/base/gve.h b/drivers/net/gve/base/gve.h
index 99514cb..18363fa 100644
--- a/drivers/net/gve/base/gve.h
+++ b/drivers/net/gve/base/gve.h
@@ -50,7 +50,8 @@ enum gve_state_flags_bit {
 	GVE_PRIV_FLAGS_ADMIN_QUEUE_OK		= 1,
 	GVE_PRIV_FLAGS_DEVICE_RESOURCES_OK	= 2,
 	GVE_PRIV_FLAGS_DEVICE_RINGS_OK		= 3,
-	GVE_PRIV_FLAGS_NAPI_ENABLED		= 4,
+	GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK	= 4,
+	GVE_PRIV_FLAGS_NAPI_ENABLED		= 5,
 };
 
 enum gve_rss_hash_algorithm {
diff --git a/drivers/net/gve/gve_ethdev.c b/drivers/net/gve/gve_ethdev.c
index 5912fec..6ce3ef3 100644
--- a/drivers/net/gve/gve_ethdev.c
+++ b/drivers/net/gve/gve_ethdev.c
@@ -510,6 +510,49 @@ gve_free_ptype_lut_dqo(struct gve_priv *priv)
 	}
 }
 
+static int
+gve_setup_flow_subsystem(struct gve_priv *priv)
+{
+	int err;
+
+	priv->flow_rule_bmp_size =
+			rte_bitmap_get_memory_footprint(priv->max_flow_rules);
+	priv->avail_flow_rule_bmp_mem = rte_zmalloc("gve_flow_rule_bmp",
+			priv->flow_rule_bmp_size, 0);
+	if (!priv->avail_flow_rule_bmp_mem) {
+		PMD_DRV_LOG(ERR, "Failed to alloc bitmap for flow rules.");
+		err = -ENOMEM;
+		goto free_flow_rule_bmp;
+	}
+
+	err = gve_flow_init_bmp(priv);
+	if (err) {
+		PMD_DRV_LOG(ERR, "Failed to initialize flow rule bitmap.");
+		goto free_flow_rule_bmp;
+	}
+
+	TAILQ_INIT(&priv->active_flows);
+	gve_set_flow_subsystem_ok(priv);
+
+	return 0;
+
+free_flow_rule_bmp:
+	gve_flow_free_bmp(priv);
+	return err;
+}
+
+static void
+gve_teardown_flow_subsystem(struct gve_priv *priv)
+{
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	gve_clear_flow_subsystem_ok(priv);
+	gve_flow_free_bmp(priv);
+	gve_free_flow_rules(priv);
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+}
+
 static void
 gve_teardown_device_resources(struct gve_priv *priv)
 {
@@ -519,7 +562,9 @@ gve_teardown_device_resources(struct gve_priv *priv)
 	if (gve_get_device_resources_ok(priv)) {
 		err = gve_adminq_deconfigure_device_resources(priv);
 		if (err)
-			PMD_DRV_LOG(ERR, "Could not deconfigure device resources: err=%d", err);
+			PMD_DRV_LOG(ERR,
+				"Could not deconfigure device resources: err=%d",
+				err);
 	}
 
 	gve_free_ptype_lut_dqo(priv);
@@ -543,6 +588,11 @@ gve_dev_close(struct rte_eth_dev *dev)
 			PMD_DRV_LOG(ERR, "Failed to stop dev.");
 	}
 
+	if (gve_get_flow_subsystem_ok(priv))
+		gve_teardown_flow_subsystem(priv);
+
+	pthread_mutex_destroy(&priv->flow_rule_lock);
+
 	gve_free_queues(dev);
 	gve_teardown_device_resources(priv);
 	gve_adminq_free(priv);
@@ -566,6 +616,9 @@ gve_dev_reset(struct rte_eth_dev *dev)
 	}
 
 	/* Tear down all device resources before re-initializing. */
+	if (gve_get_flow_subsystem_ok(priv))
+		gve_teardown_flow_subsystem(priv);
+
 	gve_free_queues(dev);
 	gve_teardown_device_resources(priv);
 	gve_adminq_free(priv);
@@ -1094,6 +1147,18 @@ gve_rss_reta_query(struct rte_eth_dev *dev,
 	return 0;
 }
 
+static int
+gve_flow_ops_get(struct rte_eth_dev *dev, const struct rte_flow_ops **ops)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+
+	if (!gve_get_flow_subsystem_ok(priv))
+		return -ENOTSUP;
+
+	*ops = &gve_flow_ops;
+	return 0;
+}
+
 static const struct eth_dev_ops gve_eth_dev_ops = {
 	.dev_configure        = gve_dev_configure,
 	.dev_start            = gve_dev_start,
@@ -1109,6 +1174,7 @@ static const struct eth_dev_ops gve_eth_dev_ops = {
 	.tx_queue_start       = gve_tx_queue_start,
 	.rx_queue_stop        = gve_rx_queue_stop,
 	.tx_queue_stop        = gve_tx_queue_stop,
+	.flow_ops_get         = gve_flow_ops_get,
 	.link_update          = gve_link_update,
 	.stats_get            = gve_dev_stats_get,
 	.stats_reset          = gve_dev_stats_reset,
@@ -1136,6 +1202,7 @@ static const struct eth_dev_ops gve_eth_dev_ops_dqo = {
 	.tx_queue_start       = gve_tx_queue_start_dqo,
 	.rx_queue_stop        = gve_rx_queue_stop_dqo,
 	.tx_queue_stop        = gve_tx_queue_stop_dqo,
+	.flow_ops_get         = gve_flow_ops_get,
 	.link_update          = gve_link_update,
 	.stats_get            = gve_dev_stats_get,
 	.stats_reset          = gve_dev_stats_reset,
@@ -1303,6 +1370,14 @@ gve_init_priv(struct gve_priv *priv, bool skip_describe_device)
 		    priv->max_nb_txq, priv->max_nb_rxq);
 
 setup_device:
+	if (priv->max_flow_rules) {
+		err = gve_setup_flow_subsystem(priv);
+		if (err)
+			PMD_DRV_LOG(WARNING,
+				    "Failed to set up flow subsystem: err=%d, flow steering will be disabled.",
+				    err);
+	}
+
 	err = gve_setup_device_resources(priv);
 	if (!err)
 		return 0;
@@ -1318,6 +1393,7 @@ gve_dev_init(struct rte_eth_dev *eth_dev)
 	int max_tx_queues, max_rx_queues;
 	struct rte_pci_device *pci_dev;
 	struct gve_registers *reg_bar;
+	pthread_mutexattr_t mutexattr;
 	rte_be32_t *db_bar;
 	int err;
 
@@ -1377,6 +1453,11 @@ gve_dev_init(struct rte_eth_dev *eth_dev)
 
 	eth_dev->data->mac_addrs = &priv->dev_addr;
 
+	pthread_mutexattr_init(&mutexattr);
+	pthread_mutexattr_setpshared(&mutexattr, PTHREAD_PROCESS_SHARED);
+	pthread_mutex_init(&priv->flow_rule_lock, &mutexattr);
+	pthread_mutexattr_destroy(&mutexattr);
+
 	return 0;
 }
 
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index 4e07ca8..2d570d0 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -9,6 +9,8 @@
 #include <ethdev_pci.h>
 #include <rte_ether.h>
 #include <rte_pci.h>
+#include <pthread.h>
+#include <rte_bitmap.h>
 
 #include "base/gve.h"
 
@@ -252,6 +254,13 @@ struct gve_rx_queue {
 	uint8_t is_gqi_qpl;
 };
 
+struct gve_flow {
+	uint32_t rule_id;
+	TAILQ_ENTRY(gve_flow) list_handle;
+};
+
+extern const struct rte_flow_ops gve_flow_ops;
+
 struct gve_priv {
 	struct gve_irq_db *irq_dbs; /* array of num_ntfy_blks */
 	const struct rte_memzone *irq_dbs_mz;
@@ -334,7 +343,13 @@ struct gve_priv {
 	struct gve_rss_config rss_config;
 	struct gve_ptype_lut *ptype_lut_dqo;
 
+	/* Flow rule management */
 	uint32_t max_flow_rules;
+	uint32_t flow_rule_bmp_size;
+	struct rte_bitmap *avail_flow_rule_bmp; /* Tracks available rule IDs (1 = available) */
+	void *avail_flow_rule_bmp_mem; /* Backing memory for the bitmap */
+	pthread_mutex_t flow_rule_lock; /* Lock for bitmap and tailq access */
+	TAILQ_HEAD(, gve_flow) active_flows;
 };
 
 static inline bool
@@ -407,6 +422,34 @@ gve_clear_device_rings_ok(struct gve_priv *priv)
 				&priv->state_flags);
 }
 
+static inline bool
+gve_get_flow_subsystem_ok(struct gve_priv *priv)
+{
+	bool ret;
+
+	ret = !!rte_bit_relaxed_get32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+				      &priv->state_flags);
+	rte_atomic_thread_fence(rte_memory_order_acquire);
+
+	return ret;
+}
+
+static inline void
+gve_set_flow_subsystem_ok(struct gve_priv *priv)
+{
+	rte_atomic_thread_fence(rte_memory_order_release);
+	rte_bit_relaxed_set32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+			      &priv->state_flags);
+}
+
+static inline void
+gve_clear_flow_subsystem_ok(struct gve_priv *priv)
+{
+	rte_atomic_thread_fence(rte_memory_order_release);
+	rte_bit_relaxed_clear32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+				&priv->state_flags);
+}
+
 int
 gve_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_id, uint16_t nb_desc,
 		   unsigned int socket_id, const struct rte_eth_rxconf *conf,
diff --git a/drivers/net/gve/gve_flow_rule.c b/drivers/net/gve/gve_flow_rule.c
new file mode 100644
index 0000000..15fc111
--- /dev/null
+++ b/drivers/net/gve/gve_flow_rule.c
@@ -0,0 +1,656 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2026 Google LLC
+ */
+
+#include <rte_flow.h>
+#include <rte_flow_driver.h>
+#include "base/gve_adminq.h"
+#include "gve_ethdev.h"
+
+static int
+gve_validate_flow_attr(const struct rte_flow_attr *attr,
+		       struct rte_flow_error *error)
+{
+	if (attr == NULL) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR, NULL,
+				"Invalid flow attribute");
+		return -EINVAL;
+	}
+	if (attr->egress || attr->transfer) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR, attr,
+				"Only ingress is supported");
+		return -EINVAL;
+	}
+	if (!attr->ingress) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, attr,
+				"Ingress attribute must be set");
+		return -EINVAL;
+	}
+	if (attr->priority != 0) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, attr,
+				"Priority levels are not supported");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void
+gve_parse_ipv4(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ipv4 *spec = item->spec;
+		const struct rte_flow_item_ipv4 *mask =
+			item->mask ? item->mask : &rte_flow_item_ipv4_mask;
+
+		rule->key.src_ip[0] = spec->hdr.src_addr;
+		rule->key.dst_ip[0] = spec->hdr.dst_addr;
+		rule->mask.src_ip[0] = mask->hdr.src_addr;
+		rule->mask.dst_ip[0] = mask->hdr.dst_addr;
+	}
+}
+
+static void
+gve_parse_ipv6(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ipv6 *spec = item->spec;
+		const struct rte_flow_item_ipv6 *mask =
+			item->mask ? item->mask : &rte_flow_item_ipv6_mask;
+		const __be32 *src_ip = (const __be32 *)&spec->hdr.src_addr;
+		const __be32 *src_mask = (const __be32 *)&mask->hdr.src_addr;
+		const __be32 *dst_ip = (const __be32 *)&spec->hdr.dst_addr;
+		const __be32 *dst_mask = (const __be32 *)&mask->hdr.dst_addr;
+		int i;
+
+		/*
+		 * The device expects IPv6 addresses as an array of 4 32-bit words
+		 * in reverse word order (the MSB word at index 3 and the LSB word
+		 * at index 0). We must reverse the DPDK network byte order array.
+		 */
+		for (i = 0; i < 4; i++) {
+			rule->key.src_ip[3 - i] = src_ip[i];
+			rule->key.dst_ip[3 - i] = dst_ip[i];
+			rule->mask.src_ip[3 - i] = src_mask[i];
+			rule->mask.dst_ip[3 - i] = dst_mask[i];
+		}
+	}
+}
+
+static void
+gve_parse_udp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_udp *spec = item->spec;
+		const struct rte_flow_item_udp *mask =
+			item->mask ? item->mask : &rte_flow_item_udp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_tcp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_tcp *spec = item->spec;
+		const struct rte_flow_item_tcp *mask =
+			item->mask ? item->mask : &rte_flow_item_tcp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_sctp(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_sctp *spec = item->spec;
+		const struct rte_flow_item_sctp *mask =
+			item->mask ? item->mask : &rte_flow_item_sctp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_esp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_esp *spec = item->spec;
+		const struct rte_flow_item_esp *mask =
+			item->mask ? item->mask : &rte_flow_item_esp_mask;
+
+		rule->key.spi = spec->hdr.spi;
+		rule->mask.spi = mask->hdr.spi;
+	}
+}
+
+static void
+gve_parse_ah(const struct rte_flow_item *item, struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ah *spec = item->spec;
+		const struct rte_flow_item_ah *mask =
+			item->mask ? item->mask : &rte_flow_item_ah_mask;
+
+		rule->key.spi = spec->spi;
+		rule->mask.spi = mask->spi;
+	}
+}
+
+static int
+gve_validate_and_parse_flow_pattern(const struct rte_flow_item pattern[],
+				    struct rte_flow_error *error,
+				    struct gve_flow_rule_params *rule)
+{
+	const struct rte_flow_item *item = pattern;
+	enum rte_flow_item_type l3_type = RTE_FLOW_ITEM_TYPE_VOID;
+	enum rte_flow_item_type l4_type = RTE_FLOW_ITEM_TYPE_VOID;
+
+	if (pattern == NULL) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ITEM_NUM, NULL,
+				"Invalid flow pattern");
+		return -EINVAL;
+	}
+
+	for (; item->type != RTE_FLOW_ITEM_TYPE_END; item++) {
+		if (item->last) {
+			/* Last and range are not supported as match criteria. */
+			rte_flow_error_set(error, EINVAL,
+					   RTE_FLOW_ERROR_TYPE_ITEM,
+					   item,
+					   "No support for range");
+			return -EINVAL;
+		}
+		switch (item->type) {
+		case RTE_FLOW_ITEM_TYPE_VOID:
+			break;
+		case RTE_FLOW_ITEM_TYPE_IPV4:
+			if (l3_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L3 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ipv4(item, rule);
+			l3_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_IPV6:
+			if (l3_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L3 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ipv6(item, rule);
+			l3_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_udp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_tcp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_sctp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_esp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ah(item, rule);
+			l4_type = item->type;
+			break;
+		default:
+			rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ITEM, item,
+				   "Unsupported flow pattern item type");
+			return -EINVAL;
+		}
+	}
+
+	switch (l3_type) {
+	case RTE_FLOW_ITEM_TYPE_IPV4:
+		switch (l4_type) {
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			rule->flow_type = GVE_FLOW_TYPE_TCPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			rule->flow_type = GVE_FLOW_TYPE_UDPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			rule->flow_type = GVE_FLOW_TYPE_SCTPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			rule->flow_type = GVE_FLOW_TYPE_AHV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			rule->flow_type = GVE_FLOW_TYPE_ESPV4;
+			break;
+		default:
+			goto unsupported_flow;
+		}
+		break;
+	case RTE_FLOW_ITEM_TYPE_IPV6:
+		switch (l4_type) {
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			rule->flow_type = GVE_FLOW_TYPE_TCPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			rule->flow_type = GVE_FLOW_TYPE_UDPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			rule->flow_type = GVE_FLOW_TYPE_SCTPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			rule->flow_type = GVE_FLOW_TYPE_AHV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			rule->flow_type = GVE_FLOW_TYPE_ESPV6;
+			break;
+		default:
+			goto unsupported_flow;
+		}
+		break;
+	default:
+		goto unsupported_flow;
+	}
+
+	return 0;
+
+unsupported_flow:
+	rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM,
+			   NULL, "Unsupported L3/L4 combination");
+	return -EINVAL;
+}
+
+static int
+gve_validate_and_parse_flow_actions(struct rte_eth_dev *dev,
+				    const struct rte_flow_action actions[],
+				    struct rte_flow_error *error,
+				    struct gve_flow_rule_params *rule)
+{
+	const struct rte_flow_action_queue *action_queue;
+	const struct rte_flow_action *action = actions;
+	int num_queue_actions = 0;
+
+	if (actions == NULL) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM, NULL,
+				   "Invalid flow actions");
+		return -EINVAL;
+	}
+
+	while (action->type != RTE_FLOW_ACTION_TYPE_END) {
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_VOID:
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			if (action->conf == NULL) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+						   action,
+						   "QUEUE action config cannot be NULL.");
+				return -EINVAL;
+			}
+
+			action_queue = action->conf;
+			if (action_queue->index >= dev->data->nb_rx_queues) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+						   action, "Invalid Queue ID");
+				return -EINVAL;
+			}
+
+			rule->action = action_queue->index;
+			num_queue_actions++;
+			break;
+		default:
+			rte_flow_error_set(error, ENOTSUP,
+					   RTE_FLOW_ERROR_TYPE_ACTION,
+					   action,
+					   "Unsupported action. Only QUEUE is permitted.");
+			return -ENOTSUP;
+		}
+		action++;
+	}
+
+	if (num_queue_actions == 0) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM,
+				   NULL, "A QUEUE action is required.");
+		return -EINVAL;
+	}
+
+	if (num_queue_actions > 1) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM,
+				   NULL, "Only a single QUEUE action is allowed.");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+gve_validate_and_parse_flow(struct rte_eth_dev *dev,
+			    const struct rte_flow_attr *attr,
+			    const struct rte_flow_item pattern[],
+			    const struct rte_flow_action actions[],
+			    struct rte_flow_error *error,
+			    struct gve_flow_rule_params *rule)
+{
+	int err;
+
+	err = gve_validate_flow_attr(attr, error);
+	if (err)
+		return err;
+
+	err = gve_validate_and_parse_flow_pattern(pattern, error, rule);
+	if (err)
+		return err;
+
+	err = gve_validate_and_parse_flow_actions(dev, actions, error, rule);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+int
+gve_flow_init_bmp(struct gve_priv *priv)
+{
+	priv->avail_flow_rule_bmp = rte_bitmap_init_with_all_set(priv->max_flow_rules,
+			priv->avail_flow_rule_bmp_mem, priv->flow_rule_bmp_size);
+	if (priv->avail_flow_rule_bmp == NULL) {
+		PMD_DRV_LOG(ERR, "Flow subsystem failed: cannot init bitmap.");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+void
+gve_flow_free_bmp(struct gve_priv *priv)
+{
+	rte_free(priv->avail_flow_rule_bmp_mem);
+	priv->avail_flow_rule_bmp_mem = NULL;
+	priv->avail_flow_rule_bmp = NULL;
+}
+
+/*
+ * The caller must acquire the flow rule lock before calling this function.
+ */
+int
+gve_free_flow_rules(struct gve_priv *priv)
+{
+	struct gve_flow *flow;
+	int err = 0;
+
+	if (!TAILQ_EMPTY(&priv->active_flows)) {
+		err = gve_adminq_reset_flow_rules(priv);
+		if (err) {
+			PMD_DRV_LOG(ERR,
+				"Failed to reset flow rules, internal device err=%d",
+				err);
+		}
+
+		/* Free flows even if AQ fails to avoid leaking memory. */
+		while (!TAILQ_EMPTY(&priv->active_flows)) {
+			flow = TAILQ_FIRST(&priv->active_flows);
+			TAILQ_REMOVE(&priv->active_flows, flow, list_handle);
+			rte_free(flow);
+		}
+	}
+
+	return err;
+}
+
+static struct rte_flow *
+gve_create_flow_rule(struct rte_eth_dev *dev,
+		     const struct rte_flow_attr *attr,
+		     const struct rte_flow_item pattern[],
+		     const struct rte_flow_action actions[],
+		     struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	struct gve_flow_rule_params rule = {0};
+	uint64_t bmp_slab __rte_unused;
+	struct gve_flow *flow;
+	int err;
+
+	err = gve_validate_and_parse_flow(dev, attr, pattern, actions, error,
+					  &rule);
+	if (err)
+		return NULL;
+
+	flow = rte_zmalloc("gve_flow", sizeof(struct gve_flow), 0);
+	if (flow == NULL) {
+		rte_flow_error_set(error, ENOMEM,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to allocate memory for flow rule.");
+		return NULL;
+	}
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to create flow, flow subsystem not initialized.");
+		goto free_flow_and_unlock;
+	}
+
+	/* Try to allocate a new rule ID from the bitmap. */
+	if (rte_bitmap_scan(priv->avail_flow_rule_bmp, &flow->rule_id,
+			&bmp_slab) == 1) {
+		rte_bitmap_clear(priv->avail_flow_rule_bmp, flow->rule_id);
+	} else {
+		rte_flow_error_set(error, ENOMEM,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to create flow, could not allocate a new rule ID.");
+		goto free_flow_and_unlock;
+	}
+
+	err = gve_adminq_add_flow_rule(priv, &rule, flow->rule_id);
+	if (err) {
+		rte_bitmap_set(priv->avail_flow_rule_bmp, flow->rule_id);
+		rte_flow_error_set(error, -err,
+				   RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				   "Failed to create flow rule, internal device error.");
+		goto free_flow_and_unlock;
+	}
+
+	TAILQ_INSERT_TAIL(&priv->active_flows, flow, list_handle);
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+
+	return (struct rte_flow *)flow;
+
+free_flow_and_unlock:
+	rte_free(flow);
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return NULL;
+}
+
+static int
+gve_destroy_flow_rule(struct rte_eth_dev *dev, struct rte_flow *flow_handle,
+		      struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	struct gve_flow *flow;
+	bool flow_rule_active;
+	int err;
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, flow subsystem not initialized.");
+		err = -ENOTSUP;
+		goto unlock;
+	}
+
+	flow = (struct gve_flow *)flow_handle;
+
+	if (flow == NULL) {
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, invalid flow provided.");
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	if (flow->rule_id >= priv->max_flow_rules) {
+		PMD_DRV_LOG(ERR,
+			"Cannot destroy flow rule with invalid ID %u.",
+			flow->rule_id);
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, rule ID is invalid.");
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	flow_rule_active = !rte_bitmap_get(priv->avail_flow_rule_bmp,
+					   flow->rule_id);
+
+	if (!flow_rule_active) {
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, handle not found in active list.");
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	err = gve_adminq_del_flow_rule(priv, flow->rule_id);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, internal device error.");
+		goto unlock;
+	}
+
+	rte_bitmap_set(priv->avail_flow_rule_bmp, flow->rule_id);
+	TAILQ_REMOVE(&priv->active_flows, flow, list_handle);
+	rte_free(flow);
+
+	err = 0;
+
+unlock:
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return err;
+}
+
+static int
+gve_flush_flow_rules(struct rte_eth_dev *dev, struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	int err;
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to flush rules, flow subsystem not initialized.");
+		err = -ENOTSUP;
+		goto unlock;
+	}
+
+	err = gve_free_flow_rules(priv);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to flush rules due to internal device error, disabling flow subsystem.");
+		goto disable_and_free;
+	}
+
+	err = gve_flow_init_bmp(priv);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to re-initialize rule ID bitmap, disabling flow subsystem.");
+		goto disable_and_free;
+	}
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+
+	return 0;
+
+disable_and_free:
+	gve_clear_flow_subsystem_ok(priv);
+	gve_flow_free_bmp(priv);
+unlock:
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return err;
+}
+
+const struct rte_flow_ops gve_flow_ops = {
+	.create = gve_create_flow_rule,
+	.destroy = gve_destroy_flow_rule,
+	.flush = gve_flush_flow_rules,
+};
diff --git a/drivers/net/gve/gve_flow_rule.h b/drivers/net/gve/gve_flow_rule.h
index 8c17ddd..d597a6c 100644
--- a/drivers/net/gve/gve_flow_rule.h
+++ b/drivers/net/gve/gve_flow_rule.h
@@ -56,4 +56,10 @@ struct gve_flow_rule_params {
 	struct gve_flow_spec mask;
 };
 
+struct gve_priv;
+
+int gve_flow_init_bmp(struct gve_priv *priv);
+void gve_flow_free_bmp(struct gve_priv *priv);
+int gve_free_flow_rules(struct gve_priv *priv);
+
 #endif /* _GVE_FLOW_RULE_H_ */
diff --git a/drivers/net/gve/meson.build b/drivers/net/gve/meson.build
index c6a9f36..7074988 100644
--- a/drivers/net/gve/meson.build
+++ b/drivers/net/gve/meson.build
@@ -16,5 +16,6 @@ sources = files(
         'gve_ethdev.c',
         'gve_version.c',
         'gve_rss.c',
+        'gve_flow_rule.c',
 )
 includes += include_directories('base')
-- 
2.53.0.473.g4a7958ca14-goog



* Re: [PATCH 0/4] net/gve: add flow steering support
  2026-02-27 22:52 ` [PATCH 0/4] net/gve: add flow steering support Stephen Hemminger
@ 2026-03-03  1:00   ` Jasper Tran O'Leary
  0 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-03  1:00 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Joshua Washington


Thanks for the notes. I've addressed the issues as follows and submitted a
v2 revision.

[Patch 3/4]

1. Updated copyright header in `gve_flow_rule.h`.

[Patch 4/4]

0. Similar to patch 3, updated the copyright header in `gve_flow_rule.c`.
1. Fixed the memory leak in rule flush by freeing the bitmap on flush
failure. This preserves the invariant where `gve_get_flow_subsystem_ok`
is true if and only if the flow ID bitmap and flow rule tail queue are
initialized.
2. Added a `pthread_mutexattr_t` with `PTHREAD_PROCESS_SHARED` to the flow mutex
initialization.
3. Changed the aforementioned instances of `(!pointer)` and a few others to
use `pointer == NULL`.
4. Retained the use of `rte_zmalloc` for both `struct gve_flow` and
`priv->avail_flow_rule_bmp_mem` because this memory is shared between
processes.
5. Retained memory barriers in case we call the flag checks from
non-critical paths in the future.
6. Updated the documentation to use a definition list for supported
protocols, and made minor formatting adjustments to make the definition
list look more natural in the flow steering subsection.

On Fri, Feb 27, 2026 at 2:52 PM Stephen Hemminger <
stephen@networkplumber.org> wrote:

> On Fri, 27 Feb 2026 19:51:22 +0000
> "Jasper Tran O'Leary" <jtranoleary@google.com> wrote:
>
> > This patch series adds flow steering support to the Google Virtual
> > Ethernet (gve) driver. This functionality allows traffic to be directed
> > to specific receive queues based on user-specified flow patterns.
> >
> > The series includes foundational support for extended admin queue
> > commands needed to handle flow rules, the specific adminqueue commands
> > for flow rule management, and the integration with the DPDK rte_flow
> > API. The series adds support flow matching on the following protocols:
> > IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.
> >
> > Patch Overview:
> >
> > 1. "net/gve: add flow steering device option" checks for and enables
> >    the flow steering capability in the device options during
> >    initialization.
> > 2. "net/gve: introduce extended adminq command" adds infrastructure
> >    for sending extended admin queue commands. These commands use a
> >    flexible buffer descriptor format required for flow rule management.
> > 3. "net/gve: add adminq commands for flow steering" implements the
> >    specific admin queue commands to add and remove flow rules on the
> >    device, including handling of rule IDs and parameters.
> > 4. "net/gve: add rte flow API integration" exposes the flow steering
> >    functionality via the DPDK rte_flow API. This includes strict
> >    pattern validation, rule parsing, and lifecycle management (create,
> >    destroy, flush). It ensures thread-safe access to the flow subsystem
> >    and proper resource cleanup during device reset.
> >
> > Jasper Tran O'Leary (2):
> >   net/gve: add adminq commands for flow steering
> >   net/gve: add rte flow API integration
> >
> > Vee Agarwal (2):
> >   net/gve: add flow steering device option
> >   net/gve: introduce extended adminq command
> >
> >  doc/guides/nics/features/gve.ini       |  12 +
> >  doc/guides/nics/gve.rst                |  20 +
> >  doc/guides/rel_notes/release_26_03.rst |   1 +
> >  drivers/net/gve/base/gve.h             |   3 +-
> >  drivers/net/gve/base/gve_adminq.c      | 118 ++++-
> >  drivers/net/gve/base/gve_adminq.h      |  57 +++
> >  drivers/net/gve/gve_ethdev.c           |  87 +++-
> >  drivers/net/gve/gve_ethdev.h           |  46 ++
> >  drivers/net/gve/gve_flow_rule.c        | 645 +++++++++++++++++++++++++
> >  drivers/net/gve/gve_flow_rule.h        |  64 +++
> >  drivers/net/gve/meson.build            |   1 +
> >  11 files changed, 1049 insertions(+), 5 deletions(-)
> >  create mode 100644 dpdk/drivers/net/gve/gve_flow_rule.c
> >  create mode 100644 dpdk/drivers/net/gve/gve_flow_rule.h
> >
>
> There is a lot here, so I sent AI to take a look.
>
> Summary:
> Error 1 — Resource leak: If gve_flush_flow_rules fails and disables the
> flow subsystem via gve_clear_flow_subsystem_ok(), the bitmap memory
> (avail_flow_rule_bmp_mem) is never freed because both gve_dev_close and
> gve_dev_reset gate their teardown on gve_get_flow_subsystem_ok()
> returning true. The guard needs to check for allocated memory rather
> than subsystem state.
>
> Error 2 — pthread_mutex_init with NULL attributes in shared memory: The
> flow_rule_lock lives in dev_private (DPDK shared memory) but is
> initialized without PTHREAD_PROCESS_SHARED. This will cause undefined
> behavior with secondary processes. Switching to rte_spinlock_t would be
> the simplest fix since the critical sections are short (bitmap scan +
> TAILQ operations).
>
> Patches 1–3 are clean. The overall structure and error handling in patch 4
> is solid — the allocation-before-lock pattern, bitmap rollback on adminq
> failure, and defense-in-depth validation in destroy are all well done.
> Long form:
> # Code Review: GVE Flow Steering Patch Series (4 patches)
>
> **Series**: `[PATCH 1/4]` through `[PATCH 4/4]`
> **Author**: Jasper Tran O'Leary / Vee Agarwal (Google)
> **Subject**: Add receive flow steering (RFS) support to the GVE driver
>
> ---
>
> ## Summary
>
> This series adds n-tuple flow steering to the GVE (Google Virtual
> Ethernet) driver via the `rte_flow` API. The implementation is cleanly
> structured across four patches: device option discovery (1/4), extended
> adminq infrastructure (2/4), flow rule adminq commands (3/4), and full
> rte_flow integration (4/4). The code quality is generally good with
> thorough validation, proper locking, and well-structured error handling.
>
> Two correctness issues were identified: a resource leak when the flow
> subsystem is disabled on error, and a missing `PTHREAD_PROCESS_SHARED`
> attribute on a mutex in shared memory.
>
> ---
>
> ## Patch 1/4: `net/gve: add flow steering device option`
>
> ### Errors
>
> None.
>
> ### Warnings
>
> None.
>
> This patch is clean. The device option parsing follows the established
> pattern for existing options (modify_ring, jumbo_frames), the byte-swap of
> `max_flow_rules` is correct, and the zero-check on the big-endian value
> before byte-swap is valid (non-zero in any byte order is still non-zero).
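>
> As a quick illustration, with `rte_be_to_cpu_32` as the swap helper:
>
> ```c
> /* be_val holds the device's big-endian max_flow_rules field. Since
>  * bswap(x) == 0 iff x == 0, checking before the swap...
>  */
> if (be_val != 0)
>         priv->max_flow_rules = rte_be_to_cpu_32(be_val);
> /* ...is equivalent to swapping first and testing the host-order result. */
> ```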
>
> ---
>
> ## Patch 2/4: `net/gve: introduce extended adminq command`
>
> ### Errors
>
> None.
>
> ### Warnings
>
> None.
>
> The extended command mechanism is straightforward: allocate DMA memory for
> the inner command, copy the command in, set up the outer wrapper with the
> DMA address, execute, and free. The error path is correct —
> `gve_free_dma_mem` is called after `gve_adminq_execute_cmd` regardless of
> success or failure.
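>
> In sketch form (the `dma`/`outer` field names here are hypothetical,
> standing in for the driver's actual DMA descriptor and wrapper command):
>
> ```c
> err = gve_alloc_dma_mem(priv, &dma, inner_len); /* buffer for inner cmd */
> if (err)
>         return err;
> memcpy(dma.va, &inner_cmd, inner_len);          /* copy the command in */
> outer.inner_command_addr = rte_cpu_to_be_64(dma.pa);
> err = gve_adminq_execute_cmd(priv, &outer);     /* execute the wrapper */
> gve_free_dma_mem(&dma);                         /* freed on both paths */
> return err;
> ```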
>
> ---
>
> ## Patch 3/4: `net/gve: add adminq commands for flow steering`
>
> ### Errors
>
> None.
>
> ### Warnings
>
> **1. `gve_flow_rule.h` copyright appears to be copied from an Intel driver
> file.**
>
> ```c
> /* SPDX-License-Identifier: BSD-3-Clause
>  * Copyright(C) 2022 Intel Corporation
>  */
> ```
>
> This is a new file in the Google GVE driver. The Intel copyright and 2022
> date look like they were carried over from whichever file was used as a
> template. This should be updated to reflect the actual author.
>
> ---
>
> ## Patch 4/4: `net/gve: add rte flow API integration`
>
> This is the largest patch (815 lines added) and where the significant
> findings are.
>
> ### Errors
>
> **1. Resource leak: bitmap memory leaked when flow subsystem is disabled
> on error.**
>
> In `gve_flush_flow_rules`, if either `gve_free_flow_rules()` or
> `gve_flow_init_bmp()` fails, the code disables the subsystem:
>
> ```c
> gve_clear_flow_subsystem_ok(priv);
> ```
>
> However, in both `gve_dev_close` and `gve_dev_reset`, teardown is gated on
> the flag:
>
> ```c
> if (gve_get_flow_subsystem_ok(priv))
>     gve_teardown_flow_subsystem(priv);
> ```
>
> If the subsystem was disabled by a failed flush,
> `gve_teardown_flow_subsystem` is never called, and
> `priv->avail_flow_rule_bmp_mem` is never freed. This is a memory leak.
>
> **Suggested fix**: Either (a) always call `gve_flow_free_bmp(priv)` in
> close/reset regardless of the flag, or (b) have the flush error path free
> the bitmap memory itself, or (c) unconditionally call teardown in
> close/reset:
>
> ```c
> /* In gve_dev_close / gve_dev_reset: */
> if (priv->avail_flow_rule_bmp_mem)
>     gve_teardown_flow_subsystem(priv);
> ```
>
> **2. `pthread_mutex_init` without `PTHREAD_PROCESS_SHARED` on a mutex in
> shared memory.**
>
> In `gve_dev_init`:
>
> ```c
> pthread_mutex_init(&priv->flow_rule_lock, NULL);
> ```
>
> `priv` is `dev->data->dev_private`, which is allocated in DPDK shared
> memory accessible by both primary and secondary processes. A pthread mutex
> in shared memory initialized with `NULL` attributes has undefined behavior
> when used across processes. It may appear to work in testing but fail in
> production with secondary processes.
>
> **Suggested fix**: Either use `PTHREAD_PROCESS_SHARED`:
>
> ```c
> pthread_mutexattr_t attr;
> pthread_mutexattr_init(&attr);
> pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
> pthread_mutex_init(&priv->flow_rule_lock, &attr);
> pthread_mutexattr_destroy(&attr);
> ```
>
> Or switch to `rte_spinlock_t` which works correctly in shared memory
> without special initialization (appropriate here since the critical
> sections are short — bitmap scan, TAILQ insert/remove):
>
> ```c
> rte_spinlock_t flow_rule_lock;
> /* ... */
> rte_spinlock_init(&priv->flow_rule_lock);
> ```
>
> ### Warnings
>
> **3. Implicit pointer comparisons throughout `gve_flow_rule.c`.**
>
> Several places use `!pointer` instead of `pointer == NULL`:
>
> ```c
> if (!flow) {          /* line ~1641, ~1710 */
> if (!action->conf) {  /* line ~1510 */
> if (!attr) {          /* line ~1173 */
> if (!actions) {       /* line ~1498 */
> if (!pattern) {       /* line ~1329 */
> ```
>
> DPDK coding style requires explicit comparison with NULL for pointers. The
> idiomatic form is `if (flow == NULL)`.
>
> **4. `rte_zmalloc` used for flow rule metadata that does not require
> hugepage memory.**
>
> The `gve_flow` structs (a `uint32_t` rule ID plus TAILQ link pointers) are
> allocated via `rte_zmalloc`. These are small control-plane structures not
> accessed by DMA and not requiring NUMA placement. Standard `malloc` would
> be more appropriate per DPDK guidelines and would not consume limited
> hugepage resources.
>
> Similarly, `priv->avail_flow_rule_bmp_mem` is allocated with
> `rte_zmalloc`. Since `rte_bitmap` may require specific alignment, this one
> is more defensible, but worth considering whether standard allocation would
> suffice.
>
> **5. Standalone `rte_atomic_thread_fence()` in flow subsystem flag
> accessors.**
>
> The `gve_get_flow_subsystem_ok` / `gve_set_flow_subsystem_ok` /
> `gve_clear_flow_subsystem_ok` functions use standalone fences with
> `rte_bit_relaxed_*` operations:
>
> ```c
> static inline bool
> gve_get_flow_subsystem_ok(struct gve_priv *priv)
> {
>     bool ret;
>     ret = !!rte_bit_relaxed_get32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
>                                   &priv->state_flags);
>     rte_atomic_thread_fence(rte_memory_order_acquire);
>     return ret;
> }
> ```
>
> These follow the existing pattern for other state flags in the driver, so
> this is consistent. However, it is worth noting that at every call site in
> this patch, the flag is checked while holding `flow_rule_lock`, which
> already provides the necessary memory ordering. The fences are redundant in
> those paths (but harmless).
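>
> As an aside, a generic C11 sketch (hypothetical, using plain `stdatomic.h`
> rather than the driver's actual accessors or any DPDK API) of how an
> acquire load carries the ordering on the load itself, so no standalone
> fence is needed:
>
> ```c
> #include <assert.h>
> #include <stdatomic.h>
> #include <stdbool.h>
> #include <stdint.h>
>
> #define FLOW_SUBSYSTEM_OK_BIT 0   /* hypothetical bit index */
>
> /* Acquire ordering is attached to the load itself, so no separate
>  * standalone fence is needed after reading the flag. */
> static bool
> get_flow_subsystem_ok(_Atomic uint32_t *state_flags)
> {
>     uint32_t flags = atomic_load_explicit(state_flags,
>                                           memory_order_acquire);
>     return (flags >> FLOW_SUBSYSTEM_OK_BIT) & 1u;
> }
>
> int main(void)
> {
>     _Atomic uint32_t flags = 0;
>     assert(!get_flow_subsystem_ok(&flags));
>     atomic_fetch_or_explicit(&flags, 1u << FLOW_SUBSYSTEM_OK_BIT,
>                              memory_order_release);
>     assert(get_flow_subsystem_ok(&flags));
>     return 0;
> }
> ```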
>
> **6. RST documentation uses simple bullet lists where definition lists
> would be cleaner.**
>
> In `doc/guides/nics/gve.rst`:
>
> ```rst
> Supported Patterns:
>   - IPv4/IPv6 source and destination addresses.
>   - TCP/UDP/SCTP source and destination ports.
>   - ESP/AH SPI.
> ```
>
> These term+list groupings would read better as RST definition lists. This
> is minor given the lists are short.
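>
> A possible definition-list form of the same content (an illustrative
> sketch only, not the exact wording proposed for the guide):
>
> ```rst
> Supported Patterns
>    - IPv4/IPv6 source and destination addresses.
>    - TCP/UDP/SCTP source and destination ports.
>    - ESP/AH SPI.
> ```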
>
> ### Design Notes (Info)
>
> **Well-structured error handling in `gve_create_flow_rule`**: The
> allocation-before-lock pattern avoids holding the mutex during memory
> allocation, and the `free_flow_and_unlock` goto label correctly handles all
> error paths. The bitmap-set-on-error rollback in the adminq failure case is
> also correct.
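>
> The pattern being praised can be sketched generically (hypothetical
> names, pthread-based, not the gve driver code):
>
> ```c
> #include <assert.h>
> #include <pthread.h>
> #include <stdlib.h>
>
> static pthread_mutex_t rule_lock = PTHREAD_MUTEX_INITIALIZER;
> static unsigned int rules_in_use;
>
> struct flow { unsigned int rule_id; };
>
> static int
> create_rule(unsigned int max_rules)
> {
>     /* Allocate outside the lock so the critical section stays short. */
>     struct flow *flow = calloc(1, sizeof(*flow));
>
>     if (flow == NULL)
>         return -1;
>
>     pthread_mutex_lock(&rule_lock);
>     if (rules_in_use >= max_rules)
>         goto free_flow_and_unlock;   /* one label covers all error paths */
>     flow->rule_id = rules_in_use++;
>     pthread_mutex_unlock(&rule_lock);
>     free(flow);   /* real code would link the flow into a list instead */
>     return 0;
>
> free_flow_and_unlock:
>     pthread_mutex_unlock(&rule_lock);
>     free(flow);
>     return -1;
> }
>
> int main(void)
> {
>     assert(create_rule(1) == 0);    /* first rule fits */
>     assert(create_rule(1) == -1);   /* table full: error path taken */
>     return 0;
> }
> ```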
>
> **`gve_destroy_flow_rule` double-validates flow state**: The function
> checks both the flow pointer for NULL and verifies `rule_id <
> max_flow_rules` before checking the bitmap. This defense-in-depth is good
> practice.
>
> **IPv6 word-reversal in `gve_parse_ipv6`**: The comment explaining the
> device's expected word order is clear and the implementation correctly
> reverses the 32-bit words. This is the kind of hardware-specific detail
> that benefits from the inline comment.
>
> ---
>
> ## Cross-Patch Observations
>
> The series is well-ordered: each patch builds incrementally and the
> dependencies flow naturally (device option → extended command
> infrastructure → flow rule commands → rte_flow integration). The
> documentation, feature matrix, and release notes are all updated in patch
> 4/4 together with the code, which is correct per DPDK guidelines.
>
> The `Co-developed-by` / `Signed-off-by` tag sequences are correctly
> formatted per Linux kernel convention.
>

[-- Attachment #2: Type: text/html, Size: 14903 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/4] net/gve: add flow steering support
  2026-03-03  0:58 ` [PATCH v2 " Jasper Tran O'Leary
                     ` (3 preceding siblings ...)
  2026-03-03  0:58   ` [PATCH v2 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
@ 2026-03-03 15:21   ` Stephen Hemminger
  2026-03-04  1:49     ` Jasper Tran O'Leary
  2026-03-04  1:46   ` [PATCH v3 " Jasper Tran O'Leary
  5 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2026-03-03 15:21 UTC (permalink / raw)
  To: Jasper Tran O'Leary; +Cc: dev

On Tue,  3 Mar 2026 00:58:00 +0000
"Jasper Tran O'Leary" <jtranoleary@google.com> wrote:

> This patch series adds flow steering support to the Google Virtual
> Ethernet (gve) driver. This functionality allows traffic to be directed
> to specific receive queues based on user-specified flow patterns.
> 
> The series includes foundational support for extended admin queue
> commands needed to handle flow rules, the specific admin queue commands
> for flow rule management, and the integration with the DPDK rte_flow
> API. The series adds support for flow matching on the following
> protocols: IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.
> 
> Patch Overview:
> 
> 1. "net/gve: add flow steering device option" checks for and enables
>    the flow steering capability in the device options during
>    initialization.
> 2. "net/gve: introduce extended adminq command" adds infrastructure
>    for sending extended admin queue commands. These commands use a
>    flexible buffer descriptor format required for flow rule management.
> 3. "net/gve: add adminq commands for flow steering" implements the
>    specific admin queue commands to add and remove flow rules on the
>    device, including handling of rule IDs and parameters.
> 4. "net/gve: add rte flow API integration" exposes the flow steering
>    functionality via the DPDK rte_flow API. This includes strict
>    pattern validation, rule parsing, and lifecycle management (create,
>    destroy, flush). It ensures thread-safe access to the flow subsystem
>    and proper resource cleanup during device reset.
> 
> Jasper Tran O'Leary (2):
>   net/gve: add adminq commands for flow steering
>   net/gve: add rte flow API integration
> 
> Vee Agarwal (2):
>   net/gve: add flow steering device option
>   net/gve: introduce extended adminq command
> 
>  doc/guides/nics/features/gve.ini       |  12 +
>  doc/guides/nics/gve.rst                |  26 +
>  doc/guides/rel_notes/release_26_03.rst |   1 +
>  drivers/net/gve/base/gve.h             |   3 +-
>  drivers/net/gve/base/gve_adminq.c      | 118 ++++-
>  drivers/net/gve/base/gve_adminq.h      |  57 +++
>  drivers/net/gve/gve_ethdev.c           |  83 +++-
>  drivers/net/gve/gve_ethdev.h           |  46 ++
>  drivers/net/gve/gve_flow_rule.c        | 656 +++++++++++++++++++++++++
>  drivers/net/gve/gve_flow_rule.h        |  65 +++
>  drivers/net/gve/meson.build            |   1 +
>  11 files changed, 1063 insertions(+), 5 deletions(-)
>  create mode 100644 drivers/net/gve/gve_flow_rule.c
>  create mode 100644 drivers/net/gve/gve_flow_rule.h
> 

Automated review spotted a few things:

1. rte_bitmap_scan usage is incorrect.
2. rte_malloc is used where it is not necessary.
3. Grammar in the release note.

I can fix the last one when merging; the second one is not a big issue
but would be good to fix.

---


**Error: Incorrect `rte_bitmap_scan` usage leading to wrong rule ID allocation (~85% confidence)**

In `gve_create_flow_rule()`:

```c
uint64_t bmp_slab __rte_unused;
...
if (rte_bitmap_scan(priv->avail_flow_rule_bmp, &flow->rule_id,
        &bmp_slab) == 1) {
    rte_bitmap_clear(priv->avail_flow_rule_bmp, flow->rule_id);
}
```

`rte_bitmap_scan()` writes the slab base position to `pos` and the slab bit
pattern to `slab`. The actual bit position of the first available rule is
`pos + __builtin_ctzll(slab)`, not `pos` alone. The `__rte_unused`
annotation on `bmp_slab` confirms the slab value is being ignored entirely.

After the first rule allocation clears bit 0, a subsequent scan returns the
same slab base (`pos = 0`) with bit 0 now clear in the slab. The code would
again set `rule_id = 0` and attempt to clear an already-clear bit, then try
to create a duplicate rule on the device.

Suggested fix:

```c
uint64_t bmp_slab;
uint32_t pos;
...
if (rte_bitmap_scan(priv->avail_flow_rule_bmp, &pos,
        &bmp_slab) == 1) {
    flow->rule_id = pos + __builtin_ctzll(bmp_slab);
    rte_bitmap_clear(priv->avail_flow_rule_bmp, flow->rule_id);
}
```

**Warning: `rte_zmalloc` used for ordinary control structures**

Both `avail_flow_rule_bmp_mem` and individual `gve_flow` structures are
allocated with `rte_zmalloc`, but they do not require DMA access, NUMA
placement, or multi-process shared memory. Standard `calloc`/`malloc` would
be more appropriate and would not consume limited hugepage resources.

**Warning: Release notes tense inconsistency**

```rst
* Added application-initiated device reset.
+ * Add support for receive flow steering.
```

"Added" (past tense) vs. "Add" (imperative). Both entries should use the
same tense; the surrounding release notes entries use the past tense
("Added"), so the new entry should follow suit.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 0/4] net/gve: add flow steering support
  2026-03-03  0:58 ` [PATCH v2 " Jasper Tran O'Leary
                     ` (4 preceding siblings ...)
  2026-03-03 15:21   ` [PATCH v2 0/4] net/gve: add flow steering support Stephen Hemminger
@ 2026-03-04  1:46   ` Jasper Tran O'Leary
  2026-03-04  1:46     ` [PATCH v3 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
                       ` (5 more replies)
  5 siblings, 6 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  1:46 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Joshua Washington

This patch series adds flow steering support to the Google Virtual
Ethernet (gve) driver. This functionality allows traffic to be directed
to specific receive queues based on user-specified flow patterns.

The series includes foundational support for extended admin queue
commands needed to handle flow rules, the specific admin queue commands
for flow rule management, and the integration with the DPDK rte_flow
API. The series adds support for flow matching on the following
protocols: IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.

Patch Overview:

1. "net/gve: add flow steering device option" checks for and enables
   the flow steering capability in the device options during
   initialization.
2. "net/gve: introduce extended adminq command" adds infrastructure
   for sending extended admin queue commands. These commands use a
   flexible buffer descriptor format required for flow rule management.
3. "net/gve: add adminq commands for flow steering" implements the
   specific admin queue commands to add and remove flow rules on the
   device, including handling of rule IDs and parameters.
4. "net/gve: add rte flow API integration" exposes the flow steering
   functionality via the DPDK rte_flow API. This includes strict
   pattern validation, rule parsing, and lifecycle management (create,
   destroy, flush). It ensures thread-safe access to the flow subsystem
   and proper resource cleanup during device reset.

Jasper Tran O'Leary (2):
  net/gve: add adminq commands for flow steering
  net/gve: add rte flow API integration

Vee Agarwal (2):
  net/gve: add flow steering device option
  net/gve: introduce extended adminq command

 doc/guides/nics/features/gve.ini       |  12 +
 doc/guides/nics/gve.rst                |  27 +
 doc/guides/rel_notes/release_26_03.rst |   1 +
 drivers/net/gve/base/gve.h             |   3 +-
 drivers/net/gve/base/gve_adminq.c      | 118 ++++-
 drivers/net/gve/base/gve_adminq.h      |  57 +++
 drivers/net/gve/gve_ethdev.c           |  83 +++-
 drivers/net/gve/gve_ethdev.h           |  46 ++
 drivers/net/gve/gve_flow_rule.c        | 658 +++++++++++++++++++++++++
 drivers/net/gve/gve_flow_rule.h        |  65 +++
 drivers/net/gve/meson.build            |   1 +
 11 files changed, 1066 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/gve/gve_flow_rule.c
 create mode 100644 drivers/net/gve/gve_flow_rule.h

-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v3 1/4] net/gve: add flow steering device option
  2026-03-04  1:46   ` [PATCH v3 " Jasper Tran O'Leary
@ 2026-03-04  1:46     ` Jasper Tran O'Leary
  2026-03-04  1:46     ` [PATCH v3 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  1:46 UTC (permalink / raw)
  To: stephen; +Cc: dev, Vee Agarwal, Jasper Tran O'Leary, Joshua Washington

From: Vee Agarwal <veethebee@google.com>

Add a new device option to signal to the driver that the device supports
flow steering. This device option also carries the maximum number of
flow steering rules that the device can store.

Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 36 ++++++++++++++++++++++++++++---
 drivers/net/gve/base/gve_adminq.h | 11 ++++++++++
 drivers/net/gve/gve_ethdev.h      |  2 ++
 3 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 6bd98d5..64b9468 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -36,6 +36,7 @@ void gve_parse_device_option(struct gve_priv *priv,
 			     struct gve_device_option_gqi_rda **dev_op_gqi_rda,
 			     struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
 			     struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+			     struct gve_device_option_flow_steering **dev_op_flow_steering,
 			     struct gve_device_option_modify_ring **dev_op_modify_ring,
 			     struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
 {
@@ -109,6 +110,22 @@ void gve_parse_device_option(struct gve_priv *priv,
 		}
 		*dev_op_dqo_rda = RTE_PTR_ADD(option, sizeof(*option));
 		break;
+	case GVE_DEV_OPT_ID_FLOW_STEERING:
+		if (option_length < sizeof(**dev_op_flow_steering) ||
+		    req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING) {
+			PMD_DRV_LOG(WARNING, GVE_DEVICE_OPTION_ERROR_FMT,
+				    "Flow Steering", (int)sizeof(**dev_op_flow_steering),
+				    GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING,
+				    option_length, req_feat_mask);
+			break;
+		}
+
+		if (option_length > sizeof(**dev_op_flow_steering)) {
+			PMD_DRV_LOG(WARNING,
+				    GVE_DEVICE_OPTION_TOO_BIG_FMT, "Flow Steering");
+		}
+		*dev_op_flow_steering = RTE_PTR_ADD(option, sizeof(*option));
+		break;
 	case GVE_DEV_OPT_ID_MODIFY_RING:
 		/* Min ring size bound is optional. */
 		if (option_length < (sizeof(**dev_op_modify_ring) -
@@ -167,6 +184,7 @@ gve_process_device_options(struct gve_priv *priv,
 			   struct gve_device_option_gqi_rda **dev_op_gqi_rda,
 			   struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
 			   struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+			   struct gve_device_option_flow_steering **dev_op_flow_steering,
 			   struct gve_device_option_modify_ring **dev_op_modify_ring,
 			   struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
 {
@@ -188,8 +206,8 @@ gve_process_device_options(struct gve_priv *priv,
 
 		gve_parse_device_option(priv, dev_opt,
 					dev_op_gqi_rda, dev_op_gqi_qpl,
-					dev_op_dqo_rda, dev_op_modify_ring,
-					dev_op_jumbo_frames);
+					dev_op_dqo_rda, dev_op_flow_steering,
+					dev_op_modify_ring, dev_op_jumbo_frames);
 		dev_opt = next_opt;
 	}
 
@@ -777,9 +795,19 @@ gve_set_max_desc_cnt(struct gve_priv *priv,
 
 static void gve_enable_supported_features(struct gve_priv *priv,
 	u32 supported_features_mask,
+	const struct gve_device_option_flow_steering *dev_op_flow_steering,
 	const struct gve_device_option_modify_ring *dev_op_modify_ring,
 	const struct gve_device_option_jumbo_frames *dev_op_jumbo_frames)
 {
+	if (dev_op_flow_steering &&
+	    (supported_features_mask & GVE_SUP_FLOW_STEERING_MASK) &&
+	    dev_op_flow_steering->max_flow_rules) {
+		priv->max_flow_rules =
+			be32_to_cpu(dev_op_flow_steering->max_flow_rules);
+		PMD_DRV_LOG(INFO,
+			    "FLOW STEERING device option enabled with max rule limit of %u.",
+			    priv->max_flow_rules);
+	}
 	if (dev_op_modify_ring &&
 	    (supported_features_mask & GVE_SUP_MODIFY_RING_MASK)) {
 		PMD_DRV_LOG(INFO, "MODIFY RING device option enabled.");
@@ -802,6 +830,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 {
 	struct gve_device_option_jumbo_frames *dev_op_jumbo_frames = NULL;
 	struct gve_device_option_modify_ring *dev_op_modify_ring = NULL;
+	struct gve_device_option_flow_steering *dev_op_flow_steering = NULL;
 	struct gve_device_option_gqi_rda *dev_op_gqi_rda = NULL;
 	struct gve_device_option_gqi_qpl *dev_op_gqi_qpl = NULL;
 	struct gve_device_option_dqo_rda *dev_op_dqo_rda = NULL;
@@ -829,6 +858,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 
 	err = gve_process_device_options(priv, descriptor, &dev_op_gqi_rda,
 					 &dev_op_gqi_qpl, &dev_op_dqo_rda,
+					 &dev_op_flow_steering,
 					 &dev_op_modify_ring,
 					 &dev_op_jumbo_frames);
 	if (err)
@@ -884,7 +914,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 	priv->default_num_queues = be16_to_cpu(descriptor->default_num_queues);
 
 	gve_enable_supported_features(priv, supported_features_mask,
-				      dev_op_modify_ring,
+				      dev_op_flow_steering, dev_op_modify_ring,
 				      dev_op_jumbo_frames);
 
 free_device_descriptor:
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index 6a3d469..e237353 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -117,6 +117,14 @@ struct gve_ring_size_bound {
 
 GVE_CHECK_STRUCT_LEN(4, gve_ring_size_bound);
 
+struct gve_device_option_flow_steering {
+	__be32 supported_features_mask;
+	__be32 reserved;
+	__be32 max_flow_rules;
+};
+
+GVE_CHECK_STRUCT_LEN(12, gve_device_option_flow_steering);
+
 struct gve_device_option_modify_ring {
 	__be32 supported_features_mask;
 	struct gve_ring_size_bound max_ring_size;
@@ -148,6 +156,7 @@ enum gve_dev_opt_id {
 	GVE_DEV_OPT_ID_DQO_RDA = 0x4,
 	GVE_DEV_OPT_ID_MODIFY_RING = 0x6,
 	GVE_DEV_OPT_ID_JUMBO_FRAMES = 0x8,
+	GVE_DEV_OPT_ID_FLOW_STEERING = 0xb,
 };
 
 enum gve_dev_opt_req_feat_mask {
@@ -155,6 +164,7 @@ enum gve_dev_opt_req_feat_mask {
 	GVE_DEV_OPT_REQ_FEAT_MASK_GQI_RDA = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_GQI_QPL = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_DQO_RDA = 0x0,
+	GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_MODIFY_RING = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_JUMBO_FRAMES = 0x0,
 };
@@ -162,6 +172,7 @@ enum gve_dev_opt_req_feat_mask {
 enum gve_sup_feature_mask {
 	GVE_SUP_MODIFY_RING_MASK = 1 << 0,
 	GVE_SUP_JUMBO_FRAMES_MASK = 1 << 2,
+	GVE_SUP_FLOW_STEERING_MASK = 1 << 5,
 };
 
 #define GVE_DEV_OPT_LEN_GQI_RAW_ADDRESSING 0x0
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index f7cc781..3a810b6 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -332,6 +332,8 @@ struct gve_priv {
 
 	struct gve_rss_config rss_config;
 	struct gve_ptype_lut *ptype_lut_dqo;
+
+	uint32_t max_flow_rules;
 };
 
 static inline bool
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 2/4] net/gve: introduce extended adminq command
  2026-03-04  1:46   ` [PATCH v3 " Jasper Tran O'Leary
  2026-03-04  1:46     ` [PATCH v3 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
@ 2026-03-04  1:46     ` Jasper Tran O'Leary
  2026-03-04  1:46     ` [PATCH v3 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  1:46 UTC (permalink / raw)
  To: stephen; +Cc: dev, Vee Agarwal, Jasper Tran O'Leary, Joshua Washington

From: Vee Agarwal <veethebee@google.com>

Flow steering adminq commands are too large to fit into a normal adminq
command buffer which accepts at most 56 bytes. As a result, introduce
extended adminq commands which permit larger command buffers using
indirection. Namely, extended command operations point to inner command
buffers allocated at a specified DMA address. As specified by the
device interface, all extended commands use inner opcodes larger than 0xFF.

Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 30 ++++++++++++++++++++++++++++++
 drivers/net/gve/base/gve_adminq.h | 16 ++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 64b9468..0cc6d44 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -438,6 +438,8 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv,
 
 	memcpy(cmd, cmd_orig, sizeof(*cmd_orig));
 	opcode = be32_to_cpu(READ_ONCE32(cmd->opcode));
+	if (opcode == GVE_ADMINQ_EXTENDED_COMMAND)
+		opcode = be32_to_cpu(READ_ONCE32(cmd->extended_command.inner_opcode));
 
 	switch (opcode) {
 	case GVE_ADMINQ_DESCRIBE_DEVICE:
@@ -516,6 +518,34 @@ static int gve_adminq_execute_cmd(struct gve_priv *priv,
 	return gve_adminq_kick_and_wait(priv);
 }
 
+static int gve_adminq_execute_extended_cmd(struct gve_priv *priv, u32 opcode,
+					   size_t cmd_size, void *cmd_orig)
+{
+	union gve_adminq_command cmd;
+	struct gve_dma_mem inner_cmd_dma_mem;
+	void *inner_cmd;
+	int err;
+
+	inner_cmd = gve_alloc_dma_mem(&inner_cmd_dma_mem, cmd_size);
+	if (!inner_cmd)
+		return -ENOMEM;
+
+	memcpy(inner_cmd, cmd_orig, cmd_size);
+
+	memset(&cmd, 0, sizeof(cmd));
+	cmd.opcode = cpu_to_be32(GVE_ADMINQ_EXTENDED_COMMAND);
+	cmd.extended_command = (struct gve_adminq_extended_command) {
+		.inner_opcode = cpu_to_be32(opcode),
+		.inner_length = cpu_to_be32(cmd_size),
+		.inner_command_addr = cpu_to_be64(inner_cmd_dma_mem.pa),
+	};
+
+	err = gve_adminq_execute_cmd(priv, &cmd);
+
+	gve_free_dma_mem(&inner_cmd_dma_mem);
+	return err;
+}
+
 /* The device specifies that the management vector can either be the first irq
  * or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
  * the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index e237353..f52658e 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -25,8 +25,15 @@ enum gve_adminq_opcodes {
 	GVE_ADMINQ_REPORT_LINK_SPEED		= 0xD,
 	GVE_ADMINQ_GET_PTYPE_MAP		= 0xE,
 	GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY	= 0xF,
+	/* For commands that are larger than 56 bytes */
+	GVE_ADMINQ_EXTENDED_COMMAND		= 0xFF,
 };
 
+/* The normal adminq command is restricted to be 56 bytes at maximum. For the
+ * longer adminq command, it is wrapped by GVE_ADMINQ_EXTENDED_COMMAND with
+ * inner opcode of gve_adminq_extended_cmd_opcodes specified. The inner command
+ * is written in the dma memory allocated by GVE_ADMINQ_EXTENDED_COMMAND.
+ */
 /* Admin queue status codes */
 enum gve_adminq_statuses {
 	GVE_ADMINQ_COMMAND_UNSET			= 0x0,
@@ -194,6 +201,14 @@ enum gve_driver_capbility {
 #define GVE_DRIVER_CAPABILITY_FLAGS3 0x0
 #define GVE_DRIVER_CAPABILITY_FLAGS4 0x0
 
+struct gve_adminq_extended_command {
+	__be32 inner_opcode;
+	__be32 inner_length;
+	__be64 inner_command_addr;
+};
+
+GVE_CHECK_STRUCT_LEN(16, gve_adminq_extended_command);
+
 struct gve_driver_info {
 	u8 os_type;	/* 0x05 = DPDK */
 	u8 driver_major;
@@ -440,6 +455,7 @@ union gve_adminq_command {
 			struct gve_adminq_get_ptype_map get_ptype_map;
 			struct gve_adminq_verify_driver_compatibility
 				verify_driver_compatibility;
+			struct gve_adminq_extended_command extended_command;
 		};
 	};
 	u8 reserved[64];
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 3/4] net/gve: add adminq commands for flow steering
  2026-03-04  1:46   ` [PATCH v3 " Jasper Tran O'Leary
  2026-03-04  1:46     ` [PATCH v3 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
  2026-03-04  1:46     ` [PATCH v3 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
@ 2026-03-04  1:46     ` Jasper Tran O'Leary
  2026-03-04  1:46     ` [PATCH v3 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  1:46 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Vee Agarwal, Joshua Washington

Add new adminq commands for the driver to configure flow rules that are
stored in the device. For configuring flow rules, 3 sub commands are
supported.
- create: creates a new flow rule with a specific rule_id.
- destroy: deletes an existing flow rule with the specified rule_id.
- flush: clears and deletes all currently active flow rules.

Co-developed-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 52 +++++++++++++++++++++++++++
 drivers/net/gve/base/gve_adminq.h | 30 ++++++++++++++++
 drivers/net/gve/gve_ethdev.h      |  1 +
 drivers/net/gve/gve_flow_rule.h   | 59 +++++++++++++++++++++++++++++++
 4 files changed, 142 insertions(+)
 create mode 100644 drivers/net/gve/gve_flow_rule.h

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 0cc6d44..9a94591 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -239,6 +239,7 @@ int gve_adminq_alloc(struct gve_priv *priv)
 	priv->adminq_report_stats_cnt = 0;
 	priv->adminq_report_link_speed_cnt = 0;
 	priv->adminq_get_ptype_map_cnt = 0;
+	priv->adminq_cfg_flow_rule_cnt = 0;
 
 	/* Setup Admin queue with the device */
 	rte_pci_read_config(priv->pci_dev, &pci_rev_id, sizeof(pci_rev_id),
@@ -487,6 +488,9 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv,
 	case GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY:
 		priv->adminq_verify_driver_compatibility_cnt++;
 		break;
+	case GVE_ADMINQ_CONFIGURE_FLOW_RULE:
+		priv->adminq_cfg_flow_rule_cnt++;
+		break;
 	default:
 		PMD_DRV_LOG(ERR, "unknown AQ command opcode %d", opcode);
 	}
@@ -546,6 +550,54 @@ static int gve_adminq_execute_extended_cmd(struct gve_priv *priv, u32 opcode,
 	return err;
 }
 
+static int
+gve_adminq_configure_flow_rule(struct gve_priv *priv,
+			       struct gve_adminq_configure_flow_rule *flow_rule_cmd)
+{
+	int err = gve_adminq_execute_extended_cmd(priv,
+			GVE_ADMINQ_CONFIGURE_FLOW_RULE,
+			sizeof(struct gve_adminq_configure_flow_rule),
+			flow_rule_cmd);
+
+	return err;
+}
+
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+			     struct gve_flow_rule_params *rule, u32 loc)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_ADD),
+		.location = cpu_to_be32(loc),
+		.rule = {
+			.flow_type = cpu_to_be16(rule->flow_type),
+			.action = cpu_to_be16(rule->action),
+			.key = rule->key,
+			.mask = rule->mask,
+		},
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_del_flow_rule(struct gve_priv *priv, u32 loc)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_DEL),
+		.location = cpu_to_be32(loc),
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_reset_flow_rules(struct gve_priv *priv)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_RESET),
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
 /* The device specifies that the management vector can either be the first irq
  * or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
  * the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index f52658e..d8e5e6a 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -7,6 +7,7 @@
 #define _GVE_ADMINQ_H
 
 #include "gve_osdep.h"
+#include "../gve_flow_rule.h"
 
 /* Admin queue opcodes */
 enum gve_adminq_opcodes {
@@ -34,6 +35,10 @@ enum gve_adminq_opcodes {
  * inner opcode of gve_adminq_extended_cmd_opcodes specified. The inner command
  * is written in the dma memory allocated by GVE_ADMINQ_EXTENDED_COMMAND.
  */
+enum gve_adminq_extended_cmd_opcodes {
+	GVE_ADMINQ_CONFIGURE_FLOW_RULE	= 0x101,
+};
+
 /* Admin queue status codes */
 enum gve_adminq_statuses {
 	GVE_ADMINQ_COMMAND_UNSET			= 0x0,
@@ -434,6 +439,26 @@ struct gve_adminq_configure_rss {
 	__be64 indir_addr;
 };
 
+/* Flow rule definition for the admin queue using network byte order (big
+ * endian). This struct represents the hardware wire format and should not be
+ * used outside of admin queue contexts.
+ */
+struct gve_adminq_flow_rule {
+	__be16 flow_type;
+	__be16 action; /* RX queue id */
+	struct gve_flow_spec key;
+	struct gve_flow_spec mask;
+};
+
+struct gve_adminq_configure_flow_rule {
+	__be16 opcode;
+	u8 padding[2];
+	struct gve_adminq_flow_rule rule;
+	__be32 location;
+};
+
+GVE_CHECK_STRUCT_LEN(92, gve_adminq_configure_flow_rule);
+
 union gve_adminq_command {
 	struct {
 		__be32 opcode;
@@ -499,4 +524,9 @@ int gve_adminq_verify_driver_compatibility(struct gve_priv *priv,
 int gve_adminq_configure_rss(struct gve_priv *priv,
 			     struct gve_rss_config *rss_config);
 
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+			     struct gve_flow_rule_params *rule, u32 loc);
+int gve_adminq_del_flow_rule(struct gve_priv *priv, u32 loc);
+int gve_adminq_reset_flow_rules(struct gve_priv *priv);
+
 #endif /* _GVE_ADMINQ_H */
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index 3a810b6..4e07ca8 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -314,6 +314,7 @@ struct gve_priv {
 	uint32_t adminq_report_link_speed_cnt;
 	uint32_t adminq_get_ptype_map_cnt;
 	uint32_t adminq_verify_driver_compatibility_cnt;
+	uint32_t adminq_cfg_flow_rule_cnt;
 	volatile uint32_t state_flags;
 
 	/* Gvnic device link speed from hypervisor. */
diff --git a/drivers/net/gve/gve_flow_rule.h b/drivers/net/gve/gve_flow_rule.h
new file mode 100644
index 0000000..8c17ddd
--- /dev/null
+++ b/drivers/net/gve/gve_flow_rule.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2026 Google LLC
+ */
+
+#ifndef _GVE_FLOW_RULE_H_
+#define _GVE_FLOW_RULE_H_
+
+#include "base/gve_osdep.h"
+
+enum gve_adminq_flow_rule_cfg_opcode {
+	GVE_FLOW_RULE_CFG_ADD	= 0,
+	GVE_FLOW_RULE_CFG_DEL	= 1,
+	GVE_FLOW_RULE_CFG_RESET	= 2,
+};
+
+enum gve_adminq_flow_type {
+	GVE_FLOW_TYPE_TCPV4,
+	GVE_FLOW_TYPE_UDPV4,
+	GVE_FLOW_TYPE_SCTPV4,
+	GVE_FLOW_TYPE_AHV4,
+	GVE_FLOW_TYPE_ESPV4,
+	GVE_FLOW_TYPE_TCPV6,
+	GVE_FLOW_TYPE_UDPV6,
+	GVE_FLOW_TYPE_SCTPV6,
+	GVE_FLOW_TYPE_AHV6,
+	GVE_FLOW_TYPE_ESPV6,
+};
+
+struct gve_flow_spec {
+	__be32 src_ip[4];
+	__be32 dst_ip[4];
+	union {
+		struct {
+			__be16 src_port;
+			__be16 dst_port;
+		};
+		__be32 spi;
+	};
+	union {
+		u8 tos;
+		u8 tclass;
+	};
+};
+
+/* Flow rule parameters using mixed endianness.
+ * - flow_type and action are guest endian.
+ * - key and mask are in network byte order (big endian), matching rte_flow.
+ * This struct is used by the driver when validating and creating flow rules;
+ * guest endian fields are only converted to network byte order within admin
+ * queue functions.
+ */
+struct gve_flow_rule_params {
+	u16 flow_type;
+	u16 action; /* RX queue id */
+	struct gve_flow_spec key;
+	struct gve_flow_spec mask;
+};
+
+#endif /* _GVE_FLOW_RULE_H_ */
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v3 4/4] net/gve: add rte flow API integration
  2026-03-04  1:46   ` [PATCH v3 " Jasper Tran O'Leary
                       ` (2 preceding siblings ...)
  2026-03-04  1:46     ` [PATCH v3 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
@ 2026-03-04  1:46     ` Jasper Tran O'Leary
  2026-03-04  4:46     ` [PATCH v3 0/4] net/gve: add flow steering support Jasper Tran O'Leary
  2026-03-04  4:50     ` [PATCH v4 " Jasper Tran O'Leary
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  1:46 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Vee Agarwal, Joshua Washington

Implement driver callbacks for the following rte flow operations:
create, destroy, and flush. This change enables receive flow steering
(RFS) for n-tuple based flow rules for the gve driver.

The implementation supports matching ingress IPv4/IPv6 traffic combined
with TCP, UDP, SCTP, ESP, or AH protocols. Supported fields for
matching include IP source/destination addresses, L4 source/destination
ports (for TCP/UDP/SCTP), and SPI (for ESP/AH). The only supported
action is RTE_FLOW_ACTION_TYPE_QUEUE, which steers matching packets to
a specified Rx queue.
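
For illustration, a rule of this shape could be exercised from testpmd's flow command line (a sketch only; the addresses, ports, and queue index below are arbitrary examples, not values from this series):

```
testpmd> flow create 0 ingress pattern ipv4 src is 192.0.2.1 dst is 198.51.100.1 / tcp src is 1000 dst is 2000 / end actions queue index 3 / end
```

With such a rule installed, matching ingress packets should arrive on Rx queue 3 rather than the queue chosen by RSS.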

Co-developed-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 doc/guides/nics/features/gve.ini       |  12 +
 doc/guides/nics/gve.rst                |  27 +
 doc/guides/rel_notes/release_26_03.rst |   1 +
 drivers/net/gve/base/gve.h             |   3 +-
 drivers/net/gve/gve_ethdev.c           |  83 +++-
 drivers/net/gve/gve_ethdev.h           |  43 ++
 drivers/net/gve/gve_flow_rule.c        | 658 +++++++++++++++++++++++++
 drivers/net/gve/gve_flow_rule.h        |   6 +
 drivers/net/gve/meson.build            |   1 +
 9 files changed, 832 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/gve/gve_flow_rule.c

diff --git a/doc/guides/nics/features/gve.ini b/doc/guides/nics/features/gve.ini
index ed040a0..89c97fd 100644
--- a/doc/guides/nics/features/gve.ini
+++ b/doc/guides/nics/features/gve.ini
@@ -19,3 +19,15 @@ Linux                = Y
 x86-32               = Y
 x86-64               = Y
 Usage doc            = Y
+
+[rte_flow items]
+ah                   = Y
+esp                  = Y
+ipv4                 = Y
+ipv6                 = Y
+sctp                 = Y
+tcp                  = Y
+udp                  = Y
+
+[rte_flow actions]
+queue                = Y
diff --git a/doc/guides/nics/gve.rst b/doc/guides/nics/gve.rst
index 6b4d1f7..8367ca9 100644
--- a/doc/guides/nics/gve.rst
+++ b/doc/guides/nics/gve.rst
@@ -103,6 +103,33 @@ the redirection table will be available for querying upon initial hash configura
 When performing redirection table updates,
 it is possible to update individual table entries.
 
+Flow Steering
+^^^^^^^^^^^^^
+
+The driver supports receive flow steering (RFS) via the standard ``rte_flow``
+API. This allows applications to steer traffic to specific queues based on
+5-tuple matching. 3-tuple matching may be supported in future releases.
+
+**Supported Patterns**
+
+L3 Protocols
+  IPv4/IPv6 source and destination addresses.
+L4 Protocols
+  TCP/UDP/SCTP source and destination ports.
+Security Protocols
+  ESP/AH SPI.
+
+**Supported Actions**
+
+- ``RTE_FLOW_ACTION_TYPE_QUEUE``: Steer packets to a specific Rx queue.
+
+**Limitations**
+
+- Flow steering operations are only supported in the primary process.
+- Only ingress flow rules are allowed.
+- Flow priorities are not supported (must be 0).
+- Masking is limited to full matches, i.e. all-zeroes (0x00...0) or all-ones (0xFF...F).
+
 Application-Initiated Reset
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The driver allows an application to reset the gVNIC device.
diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst
index 1855d90..b643809 100644
--- a/doc/guides/rel_notes/release_26_03.rst
+++ b/doc/guides/rel_notes/release_26_03.rst
@@ -78,6 +78,7 @@ New Features
 * **Updated Google Virtual Ethernet (gve) driver.**
 
   * Added application-initiated device reset.
+  * Added support for receive flow steering.
 
 * **Updated Intel iavf driver.**
 
diff --git a/drivers/net/gve/base/gve.h b/drivers/net/gve/base/gve.h
index 99514cb..18363fa 100644
--- a/drivers/net/gve/base/gve.h
+++ b/drivers/net/gve/base/gve.h
@@ -50,7 +50,8 @@ enum gve_state_flags_bit {
 	GVE_PRIV_FLAGS_ADMIN_QUEUE_OK		= 1,
 	GVE_PRIV_FLAGS_DEVICE_RESOURCES_OK	= 2,
 	GVE_PRIV_FLAGS_DEVICE_RINGS_OK		= 3,
-	GVE_PRIV_FLAGS_NAPI_ENABLED		= 4,
+	GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK	= 4,
+	GVE_PRIV_FLAGS_NAPI_ENABLED		= 5,
 };
 
 enum gve_rss_hash_algorithm {
diff --git a/drivers/net/gve/gve_ethdev.c b/drivers/net/gve/gve_ethdev.c
index 5912fec..6ce3ef3 100644
--- a/drivers/net/gve/gve_ethdev.c
+++ b/drivers/net/gve/gve_ethdev.c
@@ -510,6 +510,49 @@ gve_free_ptype_lut_dqo(struct gve_priv *priv)
 	}
 }
 
+static int
+gve_setup_flow_subsystem(struct gve_priv *priv)
+{
+	int err;
+
+	priv->flow_rule_bmp_size =
+			rte_bitmap_get_memory_footprint(priv->max_flow_rules);
+	priv->avail_flow_rule_bmp_mem = rte_zmalloc("gve_flow_rule_bmp",
+			priv->flow_rule_bmp_size, 0);
+	if (!priv->avail_flow_rule_bmp_mem) {
+		PMD_DRV_LOG(ERR, "Failed to alloc bitmap for flow rules.");
+		err = -ENOMEM;
+		goto free_flow_rule_bmp;
+	}
+
+	err = gve_flow_init_bmp(priv);
+	if (err) {
+		PMD_DRV_LOG(ERR, "Failed to initialize flow rule bitmap.");
+		goto free_flow_rule_bmp;
+	}
+
+	TAILQ_INIT(&priv->active_flows);
+	gve_set_flow_subsystem_ok(priv);
+
+	return 0;
+
+free_flow_rule_bmp:
+	gve_flow_free_bmp(priv);
+	return err;
+}
+
+static void
+gve_teardown_flow_subsystem(struct gve_priv *priv)
+{
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	gve_clear_flow_subsystem_ok(priv);
+	gve_flow_free_bmp(priv);
+	gve_free_flow_rules(priv);
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+}
+
 static void
 gve_teardown_device_resources(struct gve_priv *priv)
 {
@@ -519,7 +562,9 @@ gve_teardown_device_resources(struct gve_priv *priv)
 	if (gve_get_device_resources_ok(priv)) {
 		err = gve_adminq_deconfigure_device_resources(priv);
 		if (err)
-			PMD_DRV_LOG(ERR, "Could not deconfigure device resources: err=%d", err);
+			PMD_DRV_LOG(ERR,
+				"Could not deconfigure device resources: err=%d",
+				err);
 	}
 
 	gve_free_ptype_lut_dqo(priv);
@@ -543,6 +588,11 @@ gve_dev_close(struct rte_eth_dev *dev)
 			PMD_DRV_LOG(ERR, "Failed to stop dev.");
 	}
 
+	if (gve_get_flow_subsystem_ok(priv))
+		gve_teardown_flow_subsystem(priv);
+
+	pthread_mutex_destroy(&priv->flow_rule_lock);
+
 	gve_free_queues(dev);
 	gve_teardown_device_resources(priv);
 	gve_adminq_free(priv);
@@ -566,6 +616,9 @@ gve_dev_reset(struct rte_eth_dev *dev)
 	}
 
 	/* Tear down all device resources before re-initializing. */
+	if (gve_get_flow_subsystem_ok(priv))
+		gve_teardown_flow_subsystem(priv);
+
 	gve_free_queues(dev);
 	gve_teardown_device_resources(priv);
 	gve_adminq_free(priv);
@@ -1094,6 +1147,18 @@ gve_rss_reta_query(struct rte_eth_dev *dev,
 	return 0;
 }
 
+static int
+gve_flow_ops_get(struct rte_eth_dev *dev, const struct rte_flow_ops **ops)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+
+	if (!gve_get_flow_subsystem_ok(priv))
+		return -ENOTSUP;
+
+	*ops = &gve_flow_ops;
+	return 0;
+}
+
 static const struct eth_dev_ops gve_eth_dev_ops = {
 	.dev_configure        = gve_dev_configure,
 	.dev_start            = gve_dev_start,
@@ -1109,6 +1174,7 @@ static const struct eth_dev_ops gve_eth_dev_ops = {
 	.tx_queue_start       = gve_tx_queue_start,
 	.rx_queue_stop        = gve_rx_queue_stop,
 	.tx_queue_stop        = gve_tx_queue_stop,
+	.flow_ops_get         = gve_flow_ops_get,
 	.link_update          = gve_link_update,
 	.stats_get            = gve_dev_stats_get,
 	.stats_reset          = gve_dev_stats_reset,
@@ -1136,6 +1202,7 @@ static const struct eth_dev_ops gve_eth_dev_ops_dqo = {
 	.tx_queue_start       = gve_tx_queue_start_dqo,
 	.rx_queue_stop        = gve_rx_queue_stop_dqo,
 	.tx_queue_stop        = gve_tx_queue_stop_dqo,
+	.flow_ops_get         = gve_flow_ops_get,
 	.link_update          = gve_link_update,
 	.stats_get            = gve_dev_stats_get,
 	.stats_reset          = gve_dev_stats_reset,
@@ -1303,6 +1370,14 @@ gve_init_priv(struct gve_priv *priv, bool skip_describe_device)
 		    priv->max_nb_txq, priv->max_nb_rxq);
 
 setup_device:
+	if (priv->max_flow_rules) {
+		err = gve_setup_flow_subsystem(priv);
+		if (err)
+			PMD_DRV_LOG(WARNING,
+				    "Failed to set up flow subsystem: err=%d, flow steering will be disabled.",
+				    err);
+	}
+
 	err = gve_setup_device_resources(priv);
 	if (!err)
 		return 0;
@@ -1318,6 +1393,7 @@ gve_dev_init(struct rte_eth_dev *eth_dev)
 	int max_tx_queues, max_rx_queues;
 	struct rte_pci_device *pci_dev;
 	struct gve_registers *reg_bar;
+	pthread_mutexattr_t mutexattr;
 	rte_be32_t *db_bar;
 	int err;
 
@@ -1377,6 +1453,11 @@ gve_dev_init(struct rte_eth_dev *eth_dev)
 
 	eth_dev->data->mac_addrs = &priv->dev_addr;
 
+	pthread_mutexattr_init(&mutexattr);
+	pthread_mutexattr_setpshared(&mutexattr, PTHREAD_PROCESS_SHARED);
+	pthread_mutex_init(&priv->flow_rule_lock, &mutexattr);
+	pthread_mutexattr_destroy(&mutexattr);
+
 	return 0;
 }
 
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index 4e07ca8..2d570d0 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -9,6 +9,8 @@
 #include <ethdev_pci.h>
 #include <rte_ether.h>
 #include <rte_pci.h>
+#include <pthread.h>
+#include <rte_bitmap.h>
 
 #include "base/gve.h"
 
@@ -252,6 +254,13 @@ struct gve_rx_queue {
 	uint8_t is_gqi_qpl;
 };
 
+struct gve_flow {
+	uint32_t rule_id;
+	TAILQ_ENTRY(gve_flow) list_handle;
+};
+
+extern const struct rte_flow_ops gve_flow_ops;
+
 struct gve_priv {
 	struct gve_irq_db *irq_dbs; /* array of num_ntfy_blks */
 	const struct rte_memzone *irq_dbs_mz;
@@ -334,7 +343,13 @@ struct gve_priv {
 	struct gve_rss_config rss_config;
 	struct gve_ptype_lut *ptype_lut_dqo;
 
+	/* Flow rule management */
 	uint32_t max_flow_rules;
+	uint32_t flow_rule_bmp_size;
+	struct rte_bitmap *avail_flow_rule_bmp; /* Tracks available rule IDs (1 = available) */
+	void *avail_flow_rule_bmp_mem; /* Backing memory for the bitmap */
+	pthread_mutex_t flow_rule_lock; /* Lock for bitmap and tailq access */
+	TAILQ_HEAD(, gve_flow) active_flows;
 };
 
 static inline bool
@@ -407,6 +422,34 @@ gve_clear_device_rings_ok(struct gve_priv *priv)
 				&priv->state_flags);
 }
 
+static inline bool
+gve_get_flow_subsystem_ok(struct gve_priv *priv)
+{
+	bool ret;
+
+	ret = !!rte_bit_relaxed_get32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+				      &priv->state_flags);
+	rte_atomic_thread_fence(rte_memory_order_acquire);
+
+	return ret;
+}
+
+static inline void
+gve_set_flow_subsystem_ok(struct gve_priv *priv)
+{
+	rte_atomic_thread_fence(rte_memory_order_release);
+	rte_bit_relaxed_set32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+			      &priv->state_flags);
+}
+
+static inline void
+gve_clear_flow_subsystem_ok(struct gve_priv *priv)
+{
+	rte_atomic_thread_fence(rte_memory_order_release);
+	rte_bit_relaxed_clear32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+				&priv->state_flags);
+}
+
 int
 gve_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_id, uint16_t nb_desc,
 		   unsigned int socket_id, const struct rte_eth_rxconf *conf,
diff --git a/drivers/net/gve/gve_flow_rule.c b/drivers/net/gve/gve_flow_rule.c
new file mode 100644
index 0000000..af75ae8
--- /dev/null
+++ b/drivers/net/gve/gve_flow_rule.c
@@ -0,0 +1,658 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2026 Google LLC
+ */
+
+#include <rte_flow.h>
+#include <rte_flow_driver.h>
+#include "base/gve_adminq.h"
+#include "gve_ethdev.h"
+
+static int
+gve_validate_flow_attr(const struct rte_flow_attr *attr,
+		       struct rte_flow_error *error)
+{
+	if (attr == NULL) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR, NULL,
+				"Invalid flow attribute");
+		return -EINVAL;
+	}
+	if (attr->egress || attr->transfer) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR, attr,
+				"Only ingress is supported");
+		return -EINVAL;
+	}
+	if (!attr->ingress) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, attr,
+				"Ingress attribute must be set");
+		return -EINVAL;
+	}
+	if (attr->priority != 0) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, attr,
+				"Priority levels are not supported");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void
+gve_parse_ipv4(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ipv4 *spec = item->spec;
+		const struct rte_flow_item_ipv4 *mask =
+			item->mask ? item->mask : &rte_flow_item_ipv4_mask;
+
+		rule->key.src_ip[0] = spec->hdr.src_addr;
+		rule->key.dst_ip[0] = spec->hdr.dst_addr;
+		rule->mask.src_ip[0] = mask->hdr.src_addr;
+		rule->mask.dst_ip[0] = mask->hdr.dst_addr;
+	}
+}
+
+static void
+gve_parse_ipv6(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ipv6 *spec = item->spec;
+		const struct rte_flow_item_ipv6 *mask =
+			item->mask ? item->mask : &rte_flow_item_ipv6_mask;
+		const __be32 *src_ip = (const __be32 *)&spec->hdr.src_addr;
+		const __be32 *src_mask = (const __be32 *)&mask->hdr.src_addr;
+		const __be32 *dst_ip = (const __be32 *)&spec->hdr.dst_addr;
+		const __be32 *dst_mask = (const __be32 *)&mask->hdr.dst_addr;
+		int i;
+
+		/*
+		 * The device expects IPv6 addresses as an array of 4 32-bit words
+		 * in reverse word order (the MSB word at index 3 and the LSB word
+		 * at index 0). We must reverse the DPDK network byte order array.
+		 */
+		for (i = 0; i < 4; i++) {
+			rule->key.src_ip[3 - i] = src_ip[i];
+			rule->key.dst_ip[3 - i] = dst_ip[i];
+			rule->mask.src_ip[3 - i] = src_mask[i];
+			rule->mask.dst_ip[3 - i] = dst_mask[i];
+		}
+	}
+}
+
+static void
+gve_parse_udp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_udp *spec = item->spec;
+		const struct rte_flow_item_udp *mask =
+			item->mask ? item->mask : &rte_flow_item_udp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_tcp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_tcp *spec = item->spec;
+		const struct rte_flow_item_tcp *mask =
+			item->mask ? item->mask : &rte_flow_item_tcp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_sctp(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_sctp *spec = item->spec;
+		const struct rte_flow_item_sctp *mask =
+			item->mask ? item->mask : &rte_flow_item_sctp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_esp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_esp *spec = item->spec;
+		const struct rte_flow_item_esp *mask =
+			item->mask ? item->mask : &rte_flow_item_esp_mask;
+
+		rule->key.spi = spec->hdr.spi;
+		rule->mask.spi = mask->hdr.spi;
+	}
+}
+
+static void
+gve_parse_ah(const struct rte_flow_item *item, struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ah *spec = item->spec;
+		const struct rte_flow_item_ah *mask =
+			item->mask ? item->mask : &rte_flow_item_ah_mask;
+
+		rule->key.spi = spec->spi;
+		rule->mask.spi = mask->spi;
+	}
+}
+
+static int
+gve_validate_and_parse_flow_pattern(const struct rte_flow_item pattern[],
+				    struct rte_flow_error *error,
+				    struct gve_flow_rule_params *rule)
+{
+	const struct rte_flow_item *item = pattern;
+	enum rte_flow_item_type l3_type = RTE_FLOW_ITEM_TYPE_VOID;
+	enum rte_flow_item_type l4_type = RTE_FLOW_ITEM_TYPE_VOID;
+
+	if (pattern == NULL) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ITEM_NUM, NULL,
+				"Invalid flow pattern");
+		return -EINVAL;
+	}
+
+	for (; item->type != RTE_FLOW_ITEM_TYPE_END; item++) {
+		if (item->last) {
+			/* Last and range are not supported as match criteria. */
+			rte_flow_error_set(error, EINVAL,
+					   RTE_FLOW_ERROR_TYPE_ITEM,
+					   item,
+					   "No support for range");
+			return -EINVAL;
+		}
+		switch (item->type) {
+		case RTE_FLOW_ITEM_TYPE_VOID:
+			break;
+		case RTE_FLOW_ITEM_TYPE_IPV4:
+			if (l3_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L3 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ipv4(item, rule);
+			l3_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_IPV6:
+			if (l3_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L3 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ipv6(item, rule);
+			l3_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_udp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_tcp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_sctp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_esp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ah(item, rule);
+			l4_type = item->type;
+			break;
+		default:
+			rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ITEM, item,
+				   "Unsupported flow pattern item type");
+			return -EINVAL;
+		}
+	}
+
+	switch (l3_type) {
+	case RTE_FLOW_ITEM_TYPE_IPV4:
+		switch (l4_type) {
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			rule->flow_type = GVE_FLOW_TYPE_TCPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			rule->flow_type = GVE_FLOW_TYPE_UDPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			rule->flow_type = GVE_FLOW_TYPE_SCTPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			rule->flow_type = GVE_FLOW_TYPE_AHV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			rule->flow_type = GVE_FLOW_TYPE_ESPV4;
+			break;
+		default:
+			goto unsupported_flow;
+		}
+		break;
+	case RTE_FLOW_ITEM_TYPE_IPV6:
+		switch (l4_type) {
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			rule->flow_type = GVE_FLOW_TYPE_TCPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			rule->flow_type = GVE_FLOW_TYPE_UDPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			rule->flow_type = GVE_FLOW_TYPE_SCTPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			rule->flow_type = GVE_FLOW_TYPE_AHV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			rule->flow_type = GVE_FLOW_TYPE_ESPV6;
+			break;
+		default:
+			goto unsupported_flow;
+		}
+		break;
+	default:
+		goto unsupported_flow;
+	}
+
+	return 0;
+
+unsupported_flow:
+	rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM,
+			   NULL, "Unsupported L3/L4 combination");
+	return -EINVAL;
+}
+
+static int
+gve_validate_and_parse_flow_actions(struct rte_eth_dev *dev,
+				    const struct rte_flow_action actions[],
+				    struct rte_flow_error *error,
+				    struct gve_flow_rule_params *rule)
+{
+	const struct rte_flow_action_queue *action_queue;
+	const struct rte_flow_action *action = actions;
+	int num_queue_actions = 0;
+
+	if (actions == NULL) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM, NULL,
+				   "Invalid flow actions");
+		return -EINVAL;
+	}
+
+	while (action->type != RTE_FLOW_ACTION_TYPE_END) {
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_VOID:
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			if (action->conf == NULL) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+						   action,
+						   "QUEUE action config cannot be NULL.");
+				return -EINVAL;
+			}
+
+			action_queue = action->conf;
+			if (action_queue->index >= dev->data->nb_rx_queues) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+						   action, "Invalid Queue ID");
+				return -EINVAL;
+			}
+
+			rule->action = action_queue->index;
+			num_queue_actions++;
+			break;
+		default:
+			rte_flow_error_set(error, ENOTSUP,
+					   RTE_FLOW_ERROR_TYPE_ACTION,
+					   action,
+					   "Unsupported action. Only QUEUE is permitted.");
+			return -ENOTSUP;
+		}
+		action++;
+	}
+
+	if (num_queue_actions == 0) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM,
+				   NULL, "A QUEUE action is required.");
+		return -EINVAL;
+	}
+
+	if (num_queue_actions > 1) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM,
+				   NULL, "Only a single QUEUE action is allowed.");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+gve_validate_and_parse_flow(struct rte_eth_dev *dev,
+			    const struct rte_flow_attr *attr,
+			    const struct rte_flow_item pattern[],
+			    const struct rte_flow_action actions[],
+			    struct rte_flow_error *error,
+			    struct gve_flow_rule_params *rule)
+{
+	int err;
+
+	err = gve_validate_flow_attr(attr, error);
+	if (err)
+		return err;
+
+	err = gve_validate_and_parse_flow_pattern(pattern, error, rule);
+	if (err)
+		return err;
+
+	err = gve_validate_and_parse_flow_actions(dev, actions, error, rule);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+int
+gve_flow_init_bmp(struct gve_priv *priv)
+{
+	priv->avail_flow_rule_bmp = rte_bitmap_init_with_all_set(priv->max_flow_rules,
+			priv->avail_flow_rule_bmp_mem, priv->flow_rule_bmp_size);
+	if (priv->avail_flow_rule_bmp == NULL) {
+		PMD_DRV_LOG(ERR, "Flow subsystem failed: cannot init bitmap.");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+void
+gve_flow_free_bmp(struct gve_priv *priv)
+{
+	rte_free(priv->avail_flow_rule_bmp_mem);
+	priv->avail_flow_rule_bmp_mem = NULL;
+	priv->avail_flow_rule_bmp = NULL;
+}
+
+/*
+ * The caller must acquire the flow rule lock before calling this function.
+ */
+int
+gve_free_flow_rules(struct gve_priv *priv)
+{
+	struct gve_flow *flow;
+	int err = 0;
+
+	if (!TAILQ_EMPTY(&priv->active_flows)) {
+		err = gve_adminq_reset_flow_rules(priv);
+		if (err) {
+			PMD_DRV_LOG(ERR,
+				"Failed to reset flow rules, internal device err=%d",
+				err);
+		}
+
+		/* Free flows even if AQ fails to avoid leaking memory. */
+		while (!TAILQ_EMPTY(&priv->active_flows)) {
+			flow = TAILQ_FIRST(&priv->active_flows);
+			TAILQ_REMOVE(&priv->active_flows, flow, list_handle);
+			free(flow);
+		}
+	}
+
+	return err;
+}
+
+static struct rte_flow *
+gve_create_flow_rule(struct rte_eth_dev *dev,
+		     const struct rte_flow_attr *attr,
+		     const struct rte_flow_item pattern[],
+		     const struct rte_flow_action actions[],
+		     struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	struct gve_flow_rule_params rule = {0};
+	struct gve_flow *flow;
+	uint64_t slab_bits;
+	uint32_t slab_idx;
+	int err;
+
+	err = gve_validate_and_parse_flow(dev, attr, pattern, actions, error,
+					  &rule);
+	if (err)
+		return NULL;
+
+	flow = calloc(1, sizeof(struct gve_flow));
+	if (flow == NULL) {
+		rte_flow_error_set(error, ENOMEM,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to allocate memory for flow rule.");
+		return NULL;
+	}
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to create flow, flow subsystem not initialized.");
+		goto free_flow_and_unlock;
+	}
+
+	/* Try to allocate a new rule ID from the bitmap. */
+	if (rte_bitmap_scan(priv->avail_flow_rule_bmp, &slab_idx,
+			&slab_bits) == 1) {
+		flow->rule_id = slab_idx + __builtin_ctzll(slab_bits);
+		rte_bitmap_clear(priv->avail_flow_rule_bmp, flow->rule_id);
+	} else {
+		rte_flow_error_set(error, ENOMEM,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to create flow, could not allocate a new rule ID.");
+		goto free_flow_and_unlock;
+	}
+
+	err = gve_adminq_add_flow_rule(priv, &rule, flow->rule_id);
+	if (err) {
+		rte_bitmap_set(priv->avail_flow_rule_bmp, flow->rule_id);
+		rte_flow_error_set(error, -err,
+				   RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				   "Failed to create flow rule, internal device error.");
+		goto free_flow_and_unlock;
+	}
+
+	TAILQ_INSERT_TAIL(&priv->active_flows, flow, list_handle);
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+
+	return (struct rte_flow *)flow;
+
+free_flow_and_unlock:
+	free(flow);
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return NULL;
+}
+
+static int
+gve_destroy_flow_rule(struct rte_eth_dev *dev, struct rte_flow *flow_handle,
+		      struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	struct gve_flow *flow;
+	bool flow_rule_active;
+	int err;
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, flow subsystem not initialized.");
+		err = -ENOTSUP;
+		goto unlock;
+	}
+
+	flow = (struct gve_flow *)flow_handle;
+
+	if (flow == NULL) {
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, invalid flow provided.");
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	if (flow->rule_id >= priv->max_flow_rules) {
+		PMD_DRV_LOG(ERR,
+			"Cannot destroy flow rule with invalid ID %d.",
+			flow->rule_id);
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, rule ID is invalid.");
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	flow_rule_active = !rte_bitmap_get(priv->avail_flow_rule_bmp,
+					   flow->rule_id);
+
+	if (!flow_rule_active) {
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, handle not found in active list.");
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	err = gve_adminq_del_flow_rule(priv, flow->rule_id);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, internal device error.");
+		goto unlock;
+	}
+
+	rte_bitmap_set(priv->avail_flow_rule_bmp, flow->rule_id);
+	TAILQ_REMOVE(&priv->active_flows, flow, list_handle);
+	free(flow);
+
+	err = 0;
+
+unlock:
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return err;
+}
+
+static int
+gve_flush_flow_rules(struct rte_eth_dev *dev, struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	int err;
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to flush rules, flow subsystem not initialized.");
+		err = -ENOTSUP;
+		goto unlock;
+	}
+
+	err = gve_free_flow_rules(priv);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to flush rules due to internal device error, disabling flow subsystem.");
+		goto disable_and_free;
+	}
+
+	err = gve_flow_init_bmp(priv);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to re-initialize rule ID bitmap, disabling flow subsystem.");
+		goto disable_and_free;
+	}
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+
+	return 0;
+
+disable_and_free:
+	gve_clear_flow_subsystem_ok(priv);
+	gve_flow_free_bmp(priv);
+unlock:
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return err;
+}
+
+const struct rte_flow_ops gve_flow_ops = {
+	.create = gve_create_flow_rule,
+	.destroy = gve_destroy_flow_rule,
+	.flush = gve_flush_flow_rules,
+};
diff --git a/drivers/net/gve/gve_flow_rule.h b/drivers/net/gve/gve_flow_rule.h
index 8c17ddd..d597a6c 100644
--- a/drivers/net/gve/gve_flow_rule.h
+++ b/drivers/net/gve/gve_flow_rule.h
@@ -56,4 +56,10 @@ struct gve_flow_rule_params {
 	struct gve_flow_spec mask;
 };
 
+struct gve_priv;
+
+int gve_flow_init_bmp(struct gve_priv *priv);
+void gve_flow_free_bmp(struct gve_priv *priv);
+int gve_free_flow_rules(struct gve_priv *priv);
+
 #endif /* _GVE_FLOW_RULE_H_ */
diff --git a/drivers/net/gve/meson.build b/drivers/net/gve/meson.build
index c6a9f36..7074988 100644
--- a/drivers/net/gve/meson.build
+++ b/drivers/net/gve/meson.build
@@ -16,5 +16,6 @@ sources = files(
         'gve_ethdev.c',
         'gve_version.c',
         'gve_rss.c',
+        'gve_flow_rule.c',
 )
 includes += include_directories('base')
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 0/4] net/gve: add flow steering support
  2026-03-03 15:21   ` [PATCH v2 0/4] net/gve: add flow steering support Stephen Hemminger
@ 2026-03-04  1:49     ` Jasper Tran O'Leary
  0 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  1:49 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Joshua Washington


Thank you for these notes. I submitted a v3 with the following changes.

1. Fixed bit allocation in the flow rule bitmap using the suggestion given.
2. Changed rte_zmalloc/rte_free for struct gve_flow to calloc/free. In
making this change, I realized that the driver currently would not support
flow operations from secondary processes, so I made a note of that in
gve.rst.
3. Changed both verbs to "added" to match the notes from other drivers.

On Tue, Mar 3, 2026 at 7:21 AM Stephen Hemminger <stephen@networkplumber.org>
wrote:

> On Tue,  3 Mar 2026 00:58:00 +0000
> "Jasper Tran O'Leary" <jtranoleary@google.com> wrote:
>
> > This patch series adds flow steering support to the Google Virtual
> > Ethernet (gve) driver. This functionality allows traffic to be directed
> > to specific receive queues based on user-specified flow patterns.
> >
> > The series includes foundational support for extended admin queue
> > commands needed to handle flow rules, the specific adminqueue commands
> > for flow rule management, and the integration with the DPDK rte_flow
> > API. The series adds support flow matching on the following protocols:
> > IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.
> >
> > Patch Overview:
> >
> > 1. "net/gve: add flow steering device option" checks for and enables
> >    the flow steering capability in the device options during
> >    initialization.
> > 2. "net/gve: introduce extended adminq command" adds infrastructure
> >    for sending extended admin queue commands. These commands use a
> >    flexible buffer descriptor format required for flow rule management.
> > 3. "net/gve: add adminq commands for flow steering" implements the
> >    specific admin queue commands to add and remove flow rules on the
> >    device, including handling of rule IDs and parameters.
> > 4. "net/gve: add rte flow API integration" exposes the flow steering
> >    functionality via the DPDK rte_flow API. This includes strict
> >    pattern validation, rule parsing, and lifecycle management (create,
> >    destroy, flush). It ensures thread-safe access to the flow subsystem
> >    and proper resource cleanup during device reset.
> >
> > Jasper Tran O'Leary (2):
> >   net/gve: add adminq commands for flow steering
> >   net/gve: add rte flow API integration
> >
> > Vee Agarwal (2):
> >   net/gve: add flow steering device option
> >   net/gve: introduce extended adminq command
> >
> >  doc/guides/nics/features/gve.ini       |  12 +
> >  doc/guides/nics/gve.rst                |  26 +
> >  doc/guides/rel_notes/release_26_03.rst |   1 +
> >  drivers/net/gve/base/gve.h             |   3 +-
> >  drivers/net/gve/base/gve_adminq.c      | 118 ++++-
> >  drivers/net/gve/base/gve_adminq.h      |  57 +++
> >  drivers/net/gve/gve_ethdev.c           |  83 +++-
> >  drivers/net/gve/gve_ethdev.h           |  46 ++
> >  drivers/net/gve/gve_flow_rule.c        | 656 +++++++++++++++++++++++++
> >  drivers/net/gve/gve_flow_rule.h        |  65 +++
> >  drivers/net/gve/meson.build            |   1 +
> >  11 files changed, 1063 insertions(+), 5 deletions(-)
> >  create mode 100644 drivers/net/gve/gve_flow_rule.c
> >  create mode 100644 drivers/net/gve/gve_flow_rule.h
> >
>
> Automated review spotted a few things:
>
> 1. rte_bitmap_scan usage is incorrect.
> 2. Using rte_malloc where not necessary
> 3. Grammar in the release note.
>
> I can fix the last one when merging; the second one is not a big issue
> but would be good to fix.
>
> ---
>
>
> Error: Incorrect rte_bitmap_scan usage — wrong rule ID allocation (~85%
> confidence)
>
> In gve_create_flow_rule():
>
> uint64_t bmp_slab __rte_unused;
> ...
> if (rte_bitmap_scan(priv->avail_flow_rule_bmp, &flow->rule_id,
>         &bmp_slab) == 1) {
>     rte_bitmap_clear(priv->avail_flow_rule_bmp, flow->rule_id);
> }
>
> rte_bitmap_scan() writes the slab base position to pos and the slab bit
> pattern to slab. The actual bit position of the first available rule is pos
> + __builtin_ctzll(slab), not pos alone. The __rte_unused annotation on
> bmp_slab confirms the slab value is being ignored entirely.
>
> After the first rule allocation clears bit 0, a subsequent scan returns
> the same slab base (pos=0) with bit 0 now clear in the slab. The code would
> again set rule_id=0 and attempt to clear an already-clear bit, then try to
> create a duplicate rule on the device.
>
> Suggested fix:
>
> uint64_t bmp_slab;
> uint32_t pos;
> ...
> if (rte_bitmap_scan(priv->avail_flow_rule_bmp, &pos,
>         &bmp_slab) == 1) {
>     flow->rule_id = pos + __builtin_ctzll(bmp_slab);
>     rte_bitmap_clear(priv->avail_flow_rule_bmp, flow->rule_id);
> }
>
> Warning: rte_zmalloc used for ordinary control structures
>
> Both avail_flow_rule_bmp_mem and individual gve_flow structures are
> allocated with rte_zmalloc, but they don't require DMA access, NUMA
> placement, or multi-process shared memory. Standard calloc/malloc would be
> more appropriate and wouldn't consume limited hugepage resources.
>
> Warning: Release notes tense inconsistency
>
> * Added application-initiated device reset.
> + * Add support for receive flow steering.
>
> "Added" (past tense) vs "Add" (imperative). Both entries should use the
> same tense for consistency; the existing entries in this file use past
> tense.
>
>

[-- Attachment #2: Type: text/html, Size: 6434 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v3 0/4] net/gve: add flow steering support
  2026-03-04  1:46   ` [PATCH v3 " Jasper Tran O'Leary
                       ` (3 preceding siblings ...)
  2026-03-04  1:46     ` [PATCH v3 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
@ 2026-03-04  4:46     ` Jasper Tran O'Leary
  2026-03-04  4:50     ` [PATCH v4 " Jasper Tran O'Leary
  5 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  4:46 UTC (permalink / raw)
  To: jtranoleary; +Cc: dev, joshwash, stephen

Unfortunately, some of the changes related to rte_bitmap_scan caused CI
test failures. I am submitting a v4 with fixes.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4 0/4] net/gve: add flow steering support
  2026-03-04  1:46   ` [PATCH v3 " Jasper Tran O'Leary
                       ` (4 preceding siblings ...)
  2026-03-04  4:46     ` [PATCH v3 0/4] net/gve: add flow steering support Jasper Tran O'Leary
@ 2026-03-04  4:50     ` Jasper Tran O'Leary
  2026-03-04  4:50       ` [PATCH v4 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
                         ` (4 more replies)
  5 siblings, 5 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  4:50 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Joshua Washington

This patch series adds flow steering support to the Google Virtual
Ethernet (gve) driver. This functionality allows traffic to be directed
to specific receive queues based on user-specified flow patterns.

The series includes foundational support for extended admin queue
commands needed to handle flow rules, the specific adminqueue commands
for flow rule management, and the integration with the DPDK rte_flow
API. The series adds support for flow matching on the following protocols:
IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.

Patch Overview:

1. "net/gve: add flow steering device option" checks for and enables
   the flow steering capability in the device options during
   initialization.
2. "net/gve: introduce extended adminq command" adds infrastructure
   for sending extended admin queue commands. These commands use a
   flexible buffer descriptor format required for flow rule management.
3. "net/gve: add adminq commands for flow steering" implements the
   specific admin queue commands to add and remove flow rules on the
   device, including handling of rule IDs and parameters.
4. "net/gve: add rte flow API integration" exposes the flow steering
   functionality via the DPDK rte_flow API. This includes strict
   pattern validation, rule parsing, and lifecycle management (create,
   destroy, flush). It ensures thread-safe access to the flow subsystem
   and proper resource cleanup during device reset.

Jasper Tran O'Leary (2):
  net/gve: add adminq commands for flow steering
  net/gve: add rte flow API integration

Vee Agarwal (2):
  net/gve: add flow steering device option
  net/gve: introduce extended adminq command

 doc/guides/nics/features/gve.ini       |  12 +
 doc/guides/nics/gve.rst                |  27 +
 doc/guides/rel_notes/release_26_03.rst |   1 +
 drivers/net/gve/base/gve.h             |   3 +-
 drivers/net/gve/base/gve_adminq.c      | 118 ++++-
 drivers/net/gve/base/gve_adminq.h      |  57 +++
 drivers/net/gve/gve_ethdev.c           |  83 +++-
 drivers/net/gve/gve_ethdev.h           |  46 ++
 drivers/net/gve/gve_flow_rule.c        | 658 +++++++++++++++++++++++++
 drivers/net/gve/gve_flow_rule.h        |  65 +++
 drivers/net/gve/meson.build            |   1 +
 11 files changed, 1066 insertions(+), 5 deletions(-)
 create mode 100644 drivers/net/gve/gve_flow_rule.c
 create mode 100644 drivers/net/gve/gve_flow_rule.h

-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH v4 1/4] net/gve: add flow steering device option
  2026-03-04  4:50     ` [PATCH v4 " Jasper Tran O'Leary
@ 2026-03-04  4:50       ` Jasper Tran O'Leary
  2026-03-04  4:50       ` [PATCH v4 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
                         ` (3 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  4:50 UTC (permalink / raw)
  To: stephen; +Cc: dev, Vee Agarwal, Jasper Tran O'Leary, Joshua Washington

From: Vee Agarwal <veethebee@google.com>

Add a new device option to signal to the driver that the device supports
flow steering. This device option also carries the maximum number of
flow steering rules that the device can store.

Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 36 ++++++++++++++++++++++++++++---
 drivers/net/gve/base/gve_adminq.h | 11 ++++++++++
 drivers/net/gve/gve_ethdev.h      |  2 ++
 3 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 6bd98d5..64b9468 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -36,6 +36,7 @@ void gve_parse_device_option(struct gve_priv *priv,
 			     struct gve_device_option_gqi_rda **dev_op_gqi_rda,
 			     struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
 			     struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+			     struct gve_device_option_flow_steering **dev_op_flow_steering,
 			     struct gve_device_option_modify_ring **dev_op_modify_ring,
 			     struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
 {
@@ -109,6 +110,22 @@ void gve_parse_device_option(struct gve_priv *priv,
 		}
 		*dev_op_dqo_rda = RTE_PTR_ADD(option, sizeof(*option));
 		break;
+	case GVE_DEV_OPT_ID_FLOW_STEERING:
+		if (option_length < sizeof(**dev_op_flow_steering) ||
+		    req_feat_mask != GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING) {
+			PMD_DRV_LOG(WARNING, GVE_DEVICE_OPTION_ERROR_FMT,
+				    "Flow Steering", (int)sizeof(**dev_op_flow_steering),
+				    GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING,
+				    option_length, req_feat_mask);
+			break;
+		}
+
+		if (option_length > sizeof(**dev_op_flow_steering)) {
+			PMD_DRV_LOG(WARNING,
+				    GVE_DEVICE_OPTION_TOO_BIG_FMT, "Flow Steering");
+		}
+		*dev_op_flow_steering = RTE_PTR_ADD(option, sizeof(*option));
+		break;
 	case GVE_DEV_OPT_ID_MODIFY_RING:
 		/* Min ring size bound is optional. */
 		if (option_length < (sizeof(**dev_op_modify_ring) -
@@ -167,6 +184,7 @@ gve_process_device_options(struct gve_priv *priv,
 			   struct gve_device_option_gqi_rda **dev_op_gqi_rda,
 			   struct gve_device_option_gqi_qpl **dev_op_gqi_qpl,
 			   struct gve_device_option_dqo_rda **dev_op_dqo_rda,
+			   struct gve_device_option_flow_steering **dev_op_flow_steering,
 			   struct gve_device_option_modify_ring **dev_op_modify_ring,
 			   struct gve_device_option_jumbo_frames **dev_op_jumbo_frames)
 {
@@ -188,8 +206,8 @@ gve_process_device_options(struct gve_priv *priv,
 
 		gve_parse_device_option(priv, dev_opt,
 					dev_op_gqi_rda, dev_op_gqi_qpl,
-					dev_op_dqo_rda, dev_op_modify_ring,
-					dev_op_jumbo_frames);
+					dev_op_dqo_rda, dev_op_flow_steering,
+					dev_op_modify_ring, dev_op_jumbo_frames);
 		dev_opt = next_opt;
 	}
 
@@ -777,9 +795,19 @@ gve_set_max_desc_cnt(struct gve_priv *priv,
 
 static void gve_enable_supported_features(struct gve_priv *priv,
 	u32 supported_features_mask,
+	const struct gve_device_option_flow_steering *dev_op_flow_steering,
 	const struct gve_device_option_modify_ring *dev_op_modify_ring,
 	const struct gve_device_option_jumbo_frames *dev_op_jumbo_frames)
 {
+	if (dev_op_flow_steering &&
+	    (supported_features_mask & GVE_SUP_FLOW_STEERING_MASK) &&
+	    dev_op_flow_steering->max_flow_rules) {
+		priv->max_flow_rules =
+			be32_to_cpu(dev_op_flow_steering->max_flow_rules);
+		PMD_DRV_LOG(INFO,
+			    "FLOW STEERING device option enabled with max rule limit of %u.",
+			    priv->max_flow_rules);
+	}
 	if (dev_op_modify_ring &&
 	    (supported_features_mask & GVE_SUP_MODIFY_RING_MASK)) {
 		PMD_DRV_LOG(INFO, "MODIFY RING device option enabled.");
@@ -802,6 +830,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 {
 	struct gve_device_option_jumbo_frames *dev_op_jumbo_frames = NULL;
 	struct gve_device_option_modify_ring *dev_op_modify_ring = NULL;
+	struct gve_device_option_flow_steering *dev_op_flow_steering = NULL;
 	struct gve_device_option_gqi_rda *dev_op_gqi_rda = NULL;
 	struct gve_device_option_gqi_qpl *dev_op_gqi_qpl = NULL;
 	struct gve_device_option_dqo_rda *dev_op_dqo_rda = NULL;
@@ -829,6 +858,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 
 	err = gve_process_device_options(priv, descriptor, &dev_op_gqi_rda,
 					 &dev_op_gqi_qpl, &dev_op_dqo_rda,
+					 &dev_op_flow_steering,
 					 &dev_op_modify_ring,
 					 &dev_op_jumbo_frames);
 	if (err)
@@ -884,7 +914,7 @@ int gve_adminq_describe_device(struct gve_priv *priv)
 	priv->default_num_queues = be16_to_cpu(descriptor->default_num_queues);
 
 	gve_enable_supported_features(priv, supported_features_mask,
-				      dev_op_modify_ring,
+				      dev_op_flow_steering, dev_op_modify_ring,
 				      dev_op_jumbo_frames);
 
 free_device_descriptor:
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index 6a3d469..e237353 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -117,6 +117,14 @@ struct gve_ring_size_bound {
 
 GVE_CHECK_STRUCT_LEN(4, gve_ring_size_bound);
 
+struct gve_device_option_flow_steering {
+	__be32 supported_features_mask;
+	__be32 reserved;
+	__be32 max_flow_rules;
+};
+
+GVE_CHECK_STRUCT_LEN(12, gve_device_option_flow_steering);
+
 struct gve_device_option_modify_ring {
 	__be32 supported_features_mask;
 	struct gve_ring_size_bound max_ring_size;
@@ -148,6 +156,7 @@ enum gve_dev_opt_id {
 	GVE_DEV_OPT_ID_DQO_RDA = 0x4,
 	GVE_DEV_OPT_ID_MODIFY_RING = 0x6,
 	GVE_DEV_OPT_ID_JUMBO_FRAMES = 0x8,
+	GVE_DEV_OPT_ID_FLOW_STEERING = 0xb,
 };
 
 enum gve_dev_opt_req_feat_mask {
@@ -155,6 +164,7 @@ enum gve_dev_opt_req_feat_mask {
 	GVE_DEV_OPT_REQ_FEAT_MASK_GQI_RDA = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_GQI_QPL = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_DQO_RDA = 0x0,
+	GVE_DEV_OPT_REQ_FEAT_MASK_FLOW_STEERING = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_MODIFY_RING = 0x0,
 	GVE_DEV_OPT_REQ_FEAT_MASK_JUMBO_FRAMES = 0x0,
 };
@@ -162,6 +172,7 @@ enum gve_dev_opt_req_feat_mask {
 enum gve_sup_feature_mask {
 	GVE_SUP_MODIFY_RING_MASK = 1 << 0,
 	GVE_SUP_JUMBO_FRAMES_MASK = 1 << 2,
+	GVE_SUP_FLOW_STEERING_MASK = 1 << 5,
 };
 
 #define GVE_DEV_OPT_LEN_GQI_RAW_ADDRESSING 0x0
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index f7cc781..3a810b6 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -332,6 +332,8 @@ struct gve_priv {
 
 	struct gve_rss_config rss_config;
 	struct gve_ptype_lut *ptype_lut_dqo;
+
+	uint32_t max_flow_rules;
 };
 
 static inline bool
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 2/4] net/gve: introduce extended adminq command
  2026-03-04  4:50     ` [PATCH v4 " Jasper Tran O'Leary
  2026-03-04  4:50       ` [PATCH v4 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
@ 2026-03-04  4:50       ` Jasper Tran O'Leary
  2026-03-04  4:50       ` [PATCH v4 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
                         ` (2 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  4:50 UTC (permalink / raw)
  To: stephen; +Cc: dev, Vee Agarwal, Jasper Tran O'Leary, Joshua Washington

From: Vee Agarwal <veethebee@google.com>

Flow steering adminq commands are too large to fit into a normal adminq
command buffer, which accepts at most 56 bytes. As a result, introduce
extended adminq commands, which permit larger command buffers through
indirection: an extended command points to an inner command buffer
allocated at a specified DMA address. Per the device specification, all
extended commands use inner opcodes larger than 0xFF.

Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 30 ++++++++++++++++++++++++++++++
 drivers/net/gve/base/gve_adminq.h | 16 ++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 64b9468..0cc6d44 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -438,6 +438,8 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv,
 
 	memcpy(cmd, cmd_orig, sizeof(*cmd_orig));
 	opcode = be32_to_cpu(READ_ONCE32(cmd->opcode));
+	if (opcode == GVE_ADMINQ_EXTENDED_COMMAND)
+		opcode = be32_to_cpu(READ_ONCE32(cmd->extended_command.inner_opcode));
 
 	switch (opcode) {
 	case GVE_ADMINQ_DESCRIBE_DEVICE:
@@ -516,6 +518,34 @@ static int gve_adminq_execute_cmd(struct gve_priv *priv,
 	return gve_adminq_kick_and_wait(priv);
 }
 
+static int gve_adminq_execute_extended_cmd(struct gve_priv *priv, u32 opcode,
+					   size_t cmd_size, void *cmd_orig)
+{
+	union gve_adminq_command cmd;
+	struct gve_dma_mem inner_cmd_dma_mem;
+	void *inner_cmd;
+	int err;
+
+	inner_cmd = gve_alloc_dma_mem(&inner_cmd_dma_mem, cmd_size);
+	if (!inner_cmd)
+		return -ENOMEM;
+
+	memcpy(inner_cmd, cmd_orig, cmd_size);
+
+	memset(&cmd, 0, sizeof(cmd));
+	cmd.opcode = cpu_to_be32(GVE_ADMINQ_EXTENDED_COMMAND);
+	cmd.extended_command = (struct gve_adminq_extended_command) {
+		.inner_opcode = cpu_to_be32(opcode),
+		.inner_length = cpu_to_be32(cmd_size),
+		.inner_command_addr = cpu_to_be64(inner_cmd_dma_mem.pa),
+	};
+
+	err = gve_adminq_execute_cmd(priv, &cmd);
+
+	gve_free_dma_mem(&inner_cmd_dma_mem);
+	return err;
+}
+
 /* The device specifies that the management vector can either be the first irq
  * or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
  * the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index e237353..f52658e 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -25,8 +25,15 @@ enum gve_adminq_opcodes {
 	GVE_ADMINQ_REPORT_LINK_SPEED		= 0xD,
 	GVE_ADMINQ_GET_PTYPE_MAP		= 0xE,
 	GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY	= 0xF,
+	/* For commands that are larger than 56 bytes */
+	GVE_ADMINQ_EXTENDED_COMMAND		= 0xFF,
 };
 
+/* The normal adminq command is restricted to be 56 bytes at maximum. For the
+ * longer adminq command, it is wrapped by GVE_ADMINQ_EXTENDED_COMMAND with
+ * inner opcode of gve_adminq_extended_cmd_opcodes specified. The inner command
+ * is written in the dma memory allocated by GVE_ADMINQ_EXTENDED_COMMAND.
+ */
 /* Admin queue status codes */
 enum gve_adminq_statuses {
 	GVE_ADMINQ_COMMAND_UNSET			= 0x0,
@@ -194,6 +201,14 @@ enum gve_driver_capbility {
 #define GVE_DRIVER_CAPABILITY_FLAGS3 0x0
 #define GVE_DRIVER_CAPABILITY_FLAGS4 0x0
 
+struct gve_adminq_extended_command {
+	__be32 inner_opcode;
+	__be32 inner_length;
+	__be64 inner_command_addr;
+};
+
+GVE_CHECK_STRUCT_LEN(16, gve_adminq_extended_command);
+
 struct gve_driver_info {
 	u8 os_type;	/* 0x05 = DPDK */
 	u8 driver_major;
@@ -440,6 +455,7 @@ union gve_adminq_command {
 			struct gve_adminq_get_ptype_map get_ptype_map;
 			struct gve_adminq_verify_driver_compatibility
 				verify_driver_compatibility;
+			struct gve_adminq_extended_command extended_command;
 		};
 	};
 	u8 reserved[64];
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 3/4] net/gve: add adminq commands for flow steering
  2026-03-04  4:50     ` [PATCH v4 " Jasper Tran O'Leary
  2026-03-04  4:50       ` [PATCH v4 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
  2026-03-04  4:50       ` [PATCH v4 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
@ 2026-03-04  4:50       ` Jasper Tran O'Leary
  2026-03-04  4:50       ` [PATCH v4 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
  2026-03-04 15:59       ` [PATCH v4 0/4] net/gve: add flow steering support Stephen Hemminger
  4 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  4:50 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Vee Agarwal, Joshua Washington

Add new adminq commands for the driver to configure flow rules that are
stored in the device. Three sub-commands are supported:
- create: creates a new flow rule with a specific rule_id.
- destroy: deletes an existing flow rule with the specified rule_id.
- flush: clears and deletes all currently active flow rules.

Co-developed-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 drivers/net/gve/base/gve_adminq.c | 52 +++++++++++++++++++++++++++
 drivers/net/gve/base/gve_adminq.h | 30 ++++++++++++++++
 drivers/net/gve/gve_ethdev.h      |  1 +
 drivers/net/gve/gve_flow_rule.h   | 59 +++++++++++++++++++++++++++++++
 4 files changed, 142 insertions(+)
 create mode 100644 drivers/net/gve/gve_flow_rule.h

diff --git a/drivers/net/gve/base/gve_adminq.c b/drivers/net/gve/base/gve_adminq.c
index 0cc6d44..9a94591 100644
--- a/drivers/net/gve/base/gve_adminq.c
+++ b/drivers/net/gve/base/gve_adminq.c
@@ -239,6 +239,7 @@ int gve_adminq_alloc(struct gve_priv *priv)
 	priv->adminq_report_stats_cnt = 0;
 	priv->adminq_report_link_speed_cnt = 0;
 	priv->adminq_get_ptype_map_cnt = 0;
+	priv->adminq_cfg_flow_rule_cnt = 0;
 
 	/* Setup Admin queue with the device */
 	rte_pci_read_config(priv->pci_dev, &pci_rev_id, sizeof(pci_rev_id),
@@ -487,6 +488,9 @@ static int gve_adminq_issue_cmd(struct gve_priv *priv,
 	case GVE_ADMINQ_VERIFY_DRIVER_COMPATIBILITY:
 		priv->adminq_verify_driver_compatibility_cnt++;
 		break;
+	case GVE_ADMINQ_CONFIGURE_FLOW_RULE:
+		priv->adminq_cfg_flow_rule_cnt++;
+		break;
 	default:
 		PMD_DRV_LOG(ERR, "unknown AQ command opcode %d", opcode);
 	}
@@ -546,6 +550,54 @@ static int gve_adminq_execute_extended_cmd(struct gve_priv *priv, u32 opcode,
 	return err;
 }
 
+static int
+gve_adminq_configure_flow_rule(struct gve_priv *priv,
+			       struct gve_adminq_configure_flow_rule *flow_rule_cmd)
+{
+	int err = gve_adminq_execute_extended_cmd(priv,
+			GVE_ADMINQ_CONFIGURE_FLOW_RULE,
+			sizeof(struct gve_adminq_configure_flow_rule),
+			flow_rule_cmd);
+
+	return err;
+}
+
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+			     struct gve_flow_rule_params *rule, u32 loc)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_ADD),
+		.location = cpu_to_be32(loc),
+		.rule = {
+			.flow_type = cpu_to_be16(rule->flow_type),
+			.action = cpu_to_be16(rule->action),
+			.key = rule->key,
+			.mask = rule->mask,
+		},
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_del_flow_rule(struct gve_priv *priv, u32 loc)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_DEL),
+		.location = cpu_to_be32(loc),
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
+int gve_adminq_reset_flow_rules(struct gve_priv *priv)
+{
+	struct gve_adminq_configure_flow_rule flow_rule_cmd = {
+		.opcode = cpu_to_be16(GVE_FLOW_RULE_CFG_RESET),
+	};
+
+	return gve_adminq_configure_flow_rule(priv, &flow_rule_cmd);
+}
+
 /* The device specifies that the management vector can either be the first irq
  * or the last irq. ntfy_blk_msix_base_idx indicates the first irq assigned to
  * the ntfy blks. It if is 0 then the management vector is last, if it is 1 then
diff --git a/drivers/net/gve/base/gve_adminq.h b/drivers/net/gve/base/gve_adminq.h
index f52658e..d8e5e6a 100644
--- a/drivers/net/gve/base/gve_adminq.h
+++ b/drivers/net/gve/base/gve_adminq.h
@@ -7,6 +7,7 @@
 #define _GVE_ADMINQ_H
 
 #include "gve_osdep.h"
+#include "../gve_flow_rule.h"
 
 /* Admin queue opcodes */
 enum gve_adminq_opcodes {
@@ -34,6 +35,10 @@ enum gve_adminq_opcodes {
  * inner opcode of gve_adminq_extended_cmd_opcodes specified. The inner command
  * is written in the dma memory allocated by GVE_ADMINQ_EXTENDED_COMMAND.
  */
+enum gve_adminq_extended_cmd_opcodes {
+	GVE_ADMINQ_CONFIGURE_FLOW_RULE	= 0x101,
+};
+
 /* Admin queue status codes */
 enum gve_adminq_statuses {
 	GVE_ADMINQ_COMMAND_UNSET			= 0x0,
@@ -434,6 +439,26 @@ struct gve_adminq_configure_rss {
 	__be64 indir_addr;
 };
 
+/* Flow rule definition for the admin queue using network byte order (big
+ * endian). This struct represents the hardware wire format and should not be
+ * used outside of admin queue contexts.
+ */
+struct gve_adminq_flow_rule {
+	__be16 flow_type;
+	__be16 action; /* RX queue id */
+	struct gve_flow_spec key;
+	struct gve_flow_spec mask;
+};
+
+struct gve_adminq_configure_flow_rule {
+	__be16 opcode;
+	u8 padding[2];
+	struct gve_adminq_flow_rule rule;
+	__be32 location;
+};
+
+GVE_CHECK_STRUCT_LEN(92, gve_adminq_configure_flow_rule);
+
 union gve_adminq_command {
 	struct {
 		__be32 opcode;
@@ -499,4 +524,9 @@ int gve_adminq_verify_driver_compatibility(struct gve_priv *priv,
 int gve_adminq_configure_rss(struct gve_priv *priv,
 			     struct gve_rss_config *rss_config);
 
+int gve_adminq_add_flow_rule(struct gve_priv *priv,
+			     struct gve_flow_rule_params *rule, u32 loc);
+int gve_adminq_del_flow_rule(struct gve_priv *priv, u32 loc);
+int gve_adminq_reset_flow_rules(struct gve_priv *priv);
+
 #endif /* _GVE_ADMINQ_H */
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index 3a810b6..4e07ca8 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -314,6 +314,7 @@ struct gve_priv {
 	uint32_t adminq_report_link_speed_cnt;
 	uint32_t adminq_get_ptype_map_cnt;
 	uint32_t adminq_verify_driver_compatibility_cnt;
+	uint32_t adminq_cfg_flow_rule_cnt;
 	volatile uint32_t state_flags;
 
 	/* Gvnic device link speed from hypervisor. */
diff --git a/drivers/net/gve/gve_flow_rule.h b/drivers/net/gve/gve_flow_rule.h
new file mode 100644
index 0000000..8c17ddd
--- /dev/null
+++ b/drivers/net/gve/gve_flow_rule.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2026 Google LLC
+ */
+
+#ifndef _GVE_FLOW_RULE_H_
+#define _GVE_FLOW_RULE_H_
+
+#include "base/gve_osdep.h"
+
+enum gve_adminq_flow_rule_cfg_opcode {
+	GVE_FLOW_RULE_CFG_ADD	= 0,
+	GVE_FLOW_RULE_CFG_DEL	= 1,
+	GVE_FLOW_RULE_CFG_RESET	= 2,
+};
+
+enum gve_adminq_flow_type {
+	GVE_FLOW_TYPE_TCPV4,
+	GVE_FLOW_TYPE_UDPV4,
+	GVE_FLOW_TYPE_SCTPV4,
+	GVE_FLOW_TYPE_AHV4,
+	GVE_FLOW_TYPE_ESPV4,
+	GVE_FLOW_TYPE_TCPV6,
+	GVE_FLOW_TYPE_UDPV6,
+	GVE_FLOW_TYPE_SCTPV6,
+	GVE_FLOW_TYPE_AHV6,
+	GVE_FLOW_TYPE_ESPV6,
+};
+
+struct gve_flow_spec {
+	__be32 src_ip[4];
+	__be32 dst_ip[4];
+	union {
+		struct {
+			__be16 src_port;
+			__be16 dst_port;
+		};
+		__be32 spi;
+	};
+	union {
+		u8 tos;
+		u8 tclass;
+	};
+};
+
+/* Flow rule parameters using mixed endianness.
+ * - flow_type and action are guest endian.
+ * - key and mask are in network byte order (big endian), matching rte_flow.
+ * This struct is used by the driver when validating and creating flow rules;
+ * guest endian fields are only converted to network byte order within admin
+ * queue functions.
+ */
+struct gve_flow_rule_params {
+	u16 flow_type;
+	u16 action; /* RX queue id */
+	struct gve_flow_spec key;
+	struct gve_flow_spec mask;
+};
+
+#endif /* _GVE_FLOW_RULE_H_ */
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH v4 4/4] net/gve: add rte flow API integration
  2026-03-04  4:50     ` [PATCH v4 " Jasper Tran O'Leary
                         ` (2 preceding siblings ...)
  2026-03-04  4:50       ` [PATCH v4 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
@ 2026-03-04  4:50       ` Jasper Tran O'Leary
  2026-03-04 15:59       ` [PATCH v4 0/4] net/gve: add flow steering support Stephen Hemminger
  4 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04  4:50 UTC (permalink / raw)
  To: stephen; +Cc: dev, Jasper Tran O'Leary, Vee Agarwal, Joshua Washington

Implement driver callbacks for the following rte flow operations:
create, destroy, and flush. This change enables receive flow steering
(RFS) based on n-tuple flow rules in the gve driver.

The implementation supports matching ingress IPv4/IPv6 traffic combined
with TCP, UDP, SCTP, ESP, or AH protocols. Supported fields for
matching include IP source/destination addresses, L4 source/destination
ports (for TCP/UDP/SCTP), and SPI (for ESP/AH). The only supported
action is RTE_FLOW_ACTION_TYPE_QUEUE, which steers matching packets to
a specified rx queue.
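As a usage illustration (not part of this patch), such a rule can be driven from the testpmd flow command line; the addresses, ports, and queue index below are arbitrary examples:

```shell
# Steer the given IPv4/TCP 5-tuple on port 0 to Rx queue 3. Masking is
# limited to full matches, so exact ("is") matches are used throughout.
flow create 0 ingress pattern eth / ipv4 src is 192.168.0.1 dst is 192.168.0.2 / tcp src is 1000 dst is 2000 / end actions queue index 3 / end

# Remove the rule just created (rule id 0), or flush all rules on port 0.
flow destroy 0 rule 0
flow flush 0
```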

Co-developed-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Vee Agarwal <veethebee@google.com>
Signed-off-by: Jasper Tran O'Leary <jtranoleary@google.com>
Reviewed-by: Joshua Washington <joshwash@google.com>
---
 doc/guides/nics/features/gve.ini       |  12 +
 doc/guides/nics/gve.rst                |  27 +
 doc/guides/rel_notes/release_26_03.rst |   1 +
 drivers/net/gve/base/gve.h             |   3 +-
 drivers/net/gve/gve_ethdev.c           |  83 +++-
 drivers/net/gve/gve_ethdev.h           |  43 ++
 drivers/net/gve/gve_flow_rule.c        | 658 +++++++++++++++++++++++++
 drivers/net/gve/gve_flow_rule.h        |   6 +
 drivers/net/gve/meson.build            |   1 +
 9 files changed, 832 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/gve/gve_flow_rule.c

diff --git a/doc/guides/nics/features/gve.ini b/doc/guides/nics/features/gve.ini
index ed040a0..89c97fd 100644
--- a/doc/guides/nics/features/gve.ini
+++ b/doc/guides/nics/features/gve.ini
@@ -19,3 +19,15 @@ Linux                = Y
 x86-32               = Y
 x86-64               = Y
 Usage doc            = Y
+
+[rte_flow items]
+ah                   = Y
+esp                  = Y
+ipv4                 = Y
+ipv6                 = Y
+sctp                 = Y
+tcp                  = Y
+udp                  = Y
+
+[rte_flow actions]
+queue                = Y
diff --git a/doc/guides/nics/gve.rst b/doc/guides/nics/gve.rst
index 6b4d1f7..8367ca9 100644
--- a/doc/guides/nics/gve.rst
+++ b/doc/guides/nics/gve.rst
@@ -103,6 +103,33 @@ the redirection table will be available for querying upon initial hash configura
 When performing redirection table updates,
 it is possible to update individual table entries.
 
+Flow Steering
+^^^^^^^^^^^^^
+
+The driver supports receive flow steering (RFS) via the standard ``rte_flow``
+API. This allows applications to steer traffic to specific queues based on
+5-tuple matching. 3-tuple matching may be supported in future releases.
+
+**Supported Patterns**
+
+L3 Protocols
+  IPv4/IPv6 source and destination addresses.
+L4 Protocols
+  TCP/UDP/SCTP source and destination ports.
+Security Protocols
+  ESP/AH SPI.
+
+**Supported Actions**
+
+- ``RTE_FLOW_ACTION_TYPE_QUEUE``: Steer packets to a specific Rx queue.
+
+**Limitations**
+
+- Flow steering operations are only supported in the primary process.
+- Only ingress flow rules are allowed.
+- Flow priorities are not supported (must be 0).
+- Masking is limited to full matches i.e. 0x00...0 or 0xFF...F.
+
 Application-Initiated Reset
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^

 The driver allows an application to reset the gVNIC device.
diff --git a/doc/guides/rel_notes/release_26_03.rst b/doc/guides/rel_notes/release_26_03.rst
index 1855d90..b643809 100644
--- a/doc/guides/rel_notes/release_26_03.rst
+++ b/doc/guides/rel_notes/release_26_03.rst
@@ -78,6 +78,7 @@ New Features
 * **Updated Google Virtual Ethernet (gve) driver.**
 
   * Added application-initiated device reset.
+  * Added support for receive flow steering.
 
 * **Updated Intel iavf driver.**
 
diff --git a/drivers/net/gve/base/gve.h b/drivers/net/gve/base/gve.h
index 99514cb..18363fa 100644
--- a/drivers/net/gve/base/gve.h
+++ b/drivers/net/gve/base/gve.h
@@ -50,7 +50,8 @@ enum gve_state_flags_bit {
 	GVE_PRIV_FLAGS_ADMIN_QUEUE_OK		= 1,
 	GVE_PRIV_FLAGS_DEVICE_RESOURCES_OK	= 2,
 	GVE_PRIV_FLAGS_DEVICE_RINGS_OK		= 3,
-	GVE_PRIV_FLAGS_NAPI_ENABLED		= 4,
+	GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK	= 4,
+	GVE_PRIV_FLAGS_NAPI_ENABLED		= 5,
 };
 
 enum gve_rss_hash_algorithm {
diff --git a/drivers/net/gve/gve_ethdev.c b/drivers/net/gve/gve_ethdev.c
index 5912fec..6ce3ef3 100644
--- a/drivers/net/gve/gve_ethdev.c
+++ b/drivers/net/gve/gve_ethdev.c
@@ -510,6 +510,49 @@ gve_free_ptype_lut_dqo(struct gve_priv *priv)
 	}
 }
 
+static int
+gve_setup_flow_subsystem(struct gve_priv *priv)
+{
+	int err;
+
+	priv->flow_rule_bmp_size =
+			rte_bitmap_get_memory_footprint(priv->max_flow_rules);
+	priv->avail_flow_rule_bmp_mem = rte_zmalloc("gve_flow_rule_bmp",
+			priv->flow_rule_bmp_size, 0);
+	if (!priv->avail_flow_rule_bmp_mem) {
+		PMD_DRV_LOG(ERR, "Failed to alloc bitmap for flow rules.");
+		err = -ENOMEM;
+		goto free_flow_rule_bmp;
+	}
+
+	err = gve_flow_init_bmp(priv);
+	if (err) {
+		PMD_DRV_LOG(ERR, "Failed to initialize flow rule bitmap.");
+		goto free_flow_rule_bmp;
+	}
+
+	TAILQ_INIT(&priv->active_flows);
+	gve_set_flow_subsystem_ok(priv);
+
+	return 0;
+
+free_flow_rule_bmp:
+	gve_flow_free_bmp(priv);
+	return err;
+}
+
+static void
+gve_teardown_flow_subsystem(struct gve_priv *priv)
+{
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	gve_clear_flow_subsystem_ok(priv);
+	gve_flow_free_bmp(priv);
+	gve_free_flow_rules(priv);
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+}
+
 static void
 gve_teardown_device_resources(struct gve_priv *priv)
 {
@@ -519,7 +562,9 @@ gve_teardown_device_resources(struct gve_priv *priv)
 	if (gve_get_device_resources_ok(priv)) {
 		err = gve_adminq_deconfigure_device_resources(priv);
 		if (err)
-			PMD_DRV_LOG(ERR, "Could not deconfigure device resources: err=%d", err);
+			PMD_DRV_LOG(ERR,
+				"Could not deconfigure device resources: err=%d",
+				err);
 	}
 
 	gve_free_ptype_lut_dqo(priv);
@@ -543,6 +588,11 @@ gve_dev_close(struct rte_eth_dev *dev)
 			PMD_DRV_LOG(ERR, "Failed to stop dev.");
 	}
 
+	if (gve_get_flow_subsystem_ok(priv))
+		gve_teardown_flow_subsystem(priv);
+
+	pthread_mutex_destroy(&priv->flow_rule_lock);
+
 	gve_free_queues(dev);
 	gve_teardown_device_resources(priv);
 	gve_adminq_free(priv);
@@ -566,6 +616,9 @@ gve_dev_reset(struct rte_eth_dev *dev)
 	}
 
 	/* Tear down all device resources before re-initializing. */
+	if (gve_get_flow_subsystem_ok(priv))
+		gve_teardown_flow_subsystem(priv);
+
 	gve_free_queues(dev);
 	gve_teardown_device_resources(priv);
 	gve_adminq_free(priv);
@@ -1094,6 +1147,18 @@ gve_rss_reta_query(struct rte_eth_dev *dev,
 	return 0;
 }
 
+static int
+gve_flow_ops_get(struct rte_eth_dev *dev, const struct rte_flow_ops **ops)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+
+	if (!gve_get_flow_subsystem_ok(priv))
+		return -ENOTSUP;
+
+	*ops = &gve_flow_ops;
+	return 0;
+}
+
 static const struct eth_dev_ops gve_eth_dev_ops = {
 	.dev_configure        = gve_dev_configure,
 	.dev_start            = gve_dev_start,
@@ -1109,6 +1174,7 @@ static const struct eth_dev_ops gve_eth_dev_ops = {
 	.tx_queue_start       = gve_tx_queue_start,
 	.rx_queue_stop        = gve_rx_queue_stop,
 	.tx_queue_stop        = gve_tx_queue_stop,
+	.flow_ops_get         = gve_flow_ops_get,
 	.link_update          = gve_link_update,
 	.stats_get            = gve_dev_stats_get,
 	.stats_reset          = gve_dev_stats_reset,
@@ -1136,6 +1202,7 @@ static const struct eth_dev_ops gve_eth_dev_ops_dqo = {
 	.tx_queue_start       = gve_tx_queue_start_dqo,
 	.rx_queue_stop        = gve_rx_queue_stop_dqo,
 	.tx_queue_stop        = gve_tx_queue_stop_dqo,
+	.flow_ops_get         = gve_flow_ops_get,
 	.link_update          = gve_link_update,
 	.stats_get            = gve_dev_stats_get,
 	.stats_reset          = gve_dev_stats_reset,
@@ -1303,6 +1370,14 @@ gve_init_priv(struct gve_priv *priv, bool skip_describe_device)
 		    priv->max_nb_txq, priv->max_nb_rxq);
 
 setup_device:
+	if (priv->max_flow_rules) {
+		err = gve_setup_flow_subsystem(priv);
+		if (err)
+			PMD_DRV_LOG(WARNING,
+				    "Failed to set up flow subsystem: err=%d, flow steering will be disabled.",
+				    err);
+	}
+
 	err = gve_setup_device_resources(priv);
 	if (!err)
 		return 0;
@@ -1318,6 +1393,7 @@ gve_dev_init(struct rte_eth_dev *eth_dev)
 	int max_tx_queues, max_rx_queues;
 	struct rte_pci_device *pci_dev;
 	struct gve_registers *reg_bar;
+	pthread_mutexattr_t mutexattr;
 	rte_be32_t *db_bar;
 	int err;
 
@@ -1377,6 +1453,11 @@ gve_dev_init(struct rte_eth_dev *eth_dev)
 
 	eth_dev->data->mac_addrs = &priv->dev_addr;
 
+	pthread_mutexattr_init(&mutexattr);
+	pthread_mutexattr_setpshared(&mutexattr, PTHREAD_PROCESS_SHARED);
+	pthread_mutex_init(&priv->flow_rule_lock, &mutexattr);
+	pthread_mutexattr_destroy(&mutexattr);
+
 	return 0;
 }
 
diff --git a/drivers/net/gve/gve_ethdev.h b/drivers/net/gve/gve_ethdev.h
index 4e07ca8..2d570d0 100644
--- a/drivers/net/gve/gve_ethdev.h
+++ b/drivers/net/gve/gve_ethdev.h
@@ -9,6 +9,8 @@
 #include <ethdev_pci.h>
 #include <rte_ether.h>
 #include <rte_pci.h>
+#include <pthread.h>
+#include <rte_bitmap.h>
 
 #include "base/gve.h"
 
@@ -252,6 +254,13 @@ struct gve_rx_queue {
 	uint8_t is_gqi_qpl;
 };
 
+struct gve_flow {
+	uint32_t rule_id;
+	TAILQ_ENTRY(gve_flow) list_handle;
+};
+
+extern const struct rte_flow_ops gve_flow_ops;
+
 struct gve_priv {
 	struct gve_irq_db *irq_dbs; /* array of num_ntfy_blks */
 	const struct rte_memzone *irq_dbs_mz;
@@ -334,7 +343,13 @@ struct gve_priv {
 	struct gve_rss_config rss_config;
 	struct gve_ptype_lut *ptype_lut_dqo;
 
+	/* Flow rule management */
 	uint32_t max_flow_rules;
+	uint32_t flow_rule_bmp_size;
+	struct rte_bitmap *avail_flow_rule_bmp; /* Tracks available rule IDs (1 = available) */
+	void *avail_flow_rule_bmp_mem; /* Backing memory for the bitmap */
+	pthread_mutex_t flow_rule_lock; /* Lock for bitmap and tailq access */
+	TAILQ_HEAD(, gve_flow) active_flows;
 };
 
 static inline bool
@@ -407,6 +422,34 @@ gve_clear_device_rings_ok(struct gve_priv *priv)
 				&priv->state_flags);
 }
 
+static inline bool
+gve_get_flow_subsystem_ok(struct gve_priv *priv)
+{
+	bool ret;
+
+	ret = !!rte_bit_relaxed_get32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+				      &priv->state_flags);
+	rte_atomic_thread_fence(rte_memory_order_acquire);
+
+	return ret;
+}
+
+static inline void
+gve_set_flow_subsystem_ok(struct gve_priv *priv)
+{
+	rte_atomic_thread_fence(rte_memory_order_release);
+	rte_bit_relaxed_set32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+			      &priv->state_flags);
+}
+
+static inline void
+gve_clear_flow_subsystem_ok(struct gve_priv *priv)
+{
+	rte_atomic_thread_fence(rte_memory_order_release);
+	rte_bit_relaxed_clear32(GVE_PRIV_FLAGS_FLOW_SUBSYSTEM_OK,
+				&priv->state_flags);
+}
+
 int
 gve_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_id, uint16_t nb_desc,
 		   unsigned int socket_id, const struct rte_eth_rxconf *conf,
diff --git a/drivers/net/gve/gve_flow_rule.c b/drivers/net/gve/gve_flow_rule.c
new file mode 100644
index 0000000..1266e19
--- /dev/null
+++ b/drivers/net/gve/gve_flow_rule.c
@@ -0,0 +1,658 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(C) 2026 Google LLC
+ */
+
+#include <rte_flow.h>
+#include <rte_flow_driver.h>
+#include "base/gve_adminq.h"
+#include "gve_ethdev.h"
+
+static int
+gve_validate_flow_attr(const struct rte_flow_attr *attr,
+		       struct rte_flow_error *error)
+{
+	if (attr == NULL) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR, NULL,
+				"Invalid flow attribute");
+		return -EINVAL;
+	}
+	if (attr->egress || attr->transfer) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR, attr,
+				"Only ingress is supported");
+		return -EINVAL;
+	}
+	if (!attr->ingress) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR_INGRESS, attr,
+				"Ingress attribute must be set");
+		return -EINVAL;
+	}
+	if (attr->priority != 0) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY, attr,
+				"Priority levels are not supported");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static void
+gve_parse_ipv4(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ipv4 *spec = item->spec;
+		const struct rte_flow_item_ipv4 *mask =
+			item->mask ? item->mask : &rte_flow_item_ipv4_mask;
+
+		rule->key.src_ip[0] = spec->hdr.src_addr;
+		rule->key.dst_ip[0] = spec->hdr.dst_addr;
+		rule->mask.src_ip[0] = mask->hdr.src_addr;
+		rule->mask.dst_ip[0] = mask->hdr.dst_addr;
+	}
+}
+
+static void
+gve_parse_ipv6(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ipv6 *spec = item->spec;
+		const struct rte_flow_item_ipv6 *mask =
+			item->mask ? item->mask : &rte_flow_item_ipv6_mask;
+		const __be32 *src_ip = (const __be32 *)&spec->hdr.src_addr;
+		const __be32 *src_mask = (const __be32 *)&mask->hdr.src_addr;
+		const __be32 *dst_ip = (const __be32 *)&spec->hdr.dst_addr;
+		const __be32 *dst_mask = (const __be32 *)&mask->hdr.dst_addr;
+		int i;
+
+		/*
+		 * The device expects IPv6 addresses as an array of 4 32-bit words
+		 * in reverse word order (the MSB word at index 3 and the LSB word
+		 * at index 0). We must reverse the DPDK network byte order array.
+		 */
+		for (i = 0; i < 4; i++) {
+			rule->key.src_ip[3 - i] = src_ip[i];
+			rule->key.dst_ip[3 - i] = dst_ip[i];
+			rule->mask.src_ip[3 - i] = src_mask[i];
+			rule->mask.dst_ip[3 - i] = dst_mask[i];
+		}
+	}
+}
+
+static void
+gve_parse_udp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_udp *spec = item->spec;
+		const struct rte_flow_item_udp *mask =
+			item->mask ? item->mask : &rte_flow_item_udp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_tcp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_tcp *spec = item->spec;
+		const struct rte_flow_item_tcp *mask =
+			item->mask ? item->mask : &rte_flow_item_tcp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_sctp(const struct rte_flow_item *item,
+	       struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_sctp *spec = item->spec;
+		const struct rte_flow_item_sctp *mask =
+			item->mask ? item->mask : &rte_flow_item_sctp_mask;
+
+		rule->key.src_port = spec->hdr.src_port;
+		rule->key.dst_port = spec->hdr.dst_port;
+		rule->mask.src_port = mask->hdr.src_port;
+		rule->mask.dst_port = mask->hdr.dst_port;
+	}
+}
+
+static void
+gve_parse_esp(const struct rte_flow_item *item,
+	      struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_esp *spec = item->spec;
+		const struct rte_flow_item_esp *mask =
+			item->mask ? item->mask : &rte_flow_item_esp_mask;
+
+		rule->key.spi = spec->hdr.spi;
+		rule->mask.spi = mask->hdr.spi;
+	}
+}
+
+static void
+gve_parse_ah(const struct rte_flow_item *item, struct gve_flow_rule_params *rule)
+{
+	if (item->spec) {
+		const struct rte_flow_item_ah *spec = item->spec;
+		const struct rte_flow_item_ah *mask =
+			item->mask ? item->mask : &rte_flow_item_ah_mask;
+
+		rule->key.spi = spec->spi;
+		rule->mask.spi = mask->spi;
+	}
+}
+
+static int
+gve_validate_and_parse_flow_pattern(const struct rte_flow_item pattern[],
+				    struct rte_flow_error *error,
+				    struct gve_flow_rule_params *rule)
+{
+	const struct rte_flow_item *item = pattern;
+	enum rte_flow_item_type l3_type = RTE_FLOW_ITEM_TYPE_VOID;
+	enum rte_flow_item_type l4_type = RTE_FLOW_ITEM_TYPE_VOID;
+
+	if (pattern == NULL) {
+		rte_flow_error_set(error, EINVAL,
+				RTE_FLOW_ERROR_TYPE_ITEM_NUM, NULL,
+				"Invalid flow pattern");
+		return -EINVAL;
+	}
+
+	for (; item->type != RTE_FLOW_ITEM_TYPE_END; item++) {
+		if (item->last) {
+			/* Last and range are not supported as match criteria. */
+			rte_flow_error_set(error, EINVAL,
+					   RTE_FLOW_ERROR_TYPE_ITEM,
+					   item,
+					   "No support for range");
+			return -EINVAL;
+		}
+		switch (item->type) {
+		case RTE_FLOW_ITEM_TYPE_VOID:
+			break;
+		case RTE_FLOW_ITEM_TYPE_IPV4:
+			if (l3_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L3 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ipv4(item, rule);
+			l3_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_IPV6:
+			if (l3_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L3 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ipv6(item, rule);
+			l3_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_udp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_tcp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_sctp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_esp(item, rule);
+			l4_type = item->type;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			if (l4_type != RTE_FLOW_ITEM_TYPE_VOID) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ITEM,
+						   item,
+						   "Multiple L4 items not supported");
+				return -EINVAL;
+			}
+			gve_parse_ah(item, rule);
+			l4_type = item->type;
+			break;
+		default:
+			rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ITEM, item,
+				   "Unsupported flow pattern item type");
+			return -EINVAL;
+		}
+	}
+
+	switch (l3_type) {
+	case RTE_FLOW_ITEM_TYPE_IPV4:
+		switch (l4_type) {
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			rule->flow_type = GVE_FLOW_TYPE_TCPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			rule->flow_type = GVE_FLOW_TYPE_UDPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			rule->flow_type = GVE_FLOW_TYPE_SCTPV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			rule->flow_type = GVE_FLOW_TYPE_AHV4;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			rule->flow_type = GVE_FLOW_TYPE_ESPV4;
+			break;
+		default:
+			goto unsupported_flow;
+		}
+		break;
+	case RTE_FLOW_ITEM_TYPE_IPV6:
+		switch (l4_type) {
+		case RTE_FLOW_ITEM_TYPE_TCP:
+			rule->flow_type = GVE_FLOW_TYPE_TCPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_UDP:
+			rule->flow_type = GVE_FLOW_TYPE_UDPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_SCTP:
+			rule->flow_type = GVE_FLOW_TYPE_SCTPV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_AH:
+			rule->flow_type = GVE_FLOW_TYPE_AHV6;
+			break;
+		case RTE_FLOW_ITEM_TYPE_ESP:
+			rule->flow_type = GVE_FLOW_TYPE_ESPV6;
+			break;
+		default:
+			goto unsupported_flow;
+		}
+		break;
+	default:
+		goto unsupported_flow;
+	}
+
+	return 0;
+
+unsupported_flow:
+	rte_flow_error_set(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM,
+			   NULL, "Unsupported L3/L4 combination");
+	return -EINVAL;
+}
+
+static int
+gve_validate_and_parse_flow_actions(struct rte_eth_dev *dev,
+				    const struct rte_flow_action actions[],
+				    struct rte_flow_error *error,
+				    struct gve_flow_rule_params *rule)
+{
+	const struct rte_flow_action_queue *action_queue;
+	const struct rte_flow_action *action = actions;
+	int num_queue_actions = 0;
+
+	if (actions == NULL) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM, NULL,
+				   "Invalid flow actions");
+		return -EINVAL;
+	}
+
+	while (action->type != RTE_FLOW_ACTION_TYPE_END) {
+		switch (action->type) {
+		case RTE_FLOW_ACTION_TYPE_VOID:
+			break;
+		case RTE_FLOW_ACTION_TYPE_QUEUE:
+			if (action->conf == NULL) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+						   action,
+						   "QUEUE action config cannot be NULL.");
+				return -EINVAL;
+			}
+
+			action_queue = action->conf;
+			if (action_queue->index >= dev->data->nb_rx_queues) {
+				rte_flow_error_set(error, EINVAL,
+						   RTE_FLOW_ERROR_TYPE_ACTION_CONF,
+						   action, "Invalid Queue ID");
+				return -EINVAL;
+			}
+
+			rule->action = action_queue->index;
+			num_queue_actions++;
+			break;
+		default:
+			rte_flow_error_set(error, ENOTSUP,
+					   RTE_FLOW_ERROR_TYPE_ACTION,
+					   action,
+					   "Unsupported action. Only QUEUE is permitted.");
+			return -ENOTSUP;
+		}
+		action++;
+	}
+
+	if (num_queue_actions == 0) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM,
+				   NULL, "A QUEUE action is required.");
+		return -EINVAL;
+	}
+
+	if (num_queue_actions > 1) {
+		rte_flow_error_set(error, EINVAL,
+				   RTE_FLOW_ERROR_TYPE_ACTION_NUM,
+				   NULL, "Only a single QUEUE action is allowed.");
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int
+gve_validate_and_parse_flow(struct rte_eth_dev *dev,
+			    const struct rte_flow_attr *attr,
+			    const struct rte_flow_item pattern[],
+			    const struct rte_flow_action actions[],
+			    struct rte_flow_error *error,
+			    struct gve_flow_rule_params *rule)
+{
+	int err;
+
+	err = gve_validate_flow_attr(attr, error);
+	if (err)
+		return err;
+
+	err = gve_validate_and_parse_flow_pattern(pattern, error, rule);
+	if (err)
+		return err;
+
+	err = gve_validate_and_parse_flow_actions(dev, actions, error, rule);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+int
+gve_flow_init_bmp(struct gve_priv *priv)
+{
+	priv->avail_flow_rule_bmp = rte_bitmap_init_with_all_set(priv->max_flow_rules,
+			priv->avail_flow_rule_bmp_mem, priv->flow_rule_bmp_size);
+	if (priv->avail_flow_rule_bmp == NULL) {
+		PMD_DRV_LOG(ERR, "Flow subsystem failed: cannot init bitmap.");
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+void
+gve_flow_free_bmp(struct gve_priv *priv)
+{
+	rte_free(priv->avail_flow_rule_bmp_mem);
+	priv->avail_flow_rule_bmp_mem = NULL;
+	priv->avail_flow_rule_bmp = NULL;
+}
+
+/*
+ * The caller must acquire the flow rule lock before calling this function.
+ */
+int
+gve_free_flow_rules(struct gve_priv *priv)
+{
+	struct gve_flow *flow;
+	int err = 0;
+
+	if (!TAILQ_EMPTY(&priv->active_flows)) {
+		err = gve_adminq_reset_flow_rules(priv);
+		if (err) {
+			PMD_DRV_LOG(ERR,
+				"Failed to reset flow rules, internal device err=%d",
+				err);
+		}
+
+		/* Free flows even if AQ fails to avoid leaking memory. */
+		while (!TAILQ_EMPTY(&priv->active_flows)) {
+			flow = TAILQ_FIRST(&priv->active_flows);
+			TAILQ_REMOVE(&priv->active_flows, flow, list_handle);
+			free(flow);
+		}
+	}
+
+	return err;
+}
+
+static struct rte_flow *
+gve_create_flow_rule(struct rte_eth_dev *dev,
+		     const struct rte_flow_attr *attr,
+		     const struct rte_flow_item pattern[],
+		     const struct rte_flow_action actions[],
+		     struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	struct gve_flow_rule_params rule = {0};
+	uint64_t slab_bits = 0;
+	uint32_t slab_idx = 0;
+	struct gve_flow *flow;
+	int err;
+
+	err = gve_validate_and_parse_flow(dev, attr, pattern, actions, error,
+					  &rule);
+	if (err)
+		return NULL;
+
+	flow = calloc(1, sizeof(struct gve_flow));
+	if (flow == NULL) {
+		rte_flow_error_set(error, ENOMEM,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to allocate memory for flow rule.");
+		return NULL;
+	}
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to create flow, flow subsystem not initialized.");
+		goto free_flow_and_unlock;
+	}
+
+	/* Try to allocate a new rule ID from the bitmap. */
+	if (rte_bitmap_scan(priv->avail_flow_rule_bmp, &slab_idx,
+			&slab_bits) == 1) {
+		flow->rule_id = slab_idx + rte_ctz64(slab_bits);
+		rte_bitmap_clear(priv->avail_flow_rule_bmp, flow->rule_id);
+	} else {
+		rte_flow_error_set(error, ENOMEM,
+				RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				"Failed to create flow, could not allocate a new rule ID.");
+		goto free_flow_and_unlock;
+	}
+
+	err = gve_adminq_add_flow_rule(priv, &rule, flow->rule_id);
+	if (err) {
+		rte_bitmap_set(priv->avail_flow_rule_bmp, flow->rule_id);
+		rte_flow_error_set(error, -err,
+				   RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+				   "Failed to create flow rule, internal device error.");
+		goto free_flow_and_unlock;
+	}
+
+	TAILQ_INSERT_TAIL(&priv->active_flows, flow, list_handle);
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+
+	return (struct rte_flow *)flow;
+
+free_flow_and_unlock:
+	free(flow);
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return NULL;
+}
+
+static int
+gve_destroy_flow_rule(struct rte_eth_dev *dev, struct rte_flow *flow_handle,
+		      struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	struct gve_flow *flow;
+	bool flow_rule_active;
+	int err;
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, flow subsystem not initialized.");
+		err = -ENOTSUP;
+		goto unlock;
+	}
+
+	flow = (struct gve_flow *)flow_handle;
+
+	if (flow == NULL) {
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, invalid flow provided.");
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	if (flow->rule_id >= priv->max_flow_rules) {
+		PMD_DRV_LOG(ERR,
+			"Cannot destroy flow rule with invalid ID %d.",
+			flow->rule_id);
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, rule ID is invalid.");
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	flow_rule_active = !rte_bitmap_get(priv->avail_flow_rule_bmp,
+					   flow->rule_id);
+
+	if (!flow_rule_active) {
+		rte_flow_error_set(error, EINVAL,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, handle not found in active list.");
+		err = -EINVAL;
+		goto unlock;
+	}
+
+	err = gve_adminq_del_flow_rule(priv, flow->rule_id);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to destroy flow, internal device error.");
+		goto unlock;
+	}
+
+	rte_bitmap_set(priv->avail_flow_rule_bmp, flow->rule_id);
+	TAILQ_REMOVE(&priv->active_flows, flow, list_handle);
+	free(flow);
+
+	err = 0;
+
+unlock:
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return err;
+}
+
+static int
+gve_flush_flow_rules(struct rte_eth_dev *dev, struct rte_flow_error *error)
+{
+	struct gve_priv *priv = dev->data->dev_private;
+	int err;
+
+	pthread_mutex_lock(&priv->flow_rule_lock);
+
+	if (!gve_get_flow_subsystem_ok(priv)) {
+		rte_flow_error_set(error, ENOTSUP,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to flush rules, flow subsystem not initialized.");
+		err = -ENOTSUP;
+		goto unlock;
+	}
+
+	err = gve_free_flow_rules(priv);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to flush rules due to internal device error, disabling flow subsystem.");
+		goto disable_and_free;
+	}
+
+	err = gve_flow_init_bmp(priv);
+	if (err) {
+		rte_flow_error_set(error, -err,
+			RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+			"Failed to re-initialize rule ID bitmap, disabling flow subsystem.");
+		goto disable_and_free;
+	}
+
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+
+	return 0;
+
+disable_and_free:
+	gve_clear_flow_subsystem_ok(priv);
+	gve_flow_free_bmp(priv);
+unlock:
+	pthread_mutex_unlock(&priv->flow_rule_lock);
+	return err;
+}
+
+const struct rte_flow_ops gve_flow_ops = {
+	.create = gve_create_flow_rule,
+	.destroy = gve_destroy_flow_rule,
+	.flush = gve_flush_flow_rules,
+};
diff --git a/drivers/net/gve/gve_flow_rule.h b/drivers/net/gve/gve_flow_rule.h
index 8c17ddd..d597a6c 100644
--- a/drivers/net/gve/gve_flow_rule.h
+++ b/drivers/net/gve/gve_flow_rule.h
@@ -56,4 +56,10 @@ struct gve_flow_rule_params {
 	struct gve_flow_spec mask;
 };
 
+struct gve_priv;
+
+int gve_flow_init_bmp(struct gve_priv *priv);
+void gve_flow_free_bmp(struct gve_priv *priv);
+int gve_free_flow_rules(struct gve_priv *priv);
+
 #endif /* _GVE_FLOW_RULE_H_ */
diff --git a/drivers/net/gve/meson.build b/drivers/net/gve/meson.build
index c6a9f36..7074988 100644
--- a/drivers/net/gve/meson.build
+++ b/drivers/net/gve/meson.build
@@ -16,5 +16,6 @@ sources = files(
         'gve_ethdev.c',
         'gve_version.c',
         'gve_rss.c',
+        'gve_flow_rule.c',
 )
 includes += include_directories('base')
-- 
2.53.0.473.g4a7958ca14-goog


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 0/4] net/gve: add flow steering support
  2026-03-04  4:50     ` [PATCH v4 " Jasper Tran O'Leary
                         ` (3 preceding siblings ...)
  2026-03-04  4:50       ` [PATCH v4 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
@ 2026-03-04 15:59       ` Stephen Hemminger
  2026-03-04 22:43         ` Jasper Tran O'Leary
  4 siblings, 1 reply; 27+ messages in thread
From: Stephen Hemminger @ 2026-03-04 15:59 UTC (permalink / raw)
  To: Jasper Tran O'Leary; +Cc: dev, Joshua Washington

On Wed,  4 Mar 2026 04:50:29 +0000
"Jasper Tran O'Leary" <jtranoleary@google.com> wrote:

> This patch series adds flow steering support to the Google Virtual
> Ethernet (gve) driver. This functionality allows traffic to be directed
> to specific receive queues based on user-specified flow patterns.
> 
> The series includes foundational support for extended admin queue
> commands needed to handle flow rules, the specific adminqueue commands
> for flow rule management, and the integration with the DPDK rte_flow
> API. The series adds support flow matching on the following protocols:
> IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.
> 
> Patch Overview:
> 
> 1. "net/gve: add flow steering device option" checks for and enables
>    the flow steering capability in the device options during
>    initialization.
> 2. "net/gve: introduce extended adminq command" adds infrastructure
>    for sending extended admin queue commands. These commands use a
>    flexible buffer descriptor format required for flow rule management.
> 3. "net/gve: add adminq commands for flow steering" implements the
>    specific admin queue commands to add and remove flow rules on the
>    device, including handling of rule IDs and parameters.
> 4. "net/gve: add rte flow API integration" exposes the flow steering
>    functionality via the DPDK rte_flow API. This includes strict
>    pattern validation, rule parsing, and lifecycle management (create,
>    destroy, flush). It ensures thread-safe access to the flow subsystem
>    and proper resource cleanup during device reset.
> 
> Jasper Tran O'Leary (2):
>   net/gve: add adminq commands for flow steering
>   net/gve: add rte flow API integration
> 
> Vee Agarwal (2):
>   net/gve: add flow steering device option
>   net/gve: introduce extended adminq command
> 
>  doc/guides/nics/features/gve.ini       |  12 +
>  doc/guides/nics/gve.rst                |  27 +
>  doc/guides/rel_notes/release_26_03.rst |   1 +
>  drivers/net/gve/base/gve.h             |   3 +-
>  drivers/net/gve/base/gve_adminq.c      | 118 ++++-
>  drivers/net/gve/base/gve_adminq.h      |  57 +++
>  drivers/net/gve/gve_ethdev.c           |  83 +++-
>  drivers/net/gve/gve_ethdev.h           |  46 ++
>  drivers/net/gve/gve_flow_rule.c        | 658 +++++++++++++++++++++++++
>  drivers/net/gve/gve_flow_rule.h        |  65 +++
>  drivers/net/gve/meson.build            |   1 +
>  11 files changed, 1066 insertions(+), 5 deletions(-)
>  create mode 100644 drivers/net/gve/gve_flow_rule.c
>  create mode 100644 drivers/net/gve/gve_flow_rule.h
> 


Applied to next-net

The detailed review report if you are interested.

Reviewed the v4 series. Overall this is well-structured — locking
discipline is sound, create/destroy/flush paths handle errors
correctly with proper resource cleanup, and the bitmap slot is
restored on adminq failure. Patches 1/4 and 2/4 are clean.

A few items on 3/4 and 4/4:

Patch 3/4:

  [Warning] struct gve_flow_spec has a padding hole after the
  tos/tclass u8 field (37 bytes of data, padded to 40 by the
  compiler). Callers zero-initialize today so no live bug, but
  consider adding GVE_CHECK_STRUCT_LEN for gve_flow_spec and
  gve_flow_rule_params to guard against future changes, consistent
  with other adminq structures.

Patch 4/4:

  [Warning] In gve_setup_flow_subsystem, the rte_zmalloc failure
  path does goto free_flow_rule_bmp which calls
  gve_flow_free_bmp(priv). This is safe (rte_free(NULL) is a
  no-op) but misleading — the label says "free" when there's
  nothing to free. Cleaner to just return -ENOMEM directly on the
  first failure.

  [Warning] gve_dev_reset tears down the flow subsystem and
  re-initializes via gve_init_priv, but does not destroy/recreate
  flow_rule_lock. This works today because
  gve_teardown_flow_subsystem doesn't destroy the mutex (only
  gve_dev_close does), but it's worth a comment to document this
  invariant.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v4 0/4] net/gve: add flow steering support
  2026-03-04 15:59       ` [PATCH v4 0/4] net/gve: add flow steering support Stephen Hemminger
@ 2026-03-04 22:43         ` Jasper Tran O'Leary
  0 siblings, 0 replies; 27+ messages in thread
From: Jasper Tran O'Leary @ 2026-03-04 22:43 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Joshua Washington


Thank you, I will address the additional patch 3/4 and patch 4/4 comments
in a future cleanup patch.

On Wed, Mar 4, 2026 at 7:59 AM Stephen Hemminger <stephen@networkplumber.org>
wrote:

> On Wed,  4 Mar 2026 04:50:29 +0000
> "Jasper Tran O'Leary" <jtranoleary@google.com> wrote:
>
> > This patch series adds flow steering support to the Google Virtual
> > Ethernet (gve) driver. This functionality allows traffic to be directed
> > to specific receive queues based on user-specified flow patterns.
> >
> > The series includes foundational support for extended admin queue
> > commands needed to handle flow rules, the specific adminqueue commands
> > for flow rule management, and the integration with the DPDK rte_flow
> > API. The series adds support flow matching on the following protocols:
> > IPv4, IPv6, TCP, UDP, SCTP, ESP, and AH.
> >
> > Patch Overview:
> >
> > 1. "net/gve: add flow steering device option" checks for and enables
> >    the flow steering capability in the device options during
> >    initialization.
> > 2. "net/gve: introduce extended adminq command" adds infrastructure
> >    for sending extended admin queue commands. These commands use a
> >    flexible buffer descriptor format required for flow rule management.
> > 3. "net/gve: add adminq commands for flow steering" implements the
> >    specific admin queue commands to add and remove flow rules on the
> >    device, including handling of rule IDs and parameters.
> > 4. "net/gve: add rte flow API integration" exposes the flow steering
> >    functionality via the DPDK rte_flow API. This includes strict
> >    pattern validation, rule parsing, and lifecycle management (create,
> >    destroy, flush). It ensures thread-safe access to the flow subsystem
> >    and proper resource cleanup during device reset.
> >
> > Jasper Tran O'Leary (2):
> >   net/gve: add adminq commands for flow steering
> >   net/gve: add rte flow API integration
> >
> > Vee Agarwal (2):
> >   net/gve: add flow steering device option
> >   net/gve: introduce extended adminq command
> >
> >  doc/guides/nics/features/gve.ini       |  12 +
> >  doc/guides/nics/gve.rst                |  27 +
> >  doc/guides/rel_notes/release_26_03.rst |   1 +
> >  drivers/net/gve/base/gve.h             |   3 +-
> >  drivers/net/gve/base/gve_adminq.c      | 118 ++++-
> >  drivers/net/gve/base/gve_adminq.h      |  57 +++
> >  drivers/net/gve/gve_ethdev.c           |  83 +++-
> >  drivers/net/gve/gve_ethdev.h           |  46 ++
> >  drivers/net/gve/gve_flow_rule.c        | 658 +++++++++++++++++++++++++
> >  drivers/net/gve/gve_flow_rule.h        |  65 +++
> >  drivers/net/gve/meson.build            |   1 +
> >  11 files changed, 1066 insertions(+), 5 deletions(-)
> >  create mode 100644 drivers/net/gve/gve_flow_rule.c
> >  create mode 100644 drivers/net/gve/gve_flow_rule.h
> >
>
>
> Applied to next-net
>
> The detailed review report follows, if you are interested:
>
> Reviewed the v4 series. Overall this is well-structured — locking
> discipline is sound, create/destroy/flush paths handle errors
> correctly with proper resource cleanup, and the bitmap slot is
> restored on adminq failure. Patches 1/4 and 2/4 are clean.
>
> A few items on 3/4 and 4/4:
>
> Patch 3/4:
>
>   [Warning] struct gve_flow_spec has a padding hole after the
>   tos/tclass u8 field (37 bytes of data, padded to 40 by the
>   compiler). Callers zero-initialize today so no live bug, but
>   consider adding GVE_CHECK_STRUCT_LEN for gve_flow_spec and
>   gve_flow_rule_params to guard against future changes, consistent
>   with other adminq structures.
>
> Patch 4/4:
>
>   [Warning] In gve_setup_flow_subsystem, the rte_zmalloc failure
>   path does goto free_flow_rule_bmp which calls
>   gve_flow_free_bmp(priv). This is safe (rte_free(NULL) is a
>   no-op) but misleading — the label says "free" when there's
>   nothing to free. Cleaner to just return -ENOMEM directly on the
>   first failure.
>
>   [Warning] gve_dev_reset tears down the flow subsystem and
>   re-initializes via gve_init_priv, but does not destroy/recreate
>   flow_rule_lock. This works today because
>   gve_teardown_flow_subsystem doesn't destroy the mutex (only
>   gve_dev_close does), but it's worth a comment to document this
>   invariant.
>
^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2026-03-04 22:43 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-27 19:51 [PATCH 0/4] net/gve: add flow steering support Jasper Tran O'Leary
2026-02-27 19:51 ` [PATCH 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
2026-02-27 19:51 ` [PATCH 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
2026-02-27 19:51 ` [PATCH 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
2026-02-27 19:51 ` [PATCH 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
2026-02-27 22:52 ` [PATCH 0/4] net/gve: add flow steering support Stephen Hemminger
2026-03-03  1:00   ` Jasper Tran O'Leary
2026-03-03  0:58 ` [PATCH v2 " Jasper Tran O'Leary
2026-03-03  0:58   ` [PATCH v2 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
2026-03-03  0:58   ` [PATCH v2 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
2026-03-03  0:58   ` [PATCH v2 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
2026-03-03  0:58   ` [PATCH v2 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
2026-03-03 15:21   ` [PATCH v2 0/4] net/gve: add flow steering support Stephen Hemminger
2026-03-04  1:49     ` Jasper Tran O'Leary
2026-03-04  1:46   ` [PATCH v3 " Jasper Tran O'Leary
2026-03-04  1:46     ` [PATCH v3 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
2026-03-04  1:46     ` [PATCH v3 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
2026-03-04  1:46     ` [PATCH v3 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
2026-03-04  1:46     ` [PATCH v3 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
2026-03-04  4:46     ` [PATCH v3 0/4] net/gve: add flow steering support Jasper Tran O'Leary
2026-03-04  4:50     ` [PATCH v4 " Jasper Tran O'Leary
2026-03-04  4:50       ` [PATCH v4 1/4] net/gve: add flow steering device option Jasper Tran O'Leary
2026-03-04  4:50       ` [PATCH v4 2/4] net/gve: introduce extended adminq command Jasper Tran O'Leary
2026-03-04  4:50       ` [PATCH v4 3/4] net/gve: add adminq commands for flow steering Jasper Tran O'Leary
2026-03-04  4:50       ` [PATCH v4 4/4] net/gve: add rte flow API integration Jasper Tran O'Leary
2026-03-04 15:59       ` [PATCH v4 0/4] net/gve: add flow steering support Stephen Hemminger
2026-03-04 22:43         ` Jasper Tran O'Leary

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox