netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules
@ 2023-05-11 19:47 edward.cree
  2023-05-11 19:47 ` [PATCH v2 net-next 1/4] sfc: release encap match in efx_tc_flow_free() edward.cree
                   ` (4 more replies)
  0 siblings, 5 replies; 8+ messages in thread
From: edward.cree @ 2023-05-11 19:47 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx

From: Edward Cree <ecree.xilinx@gmail.com>

This series extends the TC offload support on EF100 to support optionally
 matching on the IP ToS and UDP source port of the outer header in rules
 performing tunnel decapsulation.  Both of these fields allow masked
 matches if the underlying hardware supports it (current EF100 hardware
 supports masking on ToS, but only exact-match on source port).
Given that the source port is typically populated from a hash of inner
 header entropy, it's not clear whether filtering on it is useful, but
 since we can support it we may as well expose the capability.

Edward Cree (4):
  sfc: release encap match in efx_tc_flow_free()
  sfc: populate enc_ip_tos matches in MAE outer rules
  sfc: support TC decap rules matching on enc_ip_tos
  sfc: support TC decap rules matching on enc_src_port

 drivers/net/ethernet/sfc/mae.c |  28 ++++-
 drivers/net/ethernet/sfc/mae.h |   1 +
 drivers/net/ethernet/sfc/tc.c  | 205 +++++++++++++++++++++++----------
 drivers/net/ethernet/sfc/tc.h  |  27 +++++
 4 files changed, 197 insertions(+), 64 deletions(-)


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2 net-next 1/4] sfc: release encap match in efx_tc_flow_free()
  2023-05-11 19:47 [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules edward.cree
@ 2023-05-11 19:47 ` edward.cree
  2023-05-11 19:47 ` [PATCH v2 net-next 2/4] sfc: populate enc_ip_tos matches in MAE outer rules edward.cree
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: edward.cree @ 2023-05-11 19:47 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx

From: Edward Cree <ecree.xilinx@gmail.com>

When force-freeing leftover entries from our match_action_ht, call
 efx_tc_delete_rule(), which releases all the rule's resources, rather
 than open-coding it.  The open-coded version was missing a call to
 release the rule's encap match (if any).
It probably doesn't matter as everything's being torn down anyway, but
 it's cleaner this way and prevents further error messages potentially
 being logged by efx_tc_encap_match_free() later on.
Move efx_tc_flow_free() further down the file to avoid introducing a
 forward declaration of efx_tc_delete_rule().

Fixes: 17654d84b47c ("sfc: add offloading of 'foreign' TC (decap) rules")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
 drivers/net/ethernet/sfc/tc.c | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

diff --git a/drivers/net/ethernet/sfc/tc.c b/drivers/net/ethernet/sfc/tc.c
index 0327639a628a..236b44a4215e 100644
--- a/drivers/net/ethernet/sfc/tc.c
+++ b/drivers/net/ethernet/sfc/tc.c
@@ -132,23 +132,6 @@ static void efx_tc_free_action_set_list(struct efx_nic *efx,
 	/* Don't kfree, as acts is embedded inside a struct efx_tc_flow_rule */
 }
 
-static void efx_tc_flow_free(void *ptr, void *arg)
-{
-	struct efx_tc_flow_rule *rule = ptr;
-	struct efx_nic *efx = arg;
-
-	netif_err(efx, drv, efx->net_dev,
-		  "tc rule %lx still present at teardown, removing\n",
-		  rule->cookie);
-
-	efx_mae_delete_rule(efx, rule->fw_id);
-
-	/* Release entries in subsidiary tables */
-	efx_tc_free_action_set_list(efx, &rule->acts, true);
-
-	kfree(rule);
-}
-
 /* Boilerplate for the simple 'copy a field' cases */
 #define _MAP_KEY_AND_MASK(_name, _type, _tcget, _tcfield, _field)	\
 if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_##_name)) {		\
@@ -1454,6 +1437,21 @@ static void efx_tc_encap_match_free(void *ptr, void *__unused)
 	kfree(encap);
 }
 
+static void efx_tc_flow_free(void *ptr, void *arg)
+{
+	struct efx_tc_flow_rule *rule = ptr;
+	struct efx_nic *efx = arg;
+
+	netif_err(efx, drv, efx->net_dev,
+		  "tc rule %lx still present at teardown, removing\n",
+		  rule->cookie);
+
+	/* Also releases entries in subsidiary tables */
+	efx_tc_delete_rule(efx, rule);
+
+	kfree(rule);
+}
+
 int efx_init_struct_tc(struct efx_nic *efx)
 {
 	int rc;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 net-next 2/4] sfc: populate enc_ip_tos matches in MAE outer rules
  2023-05-11 19:47 [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules edward.cree
  2023-05-11 19:47 ` [PATCH v2 net-next 1/4] sfc: release encap match in efx_tc_flow_free() edward.cree
@ 2023-05-11 19:47 ` edward.cree
  2023-05-11 19:47 ` [PATCH v2 net-next 3/4] sfc: support TC decap rules matching on enc_ip_tos edward.cree
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 8+ messages in thread
From: edward.cree @ 2023-05-11 19:47 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx

From: Edward Cree <ecree.xilinx@gmail.com>

Currently tc.c will block them before they get here, but following
 patch will change that.
Use the extack message from efx_mae_check_encap_match_caps() instead
 of writing a new one, since there's now more being fed in than just
 an IP version.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
v2: fix transient build breakage by moving encap->ip_tos[_mask] addition
to this patch.
---
 drivers/net/ethernet/sfc/mae.c | 16 +++++++++++++++-
 drivers/net/ethernet/sfc/mae.h |  1 +
 drivers/net/ethernet/sfc/tc.c  |  9 +++------
 drivers/net/ethernet/sfc/tc.h  |  1 +
 4 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mae.c b/drivers/net/ethernet/sfc/mae.c
index 49706a7b94bf..8f4bb5d36ad8 100644
--- a/drivers/net/ethernet/sfc/mae.c
+++ b/drivers/net/ethernet/sfc/mae.c
@@ -482,12 +482,14 @@ int efx_mae_match_check_caps(struct efx_nic *efx,
 	rc;								       \
 })
 /* Checks that the fields needed for encap-rule matches are supported by the
- * MAE.  All the fields are exact-match.
+ * MAE.  All the fields are exact-match, except possibly ENC_IP_TOS.
  */
 int efx_mae_check_encap_match_caps(struct efx_nic *efx, bool ipv6,
+				   u8 ip_tos_mask,
 				   struct netlink_ext_ack *extack)
 {
 	u8 *supported_fields = efx->tc->caps->outer_rule_fields;
+	enum mask_type typ;
 	int rc;
 
 	if (CHECK(ENC_ETHER_TYPE))
@@ -504,6 +506,14 @@ int efx_mae_check_encap_match_caps(struct efx_nic *efx, bool ipv6,
 	if (CHECK(ENC_L4_DPORT) ||
 	    CHECK(ENC_IP_PROTO))
 		return rc;
+	typ = classify_mask(&ip_tos_mask, sizeof(ip_tos_mask));
+	rc = efx_mae_match_check_cap_typ(supported_fields[MAE_FIELD_ENC_IP_TOS],
+					 typ);
+	if (rc) {
+		NL_SET_ERR_MSG_FMT_MOD(extack, "No support for %s mask in field %s",
+				       mask_type_name(typ), "enc_ip_tos");
+		return rc;
+	}
 	return 0;
 }
 #undef CHECK
@@ -1003,6 +1013,10 @@ int efx_mae_register_encap_match(struct efx_nic *efx,
 				~(__be16)0);
 	MCDI_STRUCT_SET_BYTE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_IP_PROTO, IPPROTO_UDP);
 	MCDI_STRUCT_SET_BYTE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_IP_PROTO_MASK, ~0);
+	MCDI_STRUCT_SET_BYTE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_IP_TOS,
+			     encap->ip_tos);
+	MCDI_STRUCT_SET_BYTE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_IP_TOS_MASK,
+			     encap->ip_tos_mask);
 	rc = efx_mcdi_rpc(efx, MC_CMD_MAE_OUTER_RULE_INSERT, inbuf,
 			  sizeof(inbuf), outbuf, sizeof(outbuf), &outlen);
 	if (rc)
diff --git a/drivers/net/ethernet/sfc/mae.h b/drivers/net/ethernet/sfc/mae.h
index 9226219491a0..cec61bfde4d4 100644
--- a/drivers/net/ethernet/sfc/mae.h
+++ b/drivers/net/ethernet/sfc/mae.h
@@ -82,6 +82,7 @@ int efx_mae_match_check_caps(struct efx_nic *efx,
 			     const struct efx_tc_match_fields *mask,
 			     struct netlink_ext_ack *extack);
 int efx_mae_check_encap_match_caps(struct efx_nic *efx, bool ipv6,
+				   u8 ip_tos_mask,
 				   struct netlink_ext_ack *extack);
 int efx_mae_check_encap_type_supported(struct efx_nic *efx,
 				       enum efx_encap_type typ);
diff --git a/drivers/net/ethernet/sfc/tc.c b/drivers/net/ethernet/sfc/tc.c
index 236b44a4215e..c2dda3ae5492 100644
--- a/drivers/net/ethernet/sfc/tc.c
+++ b/drivers/net/ethernet/sfc/tc.c
@@ -410,12 +410,9 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 		return -EOPNOTSUPP;
 	}
 
-	rc = efx_mae_check_encap_match_caps(efx, ipv6, extack);
-	if (rc) {
-		NL_SET_ERR_MSG_FMT_MOD(extack, "MAE hw reports no support for IPv%d encap matches",
-				       ipv6 ? 6 : 4);
-		return -EOPNOTSUPP;
-	}
+	rc = efx_mae_check_encap_match_caps(efx, ipv6, match->mask.enc_ip_tos, extack);
+	if (rc)
+		return rc;
 
 	encap = kzalloc(sizeof(*encap), GFP_USER);
 	if (!encap)
diff --git a/drivers/net/ethernet/sfc/tc.h b/drivers/net/ethernet/sfc/tc.h
index 04cced6a2d39..8d2abca26c23 100644
--- a/drivers/net/ethernet/sfc/tc.h
+++ b/drivers/net/ethernet/sfc/tc.h
@@ -78,6 +78,7 @@ struct efx_tc_encap_match {
 	__be32 src_ip, dst_ip;
 	struct in6_addr src_ip6, dst_ip6;
 	__be16 udp_dport;
+	u8 ip_tos, ip_tos_mask;
 	struct rhash_head linkage;
 	enum efx_encap_type tun_type;
 	refcount_t ref;

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 net-next 3/4] sfc: support TC decap rules matching on enc_ip_tos
  2023-05-11 19:47 [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules edward.cree
  2023-05-11 19:47 ` [PATCH v2 net-next 1/4] sfc: release encap match in efx_tc_flow_free() edward.cree
  2023-05-11 19:47 ` [PATCH v2 net-next 2/4] sfc: populate enc_ip_tos matches in MAE outer rules edward.cree
@ 2023-05-11 19:47 ` edward.cree
  2023-05-12  9:51   ` Simon Horman
  2023-05-11 19:47 ` [PATCH v2 net-next 4/4] sfc: support TC decap rules matching on enc_src_port edward.cree
  2023-05-12 11:50 ` [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules patchwork-bot+netdevbpf
  4 siblings, 1 reply; 8+ messages in thread
From: edward.cree @ 2023-05-11 19:47 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx

From: Edward Cree <ecree.xilinx@gmail.com>

Allow efx_tc_encap_match entries to include an ip_tos and ip_tos_mask.
To avoid partially-overlapping Outer Rules (which can lead to undefined
 behaviour in the hardware), store extra "pseudo" entries in our
 encap_match hashtable, which are used to enforce that all Outer Rule
 entries within a given <src_ip,dst_ip,udp_dport> tuple (or IPv6
 equivalent) have the same ip_tos_mask.
The "direct" encap_match entry takes a reference on the "pseudo",
 allowing it to be destroyed when all "direct" entries using it are
 removed.
efx_tc_em_pseudo_type is an enum rather than just a bool because in
 future an additional pseudo-type will be added to support Conntrack
 offload.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
 drivers/net/ethernet/sfc/tc.c | 145 +++++++++++++++++++++++++---------
 drivers/net/ethernet/sfc/tc.h |  24 ++++++
 2 files changed, 133 insertions(+), 36 deletions(-)

diff --git a/drivers/net/ethernet/sfc/tc.c b/drivers/net/ethernet/sfc/tc.c
index c2dda3ae5492..8e1769d2c4ee 100644
--- a/drivers/net/ethernet/sfc/tc.c
+++ b/drivers/net/ethernet/sfc/tc.c
@@ -202,6 +202,7 @@ static int efx_tc_flower_parse_match(struct efx_nic *efx,
 	      BIT(FLOW_DISSECTOR_KEY_ENC_KEYID) |
 	      BIT(FLOW_DISSECTOR_KEY_ENC_IPV4_ADDRS) |
 	      BIT(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS) |
+	      BIT(FLOW_DISSECTOR_KEY_ENC_IP) |
 	      BIT(FLOW_DISSECTOR_KEY_ENC_PORTS) |
 	      BIT(FLOW_DISSECTOR_KEY_ENC_CONTROL) |
 	      BIT(FLOW_DISSECTOR_KEY_TCP) |
@@ -346,20 +347,47 @@ static int efx_tc_flower_parse_match(struct efx_nic *efx,
 	return 0;
 }
 
+static void efx_tc_flower_release_encap_match(struct efx_nic *efx,
+					      struct efx_tc_encap_match *encap)
+{
+	int rc;
+
+	if (!refcount_dec_and_test(&encap->ref))
+		return; /* still in use */
+
+	if (encap->type == EFX_TC_EM_DIRECT) {
+		rc = efx_mae_unregister_encap_match(efx, encap);
+		if (rc)
+			/* Display message but carry on and remove entry from our
+			 * SW tables, because there's not much we can do about it.
+			 */
+			netif_err(efx, drv, efx->net_dev,
+				  "Failed to release encap match %#x, rc %d\n",
+				  encap->fw_id, rc);
+	}
+	rhashtable_remove_fast(&efx->tc->encap_match_ht, &encap->linkage,
+			       efx_tc_encap_match_ht_params);
+	if (encap->pseudo)
+		efx_tc_flower_release_encap_match(efx, encap->pseudo);
+	kfree(encap);
+}
+
 static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 					    struct efx_tc_match *match,
 					    enum efx_encap_type type,
+					    enum efx_tc_em_pseudo_type em_type,
+					    u8 child_ip_tos_mask,
 					    struct netlink_ext_ack *extack)
 {
-	struct efx_tc_encap_match *encap, *old;
+	struct efx_tc_encap_match *encap, *old, *pseudo = NULL;
 	bool ipv6 = false;
 	int rc;
 
 	/* We require that the socket-defining fields (IP addrs and UDP dest
-	 * port) are present and exact-match.  Other fields are currently not
-	 * allowed.  This meets what OVS will ask for, and means that we don't
-	 * need to handle difficult checks for overlapping matches as could
-	 * come up if we allowed masks or varying sets of match fields.
+	 * port) are present and exact-match.  Other fields may only be used
+	 * if the field-set (and any masks) are the same for all encap
+	 * matches on the same <sip,dip,dport> tuple; this is enforced by
+	 * pseudo encap matches.
 	 */
 	if (match->mask.enc_dst_ip | match->mask.enc_src_ip) {
 		if (!IS_ALL_ONES(match->mask.enc_dst_ip)) {
@@ -402,21 +430,37 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 		return -EOPNOTSUPP;
 	}
 	if (match->mask.enc_ip_tos) {
-		NL_SET_ERR_MSG_MOD(extack, "Egress encap match on IP ToS not supported");
-		return -EOPNOTSUPP;
+		struct efx_tc_match pmatch = *match;
+
+		if (em_type == EFX_TC_EM_PSEUDO_MASK) { /* can't happen */
+			NL_SET_ERR_MSG_MOD(extack, "Bad recursion in egress encap match handler");
+			return -EOPNOTSUPP;
+		}
+		pmatch.value.enc_ip_tos = 0;
+		pmatch.mask.enc_ip_tos = 0;
+		rc = efx_tc_flower_record_encap_match(efx, &pmatch, type,
+						      EFX_TC_EM_PSEUDO_MASK,
+						      match->mask.enc_ip_tos,
+						      extack);
+		if (rc)
+			return rc;
+		pseudo = pmatch.encap;
 	}
 	if (match->mask.enc_ip_ttl) {
 		NL_SET_ERR_MSG_MOD(extack, "Egress encap match on IP TTL not supported");
-		return -EOPNOTSUPP;
+		rc = -EOPNOTSUPP;
+		goto fail_pseudo;
 	}
 
 	rc = efx_mae_check_encap_match_caps(efx, ipv6, match->mask.enc_ip_tos, extack);
 	if (rc)
-		return rc;
+		goto fail_pseudo;
 
 	encap = kzalloc(sizeof(*encap), GFP_USER);
-	if (!encap)
-		return -ENOMEM;
+	if (!encap) {
+		rc = -ENOMEM;
+		goto fail_pseudo;
+	}
 	encap->src_ip = match->value.enc_src_ip;
 	encap->dst_ip = match->value.enc_dst_ip;
 #ifdef CONFIG_IPV6
@@ -425,12 +469,56 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 #endif
 	encap->udp_dport = match->value.enc_dport;
 	encap->tun_type = type;
+	encap->ip_tos = match->value.enc_ip_tos;
+	encap->ip_tos_mask = match->mask.enc_ip_tos;
+	encap->child_ip_tos_mask = child_ip_tos_mask;
+	encap->type = em_type;
+	encap->pseudo = pseudo;
 	old = rhashtable_lookup_get_insert_fast(&efx->tc->encap_match_ht,
 						&encap->linkage,
 						efx_tc_encap_match_ht_params);
 	if (old) {
 		/* don't need our new entry */
 		kfree(encap);
+		if (pseudo) /* don't need our new pseudo either */
+			efx_tc_flower_release_encap_match(efx, pseudo);
+		/* check old and new em_types are compatible */
+		switch (old->type) {
+		case EFX_TC_EM_DIRECT:
+			/* old EM is in hardware, so mustn't overlap with a
+			 * pseudo, but may be shared with another direct EM
+			 */
+			if (em_type == EFX_TC_EM_DIRECT)
+				break;
+			NL_SET_ERR_MSG_MOD(extack, "Pseudo encap match conflicts with existing direct entry");
+			return -EEXIST;
+		case EFX_TC_EM_PSEUDO_MASK:
+			/* old EM is protecting a ToS-qualified filter, so may
+			 * only be shared with another pseudo for the same
+			 * ToS mask.
+			 */
+			if (em_type != EFX_TC_EM_PSEUDO_MASK) {
+				NL_SET_ERR_MSG_FMT_MOD(extack,
+						       "%s encap match conflicts with existing pseudo(MASK) entry",
+						       encap->type ? "Pseudo" : "Direct");
+				return -EEXIST;
+			}
+			if (child_ip_tos_mask != old->child_ip_tos_mask) {
+				NL_SET_ERR_MSG_FMT_MOD(extack,
+						       "Pseudo encap match for TOS mask %#04x conflicts with existing pseudo(MASK) entry for TOS mask %#04x",
+						       child_ip_tos_mask,
+						       old->child_ip_tos_mask);
+				return -EEXIST;
+			}
+			break;
+		default: /* Unrecognised pseudo-type.  Just say no */
+			NL_SET_ERR_MSG_FMT_MOD(extack,
+					       "%s encap match conflicts with existing pseudo(%d) entry",
+					       encap->type ? "Pseudo" : "Direct",
+					       old->type);
+			return -EEXIST;
+		}
+		/* check old and new tun_types are compatible */
 		if (old->tun_type != type) {
 			NL_SET_ERR_MSG_FMT_MOD(extack,
 					       "Egress encap match with conflicting tun_type %u != %u",
@@ -442,10 +530,12 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 		/* existing entry found */
 		encap = old;
 	} else {
-		rc = efx_mae_register_encap_match(efx, encap);
-		if (rc) {
-			NL_SET_ERR_MSG_MOD(extack, "Failed to record egress encap match in HW");
-			goto fail;
+		if (em_type == EFX_TC_EM_DIRECT) {
+			rc = efx_mae_register_encap_match(efx, encap);
+			if (rc) {
+				NL_SET_ERR_MSG_MOD(extack, "Failed to record egress encap match in HW");
+				goto fail;
+			}
 		}
 		refcount_set(&encap->ref, 1);
 	}
@@ -455,30 +545,12 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 	rhashtable_remove_fast(&efx->tc->encap_match_ht, &encap->linkage,
 			       efx_tc_encap_match_ht_params);
 	kfree(encap);
+fail_pseudo:
+	if (pseudo)
+		efx_tc_flower_release_encap_match(efx, pseudo);
 	return rc;
 }
 
-static void efx_tc_flower_release_encap_match(struct efx_nic *efx,
-					      struct efx_tc_encap_match *encap)
-{
-	int rc;
-
-	if (!refcount_dec_and_test(&encap->ref))
-		return; /* still in use */
-
-	rc = efx_mae_unregister_encap_match(efx, encap);
-	if (rc)
-		/* Display message but carry on and remove entry from our
-		 * SW tables, because there's not much we can do about it.
-		 */
-		netif_err(efx, drv, efx->net_dev,
-			  "Failed to release encap match %#x, rc %d\n",
-			  encap->fw_id, rc);
-	rhashtable_remove_fast(&efx->tc->encap_match_ht, &encap->linkage,
-			       efx_tc_encap_match_ht_params);
-	kfree(encap);
-}
-
 static void efx_tc_delete_rule(struct efx_nic *efx, struct efx_tc_flow_rule *rule)
 {
 	efx_mae_delete_rule(efx, rule->fw_id);
@@ -632,6 +704,7 @@ static int efx_tc_flower_replace_foreign(struct efx_nic *efx,
 		}
 
 		rc = efx_tc_flower_record_encap_match(efx, &match, type,
+						      EFX_TC_EM_DIRECT, 0,
 						      extack);
 		if (rc)
 			goto release;
diff --git a/drivers/net/ethernet/sfc/tc.h b/drivers/net/ethernet/sfc/tc.h
index 8d2abca26c23..0f14481d2d9e 100644
--- a/drivers/net/ethernet/sfc/tc.h
+++ b/drivers/net/ethernet/sfc/tc.h
@@ -74,6 +74,27 @@ static inline bool efx_tc_match_is_encap(const struct efx_tc_match_fields *mask)
 	       mask->enc_ip_ttl || mask->enc_sport || mask->enc_dport;
 }
 
+/**
+ * enum efx_tc_em_pseudo_type - &struct efx_tc_encap_match pseudo type
+ *
+ * These are used to classify "pseudo" encap matches, which don't refer
+ * to an entry in hardware but rather indicate that a section of the
+ * match space is in use by another Outer Rule.
+ *
+ * @EFX_TC_EM_DIRECT: real HW entry in Outer Rule table; not a pseudo.
+ *	Hardware index in &struct efx_tc_encap_match.fw_id is valid.
+ * @EFX_TC_EM_PSEUDO_MASK: registered by an encap match which includes a
+ *	match on an optional field (currently only ip_tos), to prevent an
+ *	overlapping encap match _without_ optional fields.
+ *	The pseudo encap match may be referenced again by an encap match
+ *	with a different ip_tos value, but all ip_tos_mask must match the
+ *	first (stored in our child_ip_tos_mask).
+ */
+enum efx_tc_em_pseudo_type {
+	EFX_TC_EM_DIRECT,
+	EFX_TC_EM_PSEUDO_MASK,
+};
+
 struct efx_tc_encap_match {
 	__be32 src_ip, dst_ip;
 	struct in6_addr src_ip6, dst_ip6;
@@ -81,8 +102,11 @@ struct efx_tc_encap_match {
 	u8 ip_tos, ip_tos_mask;
 	struct rhash_head linkage;
 	enum efx_encap_type tun_type;
+	u8 child_ip_tos_mask;
 	refcount_t ref;
+	enum efx_tc_em_pseudo_type type;
 	u32 fw_id; /* index of this entry in firmware encap match table */
+	struct efx_tc_encap_match *pseudo; /* Referenced pseudo EM if needed */
 };
 
 struct efx_tc_match {

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH v2 net-next 4/4] sfc: support TC decap rules matching on enc_src_port
  2023-05-11 19:47 [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules edward.cree
                   ` (2 preceding siblings ...)
  2023-05-11 19:47 ` [PATCH v2 net-next 3/4] sfc: support TC decap rules matching on enc_ip_tos edward.cree
@ 2023-05-11 19:47 ` edward.cree
  2023-05-12 11:50 ` [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules patchwork-bot+netdevbpf
  4 siblings, 0 replies; 8+ messages in thread
From: edward.cree @ 2023-05-11 19:47 UTC (permalink / raw)
  To: linux-net-drivers, davem, kuba, pabeni, edumazet
  Cc: Edward Cree, netdev, habetsm.xilinx

From: Edward Cree <ecree.xilinx@gmail.com>

Allow efx_tc_encap_match entries to include a udp_sport and a
 udp_sport_mask.  As with enc_ip_tos, use pseudos to enforce that all
 encap matches within a given <src_ip,dst_ip,udp_dport> tuple have
 the same udp_sport_mask.
Note that since we use a single layer of pseudos for both fields, two
 matches that differ in (say) udp_sport value aren't permitted to have
 different ip_tos_mask, even though this would technically be safe.
Current userland TC does not support setting enc_src_port; this patch
 was tested with an iproute2 patched to support it.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
---
 drivers/net/ethernet/sfc/mae.c | 14 +++++++++++++-
 drivers/net/ethernet/sfc/mae.h |  2 +-
 drivers/net/ethernet/sfc/tc.c  | 31 +++++++++++++++++++++----------
 drivers/net/ethernet/sfc/tc.h  | 10 ++++++----
 4 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/sfc/mae.c b/drivers/net/ethernet/sfc/mae.c
index 8f4bb5d36ad8..37a4c6925ad4 100644
--- a/drivers/net/ethernet/sfc/mae.c
+++ b/drivers/net/ethernet/sfc/mae.c
@@ -485,7 +485,7 @@ int efx_mae_match_check_caps(struct efx_nic *efx,
  * MAE.  All the fields are exact-match, except possibly ENC_IP_TOS.
  */
 int efx_mae_check_encap_match_caps(struct efx_nic *efx, bool ipv6,
-				   u8 ip_tos_mask,
+				   u8 ip_tos_mask, __be16 udp_sport_mask,
 				   struct netlink_ext_ack *extack)
 {
 	u8 *supported_fields = efx->tc->caps->outer_rule_fields;
@@ -506,6 +506,14 @@ int efx_mae_check_encap_match_caps(struct efx_nic *efx, bool ipv6,
 	if (CHECK(ENC_L4_DPORT) ||
 	    CHECK(ENC_IP_PROTO))
 		return rc;
+	typ = classify_mask((const u8 *)&udp_sport_mask, sizeof(udp_sport_mask));
+	rc = efx_mae_match_check_cap_typ(supported_fields[MAE_FIELD_ENC_L4_SPORT],
+					 typ);
+	if (rc) {
+		NL_SET_ERR_MSG_FMT_MOD(extack, "No support for %s mask in field %s",
+				       mask_type_name(typ), "enc_src_port");
+		return rc;
+	}
 	typ = classify_mask(&ip_tos_mask, sizeof(ip_tos_mask));
 	rc = efx_mae_match_check_cap_typ(supported_fields[MAE_FIELD_ENC_IP_TOS],
 					 typ);
@@ -1011,6 +1019,10 @@ int efx_mae_register_encap_match(struct efx_nic *efx,
 				encap->udp_dport);
 	MCDI_STRUCT_SET_WORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_L4_DPORT_BE_MASK,
 				~(__be16)0);
+	MCDI_STRUCT_SET_WORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_L4_DPORT_BE,
+				encap->udp_sport);
+	MCDI_STRUCT_SET_WORD_BE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_L4_DPORT_BE_MASK,
+				encap->udp_sport_mask);
 	MCDI_STRUCT_SET_BYTE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_IP_PROTO, IPPROTO_UDP);
 	MCDI_STRUCT_SET_BYTE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_IP_PROTO_MASK, ~0);
 	MCDI_STRUCT_SET_BYTE(match_crit, MAE_ENC_FIELD_PAIRS_ENC_IP_TOS,
diff --git a/drivers/net/ethernet/sfc/mae.h b/drivers/net/ethernet/sfc/mae.h
index cec61bfde4d4..1cf8dfeb0c28 100644
--- a/drivers/net/ethernet/sfc/mae.h
+++ b/drivers/net/ethernet/sfc/mae.h
@@ -82,7 +82,7 @@ int efx_mae_match_check_caps(struct efx_nic *efx,
 			     const struct efx_tc_match_fields *mask,
 			     struct netlink_ext_ack *extack);
 int efx_mae_check_encap_match_caps(struct efx_nic *efx, bool ipv6,
-				   u8 ip_tos_mask,
+				   u8 ip_tos_mask, __be16 udp_sport_mask,
 				   struct netlink_ext_ack *extack);
 int efx_mae_check_encap_type_supported(struct efx_nic *efx,
 				       enum efx_encap_type typ);
diff --git a/drivers/net/ethernet/sfc/tc.c b/drivers/net/ethernet/sfc/tc.c
index 8e1769d2c4ee..da684b4b7211 100644
--- a/drivers/net/ethernet/sfc/tc.c
+++ b/drivers/net/ethernet/sfc/tc.c
@@ -377,6 +377,7 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 					    enum efx_encap_type type,
 					    enum efx_tc_em_pseudo_type em_type,
 					    u8 child_ip_tos_mask,
+					    __be16 child_udp_sport_mask,
 					    struct netlink_ext_ack *extack)
 {
 	struct efx_tc_encap_match *encap, *old, *pseudo = NULL;
@@ -425,11 +426,7 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 		NL_SET_ERR_MSG_MOD(extack, "Egress encap match is not exact on dst UDP port");
 		return -EOPNOTSUPP;
 	}
-	if (match->mask.enc_sport) {
-		NL_SET_ERR_MSG_MOD(extack, "Egress encap match on src UDP port not supported");
-		return -EOPNOTSUPP;
-	}
-	if (match->mask.enc_ip_tos) {
+	if (match->mask.enc_sport || match->mask.enc_ip_tos) {
 		struct efx_tc_match pmatch = *match;
 
 		if (em_type == EFX_TC_EM_PSEUDO_MASK) { /* can't happen */
@@ -438,9 +435,12 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 		}
 		pmatch.value.enc_ip_tos = 0;
 		pmatch.mask.enc_ip_tos = 0;
+		pmatch.value.enc_sport = 0;
+		pmatch.mask.enc_sport = 0;
 		rc = efx_tc_flower_record_encap_match(efx, &pmatch, type,
 						      EFX_TC_EM_PSEUDO_MASK,
 						      match->mask.enc_ip_tos,
+						      match->mask.enc_sport,
 						      extack);
 		if (rc)
 			return rc;
@@ -452,7 +452,8 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 		goto fail_pseudo;
 	}
 
-	rc = efx_mae_check_encap_match_caps(efx, ipv6, match->mask.enc_ip_tos, extack);
+	rc = efx_mae_check_encap_match_caps(efx, ipv6, match->mask.enc_ip_tos,
+					    match->mask.enc_sport, extack);
 	if (rc)
 		goto fail_pseudo;
 
@@ -472,6 +473,9 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 	encap->ip_tos = match->value.enc_ip_tos;
 	encap->ip_tos_mask = match->mask.enc_ip_tos;
 	encap->child_ip_tos_mask = child_ip_tos_mask;
+	encap->udp_sport = match->value.enc_sport;
+	encap->udp_sport_mask = match->mask.enc_sport;
+	encap->child_udp_sport_mask = child_udp_sport_mask;
 	encap->type = em_type;
 	encap->pseudo = pseudo;
 	old = rhashtable_lookup_get_insert_fast(&efx->tc->encap_match_ht,
@@ -493,9 +497,9 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 			NL_SET_ERR_MSG_MOD(extack, "Pseudo encap match conflicts with existing direct entry");
 			return -EEXIST;
 		case EFX_TC_EM_PSEUDO_MASK:
-			/* old EM is protecting a ToS-qualified filter, so may
-			 * only be shared with another pseudo for the same
-			 * ToS mask.
+			/* old EM is protecting a ToS- or src port-qualified
+			 * filter, so may only be shared with another pseudo
+			 * for the same ToS and src port masks.
 			 */
 			if (em_type != EFX_TC_EM_PSEUDO_MASK) {
 				NL_SET_ERR_MSG_FMT_MOD(extack,
@@ -510,6 +514,13 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
 						       old->child_ip_tos_mask);
 				return -EEXIST;
 			}
+			if (child_udp_sport_mask != old->child_udp_sport_mask) {
+				NL_SET_ERR_MSG_FMT_MOD(extack,
+						       "Pseudo encap match for UDP src port mask %#x conflicts with existing pseudo(MASK) entry for mask %#x",
+						       child_udp_sport_mask,
+						       old->child_udp_sport_mask);
+				return -EEXIST;
+			}
 			break;
 		default: /* Unrecognised pseudo-type.  Just say no */
 			NL_SET_ERR_MSG_FMT_MOD(extack,
@@ -704,7 +715,7 @@ static int efx_tc_flower_replace_foreign(struct efx_nic *efx,
 		}
 
 		rc = efx_tc_flower_record_encap_match(efx, &match, type,
-						      EFX_TC_EM_DIRECT, 0,
+						      EFX_TC_EM_DIRECT, 0, 0,
 						      extack);
 		if (rc)
 			goto release;
diff --git a/drivers/net/ethernet/sfc/tc.h b/drivers/net/ethernet/sfc/tc.h
index 0f14481d2d9e..24e9640c74e9 100644
--- a/drivers/net/ethernet/sfc/tc.h
+++ b/drivers/net/ethernet/sfc/tc.h
@@ -84,11 +84,11 @@ static inline bool efx_tc_match_is_encap(const struct efx_tc_match_fields *mask)
  * @EFX_TC_EM_DIRECT: real HW entry in Outer Rule table; not a pseudo.
  *	Hardware index in &struct efx_tc_encap_match.fw_id is valid.
  * @EFX_TC_EM_PSEUDO_MASK: registered by an encap match which includes a
- *	match on an optional field (currently only ip_tos), to prevent an
- *	overlapping encap match _without_ optional fields.
+ *	match on an optional field (currently ip_tos and/or udp_sport),
+ *	to prevent an overlapping encap match _without_ optional fields.
  *	The pseudo encap match may be referenced again by an encap match
- *	with a different ip_tos value, but all ip_tos_mask must match the
- *	first (stored in our child_ip_tos_mask).
+ *	with different values for these fields, but all masks must match the
+ *	first (stored in our child_* fields).
  */
 enum efx_tc_em_pseudo_type {
 	EFX_TC_EM_DIRECT,
@@ -99,10 +99,12 @@ struct efx_tc_encap_match {
 	__be32 src_ip, dst_ip;
 	struct in6_addr src_ip6, dst_ip6;
 	__be16 udp_dport;
+	__be16 udp_sport, udp_sport_mask;
 	u8 ip_tos, ip_tos_mask;
 	struct rhash_head linkage;
 	enum efx_encap_type tun_type;
 	u8 child_ip_tos_mask;
+	__be16 child_udp_sport_mask;
 	refcount_t ref;
 	enum efx_tc_em_pseudo_type type;
 	u32 fw_id; /* index of this entry in firmware encap match table */

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 net-next 3/4] sfc: support TC decap rules matching on enc_ip_tos
  2023-05-11 19:47 ` [PATCH v2 net-next 3/4] sfc: support TC decap rules matching on enc_ip_tos edward.cree
@ 2023-05-12  9:51   ` Simon Horman
  2023-05-12 14:27     ` Edward Cree
  0 siblings, 1 reply; 8+ messages in thread
From: Simon Horman @ 2023-05-12  9:51 UTC (permalink / raw)
  To: edward.cree
  Cc: linux-net-drivers, davem, kuba, pabeni, edumazet, Edward Cree,
	netdev, habetsm.xilinx

On Thu, May 11, 2023 at 08:47:30PM +0100, edward.cree@amd.com wrote:
> From: Edward Cree <ecree.xilinx@gmail.com>
> 
> Allow efx_tc_encap_match entries to include an ip_tos and ip_tos_mask.
> To avoid partially-overlapping Outer Rules (which can lead to undefined
>  behaviour in the hardware), store extra "pseudo" entries in our
>  encap_match hashtable, which are used to enforce that all Outer Rule
>  entries within a given <src_ip,dst_ip,udp_dport> tuple (or IPv6
>  equivalent) have the same ip_tos_mask.
> The "direct" encap_match entry takes a reference on the "pseudo",
>  allowing it to be destroyed when all "direct" entries using it are
>  removed.
> efx_tc_em_pseudo_type is an enum rather than just a bool because in
>  future an additional pseudo-type will be added to support Conntrack
>  offload.
> 
> Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>

...

> @@ -425,12 +469,56 @@ static int efx_tc_flower_record_encap_match(struct efx_nic *efx,
>  #endif
>  	encap->udp_dport = match->value.enc_dport;
>  	encap->tun_type = type;
> +	encap->ip_tos = match->value.enc_ip_tos;
> +	encap->ip_tos_mask = match->mask.enc_ip_tos;
> +	encap->child_ip_tos_mask = child_ip_tos_mask;
> +	encap->type = em_type;
> +	encap->pseudo = pseudo;
>  	old = rhashtable_lookup_get_insert_fast(&efx->tc->encap_match_ht,
>  						&encap->linkage,
>  						efx_tc_encap_match_ht_params);
>  	if (old) {
>  		/* don't need our new entry */
>  		kfree(encap);

Hi Ed,

encap is freed here.

> +		if (pseudo) /* don't need our new pseudo either */
> +			efx_tc_flower_release_encap_match(efx, pseudo);
> +		/* check old and new em_types are compatible */
> +		switch (old->type) {
> +		case EFX_TC_EM_DIRECT:
> +			/* old EM is in hardware, so mustn't overlap with a
> +			 * pseudo, but may be shared with another direct EM
> +			 */
> +			if (em_type == EFX_TC_EM_DIRECT)
> +				break;
> +			NL_SET_ERR_MSG_MOD(extack, "Pseudo encap match conflicts with existing direct entry");
> +			return -EEXIST;
> +		case EFX_TC_EM_PSEUDO_MASK:
> +			/* old EM is protecting a ToS-qualified filter, so may
> +			 * only be shared with another pseudo for the same
> +			 * ToS mask.
> +			 */
> +			if (em_type != EFX_TC_EM_PSEUDO_MASK) {
> +				NL_SET_ERR_MSG_FMT_MOD(extack,
> +						       "%s encap match conflicts with existing pseudo(MASK) entry",
> +						       encap->type ? "Pseudo" : "Direct");

But dereferenced here.

> +				return -EEXIST;
> +			}
> +			if (child_ip_tos_mask != old->child_ip_tos_mask) {
> +				NL_SET_ERR_MSG_FMT_MOD(extack,
> +						       "Pseudo encap match for TOS mask %#04x conflicts with existing pseudo(MASK) entry for TOS mask %#04x",
> +						       child_ip_tos_mask,
> +						       old->child_ip_tos_mask);
> +				return -EEXIST;
> +			}
> +			break;
> +		default: /* Unrecognised pseudo-type.  Just say no */
> +			NL_SET_ERR_MSG_FMT_MOD(extack,
> +					       "%s encap match conflicts with existing pseudo(%d) entry",
> +					       encap->type ? "Pseudo" : "Direct",

And here.

> +					       old->type);
> +			return -EEXIST;
> +		}
> +		/* check old and new tun_types are compatible */
>  		if (old->tun_type != type) {
>  			NL_SET_ERR_MSG_FMT_MOD(extack,
>  					       "Egress encap match with conflicting tun_type %u != %u",

...

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules
  2023-05-11 19:47 [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules edward.cree
                   ` (3 preceding siblings ...)
  2023-05-11 19:47 ` [PATCH v2 net-next 4/4] sfc: support TC decap rules matching on enc_src_port edward.cree
@ 2023-05-12 11:50 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-05-12 11:50 UTC (permalink / raw)
  To: edward.cree
  Cc: linux-net-drivers, davem, kuba, pabeni, edumazet, ecree.xilinx,
	netdev, habetsm.xilinx

Hello:

This series was applied to netdev/net-next.git (main)
by David S. Miller <davem@davemloft.net>:

On Thu, 11 May 2023 20:47:27 +0100 you wrote:
> From: Edward Cree <ecree.xilinx@gmail.com>
> 
> This series extends the TC offload support on EF100 to support optionally
>  matching on the IP ToS and UDP source port of the outer header in rules
>  performing tunnel decapsulation.  Both of these fields allow masked
>  matches if the underlying hardware supports it (current EF100 hardware
>  supports masking on ToS, but only exact-match on source port).
> Given that the source port is typically populated from a hash of inner
>  header entropy, it's not clear whether filtering on it is useful, but
>  since we can support it we may as well expose the capability.
> 
> [...]

Here is the summary with links:
  - [v2,net-next,1/4] sfc: release encap match in efx_tc_flow_free()
    https://git.kernel.org/netdev/net-next/c/28fa3ac487c6
  - [v2,net-next,2/4] sfc: populate enc_ip_tos matches in MAE outer rules
    https://git.kernel.org/netdev/net-next/c/56beb35d85e2
  - [v2,net-next,3/4] sfc: support TC decap rules matching on enc_ip_tos
    https://git.kernel.org/netdev/net-next/c/3c9561c0a5b9
  - [v2,net-next,4/4] sfc: support TC decap rules matching on enc_src_port
    https://git.kernel.org/netdev/net-next/c/b6583d5e9e94

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 net-next 3/4] sfc: support TC decap rules matching on enc_ip_tos
  2023-05-12  9:51   ` Simon Horman
@ 2023-05-12 14:27     ` Edward Cree
  0 siblings, 0 replies; 8+ messages in thread
From: Edward Cree @ 2023-05-12 14:27 UTC (permalink / raw)
  To: Simon Horman, edward.cree
  Cc: linux-net-drivers, davem, kuba, pabeni, edumazet, netdev,
	habetsm.xilinx

On 12/05/2023 10:51, Simon Horman wrote:
> On Thu, May 11, 2023 at 08:47:30PM +0100, edward.cree@amd.com wrote:
>>  	if (old) {
>>  		/* don't need our new entry */
>>  		kfree(encap);
> 
> Hi Ed,
> 
> encap is freed here.
> 
...
>> +			if (em_type != EFX_TC_EM_PSEUDO_MASK) {
>> +				NL_SET_ERR_MSG_FMT_MOD(extack,
>> +						       "%s encap match conflicts with existing pseudo(MASK) entry",
>> +						       encap->type ? "Pseudo" : "Direct");
> 
> But dereferenced here.
> 
...
>> +			NL_SET_ERR_MSG_FMT_MOD(extack,
>> +					       "%s encap match conflicts with existing pseudo(%d) entry",
>> +					       encap->type ? "Pseudo" : "Direct",
> 
> And here.

Good catch, thanks.  We should use em_type instead.
Dave already applied so I'll send a follow-up fix shortly.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-05-12 14:27 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-05-11 19:47 [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules edward.cree
2023-05-11 19:47 ` [PATCH v2 net-next 1/4] sfc: release encap match in efx_tc_flow_free() edward.cree
2023-05-11 19:47 ` [PATCH v2 net-next 2/4] sfc: populate enc_ip_tos matches in MAE outer rules edward.cree
2023-05-11 19:47 ` [PATCH v2 net-next 3/4] sfc: support TC decap rules matching on enc_ip_tos edward.cree
2023-05-12  9:51   ` Simon Horman
2023-05-12 14:27     ` Edward Cree
2023-05-11 19:47 ` [PATCH v2 net-next 4/4] sfc: support TC decap rules matching on enc_src_port edward.cree
2023-05-12 11:50 ` [PATCH v2 net-next 0/4] sfc: more flexible encap matches on TC decap rules patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).