netdev.vger.kernel.org archive mirror
* [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting.
@ 2024-06-03 18:56 Adrian Moreno
  2024-06-03 18:56 ` [PATCH net-next v2 1/9] net: psample: add user cookie Adrian Moreno
                   ` (8 more replies)
  0 siblings, 9 replies; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev; +Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno

** Background **
Currently, OVS supports several packet sampling mechanisms (sFlow,
per-bridge IPFIX, per-flow IPFIX). These end up being translated into a
userspace action that needs to be handled by ovs-vswitchd's handler
threads only to be forwarded to some third party application that
will somehow process the sample and provide observability on the
datapath.

A particularly interesting use-case is controller-driven
per-flow IPFIX sampling where the OpenFlow controller can add metadata
to samples (via two 32bit integers) and this metadata is then available
to the sample-collecting system for correlation.

** Problem **
Sampled traffic shares netlink sockets and handler thread time with
upcalls. Apart from being a performance bottleneck in the sample
extraction itself, this can severely compromise the datapath, making
this solution unfit for highly loaded production systems.

Users are left with few options other than guessing what sampling
rate will be OK for their traffic pattern and system load and dealing
with the lost accuracy.

Looking at available infrastructure, an obvious candidate would be
to use psample. However, its current state does not help with the
use-case at stake because sampled packets do not contain user-defined
metadata.

** Proposal **
This series is an attempt to fix this situation by extending the
existing psample infrastructure to carry a variable length
user-defined cookie.

The main existing user of psample is tc's act_sample. It is also
extended to forward the action's cookie to psample.

Finally, a new OVS action (OVS_ACTION_ATTR_EMIT_SAMPLE) is created.
It accepts a group and an optional cookie and uses psample to
multicast the packet and the metadata.

--
v1 -> v2:
- Create a new action ("emit_sample") rather than reuse existing
  "sample" one.
- Add probability semantics to psample's sampling rate.
- Store sampling probability in skb's cb area and use it in emit_sample.
- Test combining "emit_sample" with "trunc"
- Drop group_id filtering and tracepoint in psample.

rfc_v2 -> v1:
- Accommodate Ilya's comments.
- Split OVS's attribute in two attributes and simplify internal
  handling of psample arguments.
- Extend psample and tc with a user-defined cookie.
- Add a tracepoint to psample to facilitate troubleshooting.

rfc_v1 -> rfc_v2:
- Use psample instead of a new OVS-only multicast group.
- Extend psample and tc with a user-defined cookie.

Adrian Moreno (9):
  net: psample: add user cookie
  net: sched: act_sample: add action cookie to sample
  net: psample: skip packet copy if no listeners
  net: psample: allow using rate as probability
  net: openvswitch: add emit_sample action
  net: openvswitch: store sampling probability in cb.
  net: openvswitch: do not notify drops inside sample
  selftests: openvswitch: add emit_sample action
  selftests: openvswitch: add emit_sample test

 Documentation/netlink/specs/ovs_flow.yaml     |  17 ++
 include/net/psample.h                         |   5 +-
 include/uapi/linux/openvswitch.h              |  28 +-
 include/uapi/linux/psample.h                  |   5 +
 include/uapi/linux/tc_act/tc_sample.h         |   1 +
 net/openvswitch/actions.c                     |  86 +++++-
 net/openvswitch/datapath.h                    |   3 +
 net/openvswitch/flow_netlink.c                |  33 ++-
 net/openvswitch/vport.c                       |   1 +
 net/psample/psample.c                         |  16 +-
 net/sched/act_sample.c                        |  12 +
 .../selftests/net/openvswitch/openvswitch.sh  |  99 ++++++-
 .../selftests/net/openvswitch/ovs-dpctl.py    | 274 +++++++++++++++++-
 13 files changed, 564 insertions(+), 16 deletions(-)

-- 
2.45.1



* [PATCH net-next v2 1/9] net: psample: add user cookie
  2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
@ 2024-06-03 18:56 ` Adrian Moreno
  2024-06-14 16:13   ` Simon Horman
  2024-06-03 18:56 ` [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample Adrian Moreno
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev
  Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno,
	Yotam Gigi, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Add a user cookie to the sample metadata so that sample emitters can
provide more contextual information to samples.

If present, send the user cookie in a new attribute:
PSAMPLE_ATTR_USER_COOKIE.
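
The room reserved for the new attribute can be modelled in userspace as
follows. This is only a sketch: it assumes the usual netlink layout (a
4-byte attribute header and 4-byte alignment), and user_cookie_room()
is a hypothetical helper name, not a kernel function.

```python
NLA_HDRLEN = 4   # assumed: struct nlattr is a u16 length plus a u16 type
NLA_ALIGNTO = 4  # assumed: netlink attributes are padded to 4 bytes


def nla_total_size(payload_len: int) -> int:
    """Bytes taken by an attribute with this payload, padding included."""
    unaligned = NLA_HDRLEN + payload_len
    return (unaligned + NLA_ALIGNTO - 1) & ~(NLA_ALIGNTO - 1)


def user_cookie_room(cookie: bytes) -> int:
    """Hypothetical helper: extra message room for the cookie attribute.

    Mirrors the conditional in the patch: nothing is reserved when the
    cookie is empty.
    """
    return nla_total_size(len(cookie)) if cookie else 0
```

Under these assumptions a 16-byte cookie costs 20 bytes in the message
and a 1-byte cookie costs 8 because of padding.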

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 include/net/psample.h        | 2 ++
 include/uapi/linux/psample.h | 1 +
 net/psample/psample.c        | 9 ++++++++-
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/net/psample.h b/include/net/psample.h
index 0509d2d6be67..2ac71260a546 100644
--- a/include/net/psample.h
+++ b/include/net/psample.h
@@ -25,6 +25,8 @@ struct psample_metadata {
 	   out_tc_occ_valid:1,
 	   latency_valid:1,
 	   unused:5;
+	const u8 *user_cookie;
+	u32 user_cookie_len;
 };
 
 struct psample_group *psample_group_get(struct net *net, u32 group_num);
diff --git a/include/uapi/linux/psample.h b/include/uapi/linux/psample.h
index e585db5bf2d2..e80637e1d97b 100644
--- a/include/uapi/linux/psample.h
+++ b/include/uapi/linux/psample.h
@@ -19,6 +19,7 @@ enum {
 	PSAMPLE_ATTR_LATENCY,		/* u64, nanoseconds */
 	PSAMPLE_ATTR_TIMESTAMP,		/* u64, nanoseconds */
 	PSAMPLE_ATTR_PROTO,		/* u16 */
+	PSAMPLE_ATTR_USER_COOKIE,	/* binary, user provided data */
 
 	__PSAMPLE_ATTR_MAX
 };
diff --git a/net/psample/psample.c b/net/psample/psample.c
index a5d9b8446f77..b37488f426bc 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -386,7 +386,9 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 		   nla_total_size(sizeof(u32)) +	/* group_num */
 		   nla_total_size(sizeof(u32)) +	/* seq */
 		   nla_total_size_64bit(sizeof(u64)) +	/* timestamp */
-		   nla_total_size(sizeof(u16));		/* protocol */
+		   nla_total_size(sizeof(u16)) +	/* protocol */
+		   (md->user_cookie_len ?
+		    nla_total_size(md->user_cookie_len) : 0); /* user cookie */
 
 #ifdef CONFIG_INET
 	tun_info = skb_tunnel_info(skb);
@@ -486,6 +488,11 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 	}
 #endif
 
+	if (md->user_cookie && md->user_cookie_len &&
+	    nla_put(nl_skb, PSAMPLE_ATTR_USER_COOKIE, md->user_cookie_len,
+		    md->user_cookie))
+		goto error;
+
 	genlmsg_end(nl_skb, data);
 	genlmsg_multicast_netns(&psample_nl_family, group->net, nl_skb, 0,
 				PSAMPLE_NL_MCGRP_SAMPLE, GFP_ATOMIC);
-- 
2.45.1



* [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample
  2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
  2024-06-03 18:56 ` [PATCH net-next v2 1/9] net: psample: add user cookie Adrian Moreno
@ 2024-06-03 18:56 ` Adrian Moreno
  2024-06-14 16:14   ` Simon Horman
  2024-06-17 10:00   ` Ilya Maximets
  2024-06-03 18:56 ` [PATCH net-next v2 3/9] net: psample: skip packet copy if no listeners Adrian Moreno
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev
  Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-kernel

If the action has a user_cookie, pass it along to the sample so it can
be easily identified.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 net/sched/act_sample.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
index a69b53d54039..5c3f86ec964a 100644
--- a/net/sched/act_sample.c
+++ b/net/sched/act_sample.c
@@ -165,9 +165,11 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
 				     const struct tc_action *a,
 				     struct tcf_result *res)
 {
+	u8 cookie_data[TC_COOKIE_MAX_SIZE] = {};
 	struct tcf_sample *s = to_sample(a);
 	struct psample_group *psample_group;
 	struct psample_metadata md = {};
+	struct tc_cookie *user_cookie;
 	int retval;
 
 	tcf_lastuse_update(&s->tcf_tm);
@@ -189,6 +191,16 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
 		if (skb_at_tc_ingress(skb) && tcf_sample_dev_ok_push(skb->dev))
 			skb_push(skb, skb->mac_len);
 
+		rcu_read_lock();
+		user_cookie = rcu_dereference(a->user_cookie);
+		if (user_cookie) {
+			memcpy(cookie_data, user_cookie->data,
+			       user_cookie->len);
+			md.user_cookie = cookie_data;
+			md.user_cookie_len = user_cookie->len;
+		}
+		rcu_read_unlock();
+
 		md.trunc_size = s->truncate ? s->trunc_size : skb->len;
 		psample_sample_packet(psample_group, skb, s->rate, &md);
 
-- 
2.45.1



* [PATCH net-next v2 3/9] net: psample: skip packet copy if no listeners
  2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
  2024-06-03 18:56 ` [PATCH net-next v2 1/9] net: psample: add user cookie Adrian Moreno
  2024-06-03 18:56 ` [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample Adrian Moreno
@ 2024-06-03 18:56 ` Adrian Moreno
  2024-06-14 16:15   ` Simon Horman
  2024-06-03 18:56 ` [PATCH net-next v2 4/9] net: psample: allow using rate as probability Adrian Moreno
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev
  Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno,
	Yotam Gigi, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

If nobody is listening on the multicast group, generating the sample,
which involves copying packet data, seems completely unnecessary.

Return fast in this case.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 net/psample/psample.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/psample/psample.c b/net/psample/psample.c
index b37488f426bc..1c76f3e48dcd 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -376,6 +376,10 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 	void *data;
 	int ret;
 
+	if (!genl_has_listeners(&psample_nl_family, group->net,
+				PSAMPLE_NL_MCGRP_SAMPLE))
+		return;
+
 	meta_len = (in_ifindex ? nla_total_size(sizeof(u16)) : 0) +
 		   (out_ifindex ? nla_total_size(sizeof(u16)) : 0) +
 		   (md->out_tc_valid ? nla_total_size(sizeof(u16)) : 0) +
-- 
2.45.1



* [PATCH net-next v2 4/9] net: psample: allow using rate as probability
  2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
                   ` (2 preceding siblings ...)
  2024-06-03 18:56 ` [PATCH net-next v2 3/9] net: psample: skip packet copy if no listeners Adrian Moreno
@ 2024-06-03 18:56 ` Adrian Moreno
  2024-06-14 16:11   ` Simon Horman
  2024-06-03 18:56 ` [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action Adrian Moreno
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev
  Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno,
	Yotam Gigi, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Jamal Hadi Salim, Cong Wang, Jiri Pirko,
	linux-kernel

Although not explicitly documented in the psample module itself, the
definition of PSAMPLE_ATTR_SAMPLE_RATE seems inherited from act_sample.

Quoting tc-sample(8):
"RATE of 100 will lead to an average of one sampled packet out of every
100 observed."

With these semantics, the rates that we can express with an unsigned
32-bit number are very unevenly distributed and concentrated towards
"sampling few packets".
For example, we can express a probability of 2.32E-8% but we
cannot express anything between 100% (rate 1) and 50% (rate 2).

For sampling applications that are capable of sampling a decent
amount of packets, these sampling rate semantics are not very useful.

Add a new flag to the uAPI that indicates that the sampling rate is
expressed as a scaled probability, that is:
- 0 is 0% probability, no packets get sampled.
- U32_MAX is 100% probability, all packets get sampled.
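
To compare the two semantics, here is a small userspace sketch. The
function names are hypothetical and only illustrate the arithmetic
described above; they are not part of the uAPI.

```python
U32_MAX = 0xFFFFFFFF


def rate_to_fraction(rate: int) -> float:
    """tc-sample(8) semantics: one packet out of every `rate` observed."""
    return 1.0 / rate


def probability_to_fraction(scaled: int) -> float:
    """Scaled semantics: 0 is 0% and U32_MAX is 100%."""
    return scaled / U32_MAX


def fraction_to_probability(fraction: float) -> int:
    """Scale a fraction in [0, 1] to the 0..U32_MAX range."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be within [0, 1]")
    return round(fraction * U32_MAX)
```

With the scaled form, a value such as 75% becomes expressible
(fraction_to_probability(0.75)), whereas the "one in N" form jumps
straight from 100% at rate 1 to 50% at rate 2.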

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 include/net/psample.h                 | 3 ++-
 include/uapi/linux/psample.h          | 4 ++++
 include/uapi/linux/tc_act/tc_sample.h | 1 +
 net/psample/psample.c                 | 3 +++
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/include/net/psample.h b/include/net/psample.h
index 2ac71260a546..c52e9ebd88dd 100644
--- a/include/net/psample.h
+++ b/include/net/psample.h
@@ -24,7 +24,8 @@ struct psample_metadata {
 	u8 out_tc_valid:1,
 	   out_tc_occ_valid:1,
 	   latency_valid:1,
-	   unused:5;
+	   rate_as_probability:1,
+	   unused:4;
 	const u8 *user_cookie;
 	u32 user_cookie_len;
 };
diff --git a/include/uapi/linux/psample.h b/include/uapi/linux/psample.h
index e80637e1d97b..8b069e75beab 100644
--- a/include/uapi/linux/psample.h
+++ b/include/uapi/linux/psample.h
@@ -20,6 +20,10 @@ enum {
 	PSAMPLE_ATTR_TIMESTAMP,		/* u64, nanoseconds */
 	PSAMPLE_ATTR_PROTO,		/* u16 */
 	PSAMPLE_ATTR_USER_COOKIE,	/* binary, user provided data */
+	PSAMPLE_ATTR_SAMPLE_PROBABILITY,/* no argument, interpret rate in
+					 * PSAMPLE_ATTR_SAMPLE_RATE as a
+					 * probability scaled 0 - U32_MAX.
+					 */
 
 	__PSAMPLE_ATTR_MAX
 };
diff --git a/include/uapi/linux/tc_act/tc_sample.h b/include/uapi/linux/tc_act/tc_sample.h
index fee1bcc20793..7ee0735e7b38 100644
--- a/include/uapi/linux/tc_act/tc_sample.h
+++ b/include/uapi/linux/tc_act/tc_sample.h
@@ -18,6 +18,7 @@ enum {
 	TCA_SAMPLE_TRUNC_SIZE,
 	TCA_SAMPLE_PSAMPLE_GROUP,
 	TCA_SAMPLE_PAD,
+	TCA_SAMPLE_PROBABILITY,
 	__TCA_SAMPLE_MAX
 };
 #define TCA_SAMPLE_MAX (__TCA_SAMPLE_MAX - 1)
diff --git a/net/psample/psample.c b/net/psample/psample.c
index 1c76f3e48dcd..f48b5b9cd409 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -497,6 +497,9 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 		    md->user_cookie))
 		goto error;
 
+	if (md->rate_as_probability)
+		nla_put_flag(nl_skb, PSAMPLE_ATTR_SAMPLE_PROBABILITY);
+
 	genlmsg_end(nl_skb, data);
 	genlmsg_multicast_netns(&psample_nl_family, group->net, nl_skb, 0,
 				PSAMPLE_NL_MCGRP_SAMPLE, GFP_ATOMIC);
-- 
2.45.1



* [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
                   ` (3 preceding siblings ...)
  2024-06-03 18:56 ` [PATCH net-next v2 4/9] net: psample: allow using rate as probability Adrian Moreno
@ 2024-06-03 18:56 ` Adrian Moreno
  2024-06-05  0:29   ` kernel test robot
                     ` (4 more replies)
  2024-06-03 18:56 ` [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb Adrian Moreno
                   ` (3 subsequent siblings)
  8 siblings, 5 replies; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev
  Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno,
	Donald Hunter, Jakub Kicinski, David S. Miller, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, linux-kernel

Add support for a new action: emit_sample.

This action accepts a u32 group id and a variable-length cookie and uses
the psample multicast group to make the packet available for
observability.

The maximum length of the user-defined cookie is set to 16, same as
tc_cookie, to discourage using cookies that will not be offloadable.
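
As an illustration of what userspace would place inside the nested
action, the sketch below encodes the two attributes by hand. It assumes
the numeric values implied by the enum order in this patch (group = 1,
cookie = 2) and the standard netlink attribute layout; nlattr() and
emit_sample_attrs() are hypothetical helpers, and the outer
OVS_ACTION_ATTR_EMIT_SAMPLE nest is deliberately left out.

```python
import struct

OVS_EMIT_SAMPLE_ATTR_GROUP = 1        # assumed from the enum order
OVS_EMIT_SAMPLE_ATTR_COOKIE = 2       # assumed from the enum order
OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE = 16  # as defined in this patch


def nlattr(attr_type: int, payload: bytes) -> bytes:
    """Encode one attribute: u16 length, u16 type, payload, zero padding."""
    length = 4 + len(payload)
    padding = -length % 4
    return struct.pack("=HH", length, attr_type) + payload + b"\x00" * padding


def emit_sample_attrs(group: int, cookie: bytes = b"") -> bytes:
    """Build the nested payload: a mandatory group and an optional cookie."""
    if len(cookie) > OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE:
        raise ValueError("cookie longer than 16 bytes")
    body = nlattr(OVS_EMIT_SAMPLE_ATTR_GROUP, struct.pack("=I", group))
    if cookie:
        body += nlattr(OVS_EMIT_SAMPLE_ATTR_COOKIE, cookie)
    return body
```

The length check mirrors the policy enforced at flow installation: the
group is required and a cookie above 16 bytes is rejected.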

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 Documentation/netlink/specs/ovs_flow.yaml | 17 ++++++++
 include/uapi/linux/openvswitch.h          | 25 ++++++++++++
 net/openvswitch/actions.c                 | 50 +++++++++++++++++++++++
 net/openvswitch/flow_netlink.c            | 33 ++++++++++++++-
 4 files changed, 124 insertions(+), 1 deletion(-)

diff --git a/Documentation/netlink/specs/ovs_flow.yaml b/Documentation/netlink/specs/ovs_flow.yaml
index 4fdfc6b5cae9..a7ab5593a24f 100644
--- a/Documentation/netlink/specs/ovs_flow.yaml
+++ b/Documentation/netlink/specs/ovs_flow.yaml
@@ -727,6 +727,12 @@ attribute-sets:
         name: dec-ttl
         type: nest
         nested-attributes: dec-ttl-attrs
+      -
+        name: emit-sample
+        type: nest
+        nested-attributes: emit-sample-attrs
+        doc: |
+          Sends a packet sample to psample for external observation.
   -
     name: tunnel-key-attrs
     enum-name: ovs-tunnel-key-attr
@@ -938,6 +944,17 @@ attribute-sets:
       -
         name: gbp
         type: u32
+  -
+    name: emit-sample-attrs
+    enum-name: ovs-emit-sample-attr
+    name-prefix: ovs-emit-sample-attr-
+    attributes:
+      -
+        name: group
+        type: u32
+      -
+        name: cookie
+        type: binary
 
 operations:
   name-prefix: ovs-flow-cmd-
diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index efc82c318fa2..a0e9dde0584a 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -914,6 +914,30 @@ struct check_pkt_len_arg {
 };
 #endif
 
+#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
+/**
+ * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
+ * action.
+ *
+ * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
+ * sample.
+ * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
+ * user-defined metadata. The maximum length is 16 bytes.
+ *
+ * Sends the packet to the psample multicast group with the specified group and
+ * cookie. It is possible to combine this action with the
+ * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
+ */
+enum ovs_emit_sample_attr {
+	OVS_EMIT_SAMPLE_ATTR_UNSPEC,
+	OVS_EMIT_SAMPLE_ATTR_GROUP,	/* u32 number. */
+	OVS_EMIT_SAMPLE_ATTR_COOKIE,	/* Optional, user specified cookie. */
+	__OVS_EMIT_SAMPLE_ATTR_MAX
+};
+
+#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
+
+
 /**
  * enum ovs_action_attr - Action types.
  *
@@ -1004,6 +1028,7 @@ enum ovs_action_attr {
 	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
 	OVS_ACTION_ATTR_DEC_TTL,      /* Nested OVS_DEC_TTL_ATTR_*. */
 	OVS_ACTION_ATTR_DROP,         /* u32 error code. */
+	OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
 
 	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
 				       * from userspace. */
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 964225580824..3b4dba0ded59 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -24,6 +24,11 @@
 #include <net/checksum.h>
 #include <net/dsfield.h>
 #include <net/mpls.h>
+
+#if IS_ENABLED(CONFIG_PSAMPLE)
+#include <net/psample.h>
+#endif
+
 #include <net/sctp/checksum.h>
 
 #include "datapath.h"
@@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct sw_flow_key *key)
 	return 0;
 }
 
+static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
+			       const struct sw_flow_key *key,
+			       const struct nlattr *attr)
+{
+#if IS_ENABLED(CONFIG_PSAMPLE)
+	struct psample_group psample_group = {};
+	struct psample_metadata md = {};
+	struct vport *input_vport;
+	const struct nlattr *a;
+	int rem;
+
+	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
+	     a = nla_next(a, &rem)) {
+		switch (nla_type(a)) {
+		case OVS_EMIT_SAMPLE_ATTR_GROUP:
+			psample_group.group_num = nla_get_u32(a);
+			break;
+
+		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
+			md.user_cookie = nla_data(a);
+			md.user_cookie_len = nla_len(a);
+			break;
+		}
+	}
+
+	psample_group.net = ovs_dp_get_net(dp);
+
+	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
+	if (!input_vport)
+		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
+
+	md.in_ifindex = input_vport->dev->ifindex;
+	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
+
+	psample_sample_packet(&psample_group, skb, 0, &md);
+#endif
+
+	return 0;
+}
+
 /* Execute a list of actions against 'skb'. */
 static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 			      struct sw_flow_key *key,
@@ -1502,6 +1547,11 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 			ovs_kfree_skb_reason(skb, reason);
 			return 0;
 		}
+
+		case OVS_ACTION_ATTR_EMIT_SAMPLE:
+			err = execute_emit_sample(dp, skb, key, a);
+			OVS_CB(skb)->cutlen = 0;
+			break;
 		}
 
 		if (unlikely(err)) {
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index f224d9bcea5e..eb59ff9c8154 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -64,6 +64,7 @@ static bool actions_may_change_flow(const struct nlattr *actions)
 		case OVS_ACTION_ATTR_TRUNC:
 		case OVS_ACTION_ATTR_USERSPACE:
 		case OVS_ACTION_ATTR_DROP:
+		case OVS_ACTION_ATTR_EMIT_SAMPLE:
 			break;
 
 		case OVS_ACTION_ATTR_CT:
@@ -2409,7 +2410,7 @@ static void ovs_nla_free_nested_actions(const struct nlattr *actions, int len)
 	/* Whenever new actions are added, the need to update this
 	 * function should be considered.
 	 */
-	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 24);
+	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 25);
 
 	if (!actions)
 		return;
@@ -3157,6 +3158,29 @@ static int validate_and_copy_check_pkt_len(struct net *net,
 	return 0;
 }
 
+static int validate_emit_sample(const struct nlattr *attr)
+{
+	static const struct nla_policy policy[OVS_EMIT_SAMPLE_ATTR_MAX + 1] = {
+		[OVS_EMIT_SAMPLE_ATTR_GROUP] = { .type = NLA_U32 },
+		[OVS_EMIT_SAMPLE_ATTR_COOKIE] = {
+			.type = NLA_BINARY,
+			.len = OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE
+		},
+	};
+	struct nlattr *a[OVS_EMIT_SAMPLE_ATTR_MAX  + 1];
+	int err;
+
+	if (!IS_ENABLED(CONFIG_PSAMPLE))
+		return -EOPNOTSUPP;
+
+	err = nla_parse_nested(a, OVS_EMIT_SAMPLE_ATTR_MAX, attr, policy,
+			       NULL);
+	if (err)
+		return err;
+
+	return a[OVS_EMIT_SAMPLE_ATTR_GROUP] ? 0 : -EINVAL;
+}
+
 static int copy_action(const struct nlattr *from,
 		       struct sw_flow_actions **sfa, bool log)
 {
@@ -3212,6 +3236,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			[OVS_ACTION_ATTR_ADD_MPLS] = sizeof(struct ovs_action_add_mpls),
 			[OVS_ACTION_ATTR_DEC_TTL] = (u32)-1,
 			[OVS_ACTION_ATTR_DROP] = sizeof(u32),
+			[OVS_ACTION_ATTR_EMIT_SAMPLE] = (u32)-1,
 		};
 		const struct ovs_action_push_vlan *vlan;
 		int type = nla_type(a);
@@ -3490,6 +3515,12 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 				return -EINVAL;
 			break;
 
+		case OVS_ACTION_ATTR_EMIT_SAMPLE:
+			err = validate_emit_sample(a);
+			if (err)
+				return err;
+			break;
+
 		default:
 			OVS_NLERR(log, "Unknown Action type %d", type);
 			return -EINVAL;
-- 
2.45.1



* [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
  2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
                   ` (4 preceding siblings ...)
  2024-06-03 18:56 ` [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action Adrian Moreno
@ 2024-06-03 18:56 ` Adrian Moreno
  2024-06-04  6:09   ` kernel test robot
                     ` (2 more replies)
  2024-06-03 18:56 ` [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample Adrian Moreno
                   ` (2 subsequent siblings)
  8 siblings, 3 replies; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev
  Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno,
	Pravin B Shelar, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

The behavior of actions might not be exactly the same if they are being
executed inside a nested sample action. Store the probability of the
parent sample action in the skb's cb area.

Use the probability in emit_sample to pass it down to psample.
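
The arithmetic for nested sample() actions can be modelled in userspace
as below. This is a sketch of the logic in this patch, not kernel code;
enter_sample() and emitted_rate() are hypothetical names.

```python
U32_MAX = 0xFFFFFFFF


def enter_sample(current: int, arg_probability: int) -> int:
    """Probability stored in the cb while one sample() runs its actions.

    `current` is the value inherited from an enclosing sample(), where 0
    means that no sampling has occurred yet.
    """
    if current:
        # Same as the 64-bit multiply followed by a divide in the patch.
        return current * arg_probability // U32_MAX
    return arg_probability


def emitted_rate(cb_probability: int) -> int:
    """Rate handed to psample: 100% when emit_sample is not nested."""
    return cb_probability if cb_probability else U32_MAX
```

For example, two nested samples of roughly 50% each
(U32_MAX // 2) compose to roughly 25%, and an emit_sample outside any
sample() reports U32_MAX.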

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 include/uapi/linux/openvswitch.h |  3 ++-
 net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
 net/openvswitch/datapath.h       |  3 +++
 net/openvswitch/vport.c          |  1 +
 4 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index a0e9dde0584a..9d675725fa2b 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -649,7 +649,8 @@ enum ovs_flow_attr {
  * Actions are passed as nested attributes.
  *
  * Executes the specified actions with the given probability on a per-packet
- * basis.
+ * basis. Nested actions will be able to access the probability value of the
+ * parent @OVS_ACTION_ATTR_SAMPLE.
  */
 enum ovs_sample_attr {
 	OVS_SAMPLE_ATTR_UNSPEC,
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 3b4dba0ded59..33f6d93ba5e4 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
 	struct nlattr *sample_arg;
 	int rem = nla_len(attr);
 	const struct sample_arg *arg;
+	u32 init_probability;
 	bool clone_flow_key;
+	int err;
 
 	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
 	sample_arg = nla_data(attr);
 	arg = nla_data(sample_arg);
 	actions = nla_next(sample_arg, &rem);
+	init_probability = OVS_CB(skb)->probability;
 
 	if ((arg->probability != U32_MAX) &&
 	    (!arg->probability || get_random_u32() > arg->probability)) {
@@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
 		return 0;
 	}
 
+	if (init_probability) {
+		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
+					    arg->probability / U32_MAX);
+	} else {
+		OVS_CB(skb)->probability = arg->probability;
+	}
+
 	clone_flow_key = !arg->exec;
-	return clone_execute(dp, skb, key, 0, actions, rem, last,
-			     clone_flow_key);
+	err = clone_execute(dp, skb, key, 0, actions, rem, last,
+			    clone_flow_key);
+
+	if (!last)
+		OVS_CB(skb)->probability = init_probability;
+
+	return err;
 }
 
 /* When 'last' is true, clone() should always consume the 'skb'.
@@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
 	struct psample_metadata md = {};
 	struct vport *input_vport;
 	const struct nlattr *a;
+	u32 rate;
 	int rem;
 
 	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
@@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
 
 	md.in_ifindex = input_vport->dev->ifindex;
 	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
+	md.rate_as_probability = 1;
+
+	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
 
-	psample_sample_packet(&psample_group, skb, 0, &md);
+	psample_sample_packet(&psample_group, skb, rate, &md);
 #endif
 
 	return 0;
diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
index 0cd29971a907..9ca6231ea647 100644
--- a/net/openvswitch/datapath.h
+++ b/net/openvswitch/datapath.h
@@ -115,12 +115,15 @@ struct datapath {
  * fragmented.
  * @acts_origlen: The netlink size of the flow actions applied to this skb.
  * @cutlen: The number of bytes from the packet end to be removed.
+ * @probability: The sampling probability that was applied to this skb; 0 means
+ * no sampling has occurred; U32_MAX means 100% probability.
  */
 struct ovs_skb_cb {
 	struct vport		*input_vport;
 	u16			mru;
 	u16			acts_origlen;
 	u32			cutlen;
+	u32			probability;
 };
 #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
 
diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
index 972ae01a70f7..8732f6e51ae5 100644
--- a/net/openvswitch/vport.c
+++ b/net/openvswitch/vport.c
@@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
 	OVS_CB(skb)->input_vport = vport;
 	OVS_CB(skb)->mru = 0;
 	OVS_CB(skb)->cutlen = 0;
+	OVS_CB(skb)->probability = 0;
 	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
 		u32 mark;
 
-- 
2.45.1



* [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
                   ` (5 preceding siblings ...)
  2024-06-03 18:56 ` [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb Adrian Moreno
@ 2024-06-03 18:56 ` Adrian Moreno
  2024-06-14 16:17   ` Simon Horman
  2024-06-17 11:55   ` Ilya Maximets
  2024-06-03 18:56 ` [PATCH net-next v2 8/9] selftests: openvswitch: add emit_sample action Adrian Moreno
  2024-06-03 18:56 ` [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test Adrian Moreno
  8 siblings, 2 replies; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev
  Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno,
	Pravin B Shelar, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

The OVS_ACTION_ATTR_SAMPLE action is, in essence,
observability-oriented.

Apart from some corner cases in which it's used as a replacement for
clone() on old kernels, it's really only used for sFlow, IPFIX and now,
local emit_sample.

With this in mind, it doesn't make much sense to report
OVS_DROP_LAST_ACTION inside sample actions.

For instance, if the flow:

  actions:sample(..,emit_sample(..)),2

triggers an OVS_DROP_LAST_ACTION skb drop event, it would be extremely
confusing for users since the packet did reach its destination.

This patch makes internal action execution silently consume the skb
instead of notifying a drop for this case.

Unfortunately, this patch does not remove all potential sources of
confusion since, if the sample action itself is the last action, e.g.:

    actions:sample(..,emit_sample(..))

we actually _should_ generate an OVS_DROP_LAST_ACTION event, but we don't.

Sadly, this case is difficult to solve without breaking the
optimization by which the skb is not cloned on last sample actions.
But, given explicit drop actions are now supported, OVS can just add one
after the last sample() and rewrite the flow as:

    actions:sample(..,emit_sample(..)),drop
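
The decision made by the new helper can be summarised with a small
userspace model. It is only a sketch: last_action_outcome() is a
hypothetical name and the returned strings merely label which kernel
path would be taken.

```python
def last_action_outcome(cb_probability: int) -> str:
    """Model of the helper: what happens to the skb after the last action.

    A non-zero cb probability means the skb is being processed inside a
    sample(), so it is consumed without a drop notification.
    """
    if cb_probability:
        return "consume_skb"
    return "OVS_DROP_LAST_ACTION"
```

In this model any skb that ran through a sample() is consumed silently,
including the case where sample() is itself the last action, which is
why the explicit trailing drop is suggested above.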

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 net/openvswitch/actions.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 33f6d93ba5e4..54fc1abcff95 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
 static struct action_flow_keys __percpu *flow_keys;
 static DEFINE_PER_CPU(int, exec_actions_level);
 
+static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
+{
+	/* Do not emit packet drops inside sample(). */
+	if (OVS_CB(skb)->probability)
+		consume_skb(skb);
+	else
+		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
+}
+
 /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
  * space. Return NULL if out of key spaces.
  */
@@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
 	if ((arg->probability != U32_MAX) &&
 	    (!arg->probability || get_random_u32() > arg->probability)) {
 		if (last)
-			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
+			ovs_drop_skb_last_action(skb);
 		return 0;
 	}
 
@@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 		}
 	}
 
-	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
+	ovs_drop_skb_last_action(skb);
 	return 0;
 }
 
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH net-next v2 8/9] selftests: openvswitch: add emit_sample action
  2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
                   ` (6 preceding siblings ...)
  2024-06-03 18:56 ` [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample Adrian Moreno
@ 2024-06-03 18:56 ` Adrian Moreno
  2024-06-03 18:56 ` [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test Adrian Moreno
  8 siblings, 0 replies; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev
  Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno,
	Pravin B Shelar, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Shuah Khan, linux-kselftest, linux-kernel

Add sample and emit_sample action support to ovs-dpctl.py.

Refactor common attribute parsing logic into an external function.
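
To show the idea behind the descriptor tuples, here is a minimal,
self-contained sketch. parse_simple_attrs() is a hypothetical, much
simplified stand-in for parse_attrs(): it only handles flat "key=value"
pairs and does no parenthesis handling. The descriptor layout
(name, attr_name, parse_func) and the percent rounding follow the patch
below:

```python
import math

UINT32_MAX = 0xFFFFFFFF


def percent_to_rate(percent):
    # Same rounding as in the patch: "100%" maps to UINT32_MAX.
    value = float(percent.strip("%"))
    return int(math.floor(UINT32_MAX * (value / 100.0) + 0.5))


def parse_simple_attrs(argstr, attr_desc):
    """Hypothetical simplification of parse_attrs().

    Handles only comma-separated 'key=value' pairs, without nesting.
    Returns a list of [attr_name, parsed_value] pairs.
    """
    parsers = {key: (attr, func) for key, attr, func in attr_desc}
    attrs = []
    for item in filter(None, (s.strip() for s in argstr.split(","))):
        key, sep, value = item.partition("=")
        if not sep or key not in parsers:
            raise ValueError("Unknown attribute: '%s'" % item)
        # Each attribute may appear only once, so drop its descriptor.
        attr, func = parsers.pop(key)
        attrs.append([attr, func(value)])
    return attrs


emit_sample_desc = (
    ("group", "OVS_EMIT_SAMPLE_ATTR_GROUP", int),
    ("cookie", "OVS_EMIT_SAMPLE_ATTR_COOKIE",
        lambda x: list(bytearray.fromhex(x))),
)

print(parse_simple_attrs("group=1,cookie=c0ffee", emit_sample_desc))
print(percent_to_rate("100%"))
```

The real parse_attrs() additionally supports "key(...)" attributes whose
parse function consumes nested action strings, which this sketch leaves
out on purpose.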

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 .../selftests/net/openvswitch/ovs-dpctl.py    | 162 +++++++++++++++++-
 1 file changed, 161 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index a2395c3f37a1..f8b5362aac8c 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -8,6 +8,7 @@ import argparse
 import errno
 import ipaddress
 import logging
+import math
 import multiprocessing
 import re
 import struct
@@ -58,6 +59,7 @@ OVS_FLOW_CMD_DEL = 2
 OVS_FLOW_CMD_GET = 3
 OVS_FLOW_CMD_SET = 4
 
+UINT32_MAX = 0xFFFFFFFF
 
 def macstr(mac):
     outstr = ":".join(["%02X" % i for i in mac])
@@ -267,6 +269,75 @@ def parse_extract_field(
     return str_skipped, data
 
 
+def parse_attrs(actstr, attr_desc):
+    """Parses the given action string and returns a list of netlink
+    attributes based on a list of attribute descriptions.
+
+    Each element in the attribute description list is a tuple such as:
+        (name, attr_name, parse_func)
+    where:
+        name: is the string representing the attribute
+        attr_name: is the name of the attribute as defined in the uAPI.
+        parse_func: is a callable accepting a string and returning either
+            a single object (the parsed attribute value) or a tuple of
+            two values (the parsed attribute value and the remaining string)
+
+    Returns a list of attributes and the remaining string.
+    """
+    def parse_attr(actstr, key, func):
+        actstr = actstr[len(key) :]
+
+        if not func:
+            return None, actstr
+
+        delim = actstr[0]
+        actstr = actstr[1:]
+
+        if delim == "=":
+            pos = strcspn(actstr, ",)")
+            ret = func(actstr[:pos])
+        else:
+            ret = func(actstr)
+
+        if isinstance(ret, tuple):
+            (datum, actstr) = ret
+        else:
+            datum = ret
+            actstr = actstr[strcspn(actstr, ",)"):]
+
+        if delim == "(":
+            if not actstr or actstr[0] != ")":
+                raise ValueError("Action contains unbalanced parentheses")
+
+            actstr = actstr[1:]
+
+        actstr = actstr[strspn(actstr, ", ") :]
+
+        return datum, actstr
+
+    attrs = []
+    attr_desc = list(attr_desc)
+    while actstr and actstr[0] != ")" and attr_desc:
+        found = False
+        for i, (key, attr, func) in enumerate(attr_desc):
+            if actstr.startswith(key):
+                datum, actstr = parse_attr(actstr, key, func)
+                attrs.append([attr, datum])
+                found = True
+                del attr_desc[i]
+
+        if not found:
+            raise ValueError("Unknown attribute: '%s'" % actstr)
+
+        actstr = actstr[strspn(actstr, ", ") :]
+
+    if actstr[0] != ")":
+        raise ValueError("Action string contains extra garbage or has "
+                         "unbalanced parenthesis: '%s'" % actstr)
+
+    return attrs, actstr[1:]
+
+
 class ovs_dp_msg(genlmsg):
     # include the OVS version
     # We need a custom header rather than just being able to rely on
@@ -285,7 +356,7 @@ class ovsactions(nla):
         ("OVS_ACTION_ATTR_SET", "none"),
         ("OVS_ACTION_ATTR_PUSH_VLAN", "none"),
         ("OVS_ACTION_ATTR_POP_VLAN", "flag"),
-        ("OVS_ACTION_ATTR_SAMPLE", "none"),
+        ("OVS_ACTION_ATTR_SAMPLE", "sample"),
         ("OVS_ACTION_ATTR_RECIRC", "uint32"),
         ("OVS_ACTION_ATTR_HASH", "none"),
         ("OVS_ACTION_ATTR_PUSH_MPLS", "none"),
@@ -304,8 +375,85 @@ class ovsactions(nla):
         ("OVS_ACTION_ATTR_ADD_MPLS", "none"),
         ("OVS_ACTION_ATTR_DEC_TTL", "none"),
         ("OVS_ACTION_ATTR_DROP", "uint32"),
+        ("OVS_ACTION_ATTR_EMIT_SAMPLE", "emit_sample"),
     )
 
+    class emit_sample(nla):
+        nla_flags = NLA_F_NESTED
+
+        nla_map = (
+            ("OVS_EMIT_SAMPLE_ATTR_UNSPEC", "none"),
+            ("OVS_EMIT_SAMPLE_ATTR_GROUP", "uint32"),
+            ("OVS_EMIT_SAMPLE_ATTR_COOKIE", "array(uint8)"),
+        )
+
+        def dpstr(self, more=False):
+            args = "group=%d" % self.get_attr("OVS_EMIT_SAMPLE_ATTR_GROUP")
+
+            cookie = self.get_attr("OVS_EMIT_SAMPLE_ATTR_COOKIE")
+            if cookie:
+                args += ",cookie(%s)" % \
+                        "".join(format(x, "02x") for x in cookie)
+
+            return "emit_sample(%s)" % args
+
+        def parse(self, actstr):
+            desc = (
+                ("group", "OVS_EMIT_SAMPLE_ATTR_GROUP", int),
+                ("cookie", "OVS_EMIT_SAMPLE_ATTR_COOKIE",
+                    lambda x: list(bytearray.fromhex(x)))
+            )
+
+            attrs, actstr = parse_attrs(actstr, desc)
+
+            for attr in attrs:
+                self["attrs"].append(attr)
+
+            return actstr
+
+    class sample(nla):
+        nla_flags = NLA_F_NESTED
+
+        nla_map = (
+            ("OVS_SAMPLE_ATTR_UNSPEC", "none"),
+            ("OVS_SAMPLE_ATTR_PROBABILITY", "uint32"),
+            ("OVS_SAMPLE_ATTR_ACTIONS", "ovsactions"),
+        )
+
+        def dpstr(self, more=False):
+            args = []
+
+            args.append("sample={:.2f}%".format(
+                100 * self.get_attr("OVS_SAMPLE_ATTR_PROBABILITY") /
+                UINT32_MAX))
+
+            actions = self.get_attr("OVS_SAMPLE_ATTR_ACTIONS")
+            if actions:
+                args.append("actions(%s)" % actions.dpstr(more))
+
+            return "sample(%s)" % ",".join(args)
+
+        def parse(self, actstr):
+            def parse_nested_actions(actstr):
+                subacts = ovsactions()
+                parsed_len = subacts.parse(actstr)
+                return subacts, actstr[parsed_len :]
+
+            def percent_to_rate(percent):
+                percent = float(percent.strip('%'))
+                return int(math.floor(UINT32_MAX * (percent / 100.0) + .5))
+
+            desc = (
+                ("sample", "OVS_SAMPLE_ATTR_PROBABILITY", percent_to_rate),
+                ("actions", "OVS_SAMPLE_ATTR_ACTIONS", parse_nested_actions),
+            )
+            attrs, actstr = parse_attrs(actstr, desc)
+
+            for attr in attrs:
+                self["attrs"].append(attr)
+
+            return actstr
+
     class ctact(nla):
         nla_flags = NLA_F_NESTED
 
@@ -643,6 +791,18 @@ class ovsactions(nla):
                 self["attrs"].append(["OVS_ACTION_ATTR_CT", ctact])
                 parsed = True
 
+            elif parse_starts_block(actstr, "sample(", False):
+                sampleact = self.sample()
+                actstr = sampleact.parse(actstr[len("sample(") : ])
+                self["attrs"].append(["OVS_ACTION_ATTR_SAMPLE", sampleact])
+                parsed = True
+
+            elif parse_starts_block(actstr, "emit_sample(", False):
+                emitact = self.emit_sample()
+                actstr = emitact.parse(actstr[len("emit_sample(") : ])
+                self["attrs"].append(["OVS_ACTION_ATTR_EMIT_SAMPLE", emitact])
+                parsed = True
+
             actstr = actstr[strspn(actstr, ", ") :]
             while parencount > 0:
                 parencount -= 1
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test
  2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
                   ` (7 preceding siblings ...)
  2024-06-03 18:56 ` [PATCH net-next v2 8/9] selftests: openvswitch: add emit_sample action Adrian Moreno
@ 2024-06-03 18:56 ` Adrian Moreno
  2024-06-05 19:43   ` Simon Horman
  2024-06-14 17:07   ` Aaron Conole
  8 siblings, 2 replies; 57+ messages in thread
From: Adrian Moreno @ 2024-06-03 18:56 UTC (permalink / raw)
  To: netdev
  Cc: aconole, echaudro, horms, i.maximets, dev, Adrian Moreno,
	Pravin B Shelar, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Shuah Khan, linux-kselftest, linux-kernel

Add a test to verify sampling packets via psample works.

In order to do that, create a subcommand in ovs-dpctl.py to listen
on the psample multicast group and print samples.

In order to also test simultaneous sFlow and psample actions and
packet truncation, add missing parsing support for "userspace" and
"trunc" actions.
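
As a rough sketch of what the test greps for, the snippet below builds a
sample line in the format expected from the new subcommand and checks
that a frame truncated to 14 bytes yields 28 hex digits. format_sample()
is a hypothetical helper written for this illustration (it is not part
of ovs-dpctl.py), the frame contents are placeholder bytes, and the
field order is taken from the grep pattern used in the test:

```python
import re


def format_sample(group, rate, cookie, data):
    """Hypothetical helper mimicking one line of 'psample' output."""
    fields = [
        "rate:%d" % rate,
        "group:%d" % group,
        "cookie:%s" % bytes(cookie).hex(),
    ]
    line = ",".join(fields)
    if data:
        line += " data:%s" % bytes(data).hex()
    return line


# Placeholder packet contents; trunc(14) keeps only the first 14 bytes.
frame = bytes(range(64))
line = format_sample(1, 0xFFFFFFFF, bytes.fromhex("c0ffee"), frame[:14])

pattern = r"rate:4294967295,group:1,cookie:c0ffee data:[0-9a-f]{28}$"
print(line)
print(bool(re.search(pattern, line)))
```

Each byte is printed as two hex digits, which is why a 14-byte sample is
matched with {28} in the test.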

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 .../selftests/net/openvswitch/openvswitch.sh  |  99 +++++++++++++++-
 .../selftests/net/openvswitch/ovs-dpctl.py    | 112 +++++++++++++++++-
 2 files changed, 204 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh
index 5cae53543849..f6e0ae3f6424 100755
--- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
+++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
@@ -20,7 +20,8 @@ tests="
 	nat_related_v4				ip4-nat-related: ICMP related matches work with SNAT
 	netlink_checks				ovsnl: validate netlink attrs and settings
 	upcall_interfaces			ovs: test the upcall interfaces
-	drop_reason				drop: test drop reasons are emitted"
+	drop_reason				drop: test drop reasons are emitted
+	emit_sample 				emit_sample: Sampling packets with psample"
 
 info() {
     [ $VERBOSE = 0 ] || echo $*
@@ -170,6 +171,19 @@ ovs_drop_reason_count()
 	return `echo "$perf_output" | grep "$pattern" | wc -l`
 }
 
+ovs_test_flow_fails () {
+	ERR_MSG="Flow actions may not be safe on all matching packets"
+
+	PRE_TEST=$(dmesg | grep -c "${ERR_MSG}")
+	ovs_add_flow $@ &> /dev/null && return 1
+	POST_TEST=$(dmesg | grep -c "${ERR_MSG}")
+
+	if [ "$PRE_TEST" == "$POST_TEST" ]; then
+		return 1
+	fi
+	return 0
+}
+
 usage() {
 	echo
 	echo "$0 [OPTIONS] [TEST]..."
@@ -184,6 +198,89 @@ usage() {
 	exit 1
 }
 
+
+# emit_sample test
+# - use emit_sample to observe packets
+test_emit_sample() {
+	sbx_add "test_emit_sample" || return $?
+
+	# Add a datapath with per-vport dispatching.
+	ovs_add_dp "test_emit_sample" emit_sample -V 2:1 || return 1
+
+	info "create namespaces"
+	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
+		client c0 c1 172.31.110.10/24 -u || return 1
+	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
+		server s0 s1 172.31.110.20/24 -u || return 1
+
+	# Check if emit_sample actions can be configured.
+	ovs_add_flow "test_emit_sample" emit_sample \
+	'in_port(1),eth(),eth_type(0x0806),arp()' 'emit_sample(group=1)'
+	if [ $? == 1 ]; then
+		info "no support for emit_sample - skipping"
+		ovs_exit_sig
+		return $ksft_skip
+	fi
+
+	ovs_del_flows "test_emit_sample" emit_sample
+
+	# Allow ARP
+	ovs_add_flow "test_emit_sample" emit_sample \
+		'in_port(1),eth(),eth_type(0x0806),arp()' '2' || return 1
+	ovs_add_flow "test_emit_sample" emit_sample \
+		'in_port(2),eth(),eth_type(0x0806),arp()' '1' || return 1
+
+	# Test action verification.
+	OLDIFS=$IFS
+	IFS='*'
+	min_key='in_port(1),eth(),eth_type(0x0800),ipv4()'
+	for testcase in \
+		"cookie too large"*"emit_sample(group=1,cookie=1615141312111009080706050403020100)" \
+		"no group with cookie"*"emit_sample(cookie=abcd)" \
+		"no group"*"sample()";
+	do
+		set -- $testcase;
+		ovs_test_flow_fails "test_emit_sample" emit_sample $min_key $2
+		if [ $? == 1 ]; then
+			info "failed - $1"
+			return 1
+		fi
+	done
+	IFS=$OLDIFS
+
+	# Sample first 14 bytes of all traffic.
+	ovs_add_flow "test_emit_sample" emit_sample \
+	"in_port(1),eth(),eth_type(0x0800),ipv4(src=172.31.110.10,proto=1),icmp()" "trunc(14),emit_sample(group=1,cookie=c0ffee),2"
+
+	# Sample all traffic. In this case, use a sample() action with both
+	# emit_sample and an upcall emulating simultaneous local sampling and
+	# sFlow / IPFIX.
+	nlpid=$(grep -E "listening on upcall packet handler" $ovs_dir/s0.out | cut -d ":" -f 2 | tr -d ' ')
+	ovs_add_flow "test_emit_sample" emit_sample \
+	"in_port(2),eth(),eth_type(0x0800),ipv4(src=172.31.110.20,proto=1),icmp()" "sample(sample=100%,actions(emit_sample(group=2,cookie=eeff0c),userspace(pid=${nlpid},userdata=eeff0c))),1"
+
+	# Record emit_sample data.
+	python3 $ovs_base/ovs-dpctl.py psample >$ovs_dir/psample.out 2>$ovs_dir/psample.err &
+	pid=$!
+	on_exit "ovs_sbx test_emit_sample kill -TERM $pid 2>/dev/null"
+
+	# Send a single ping.
+	sleep 1
+	ovs_sbx "test_emit_sample" ip netns exec client ping -I c1 172.31.110.20 -c 1 || return 1
+	sleep 1
+
+	# We should have received one userspace action upcall and 2 psample packets.
+	grep -E "userspace action command" $ovs_dir/s0.out >/dev/null 2>&1 || return 1
+
+	# client -> server samples should only contain the first 14 bytes of the packet.
+	grep -E "rate:4294967295,group:1,cookie:c0ffee data:[0-9a-f]{28}$" \
+			 $ovs_dir/psample.out >/dev/null 2>&1 || return 1
+	grep -E "rate:4294967295,group:2,cookie:eeff0c" \
+			 $ovs_dir/psample.out >/dev/null 2>&1 || return 1
+
+	return 0
+}
+
 # drop_reason test
 # - drop packets and verify the right drop reason is reported
 test_drop_reason() {
diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
index f8b5362aac8c..44fdeb9491a2 100644
--- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
+++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
@@ -27,8 +27,10 @@ try:
     from pyroute2.netlink import genlmsg
     from pyroute2.netlink import nla
     from pyroute2.netlink import nlmsg_atoms
-    from pyroute2.netlink.exceptions import NetlinkError
+    from pyroute2.netlink.event import EventSocket
     from pyroute2.netlink.generic import GenericNetlinkSocket
+    from pyroute2.netlink.nlsocket import Marshal
+    from pyroute2.netlink.exceptions import NetlinkError
     import pyroute2
 
 except ModuleNotFoundError:
@@ -575,13 +577,27 @@ class ovsactions(nla):
                 print_str += "userdata="
                 for f in self.get_attr("OVS_USERSPACE_ATTR_USERDATA"):
                     print_str += "%x." % f
-            if self.get_attr("OVS_USERSPACE_ATTR_TUN_PORT") is not None:
+            if self.get_attr("OVS_USERSPACE_ATTR_EGRESS_TUN_PORT") is not None:
                 print_str += "egress_tun_port=%d" % self.get_attr(
-                    "OVS_USERSPACE_ATTR_TUN_PORT"
+                    "OVS_USERSPACE_ATTR_EGRESS_TUN_PORT"
                 )
             print_str += ")"
             return print_str
 
+        def parse(self, actstr):
+            attrs_desc = (
+                ("pid", "OVS_USERSPACE_ATTR_PID", int),
+                ("userdata", "OVS_USERSPACE_ATTR_USERDATA",
+                    lambda x: list(bytearray.fromhex(x))),
+                ("egress_tun_port", "OVS_USERSPACE_ATTR_EGRESS_TUN_PORT", int)
+            )
+
+            attrs, actstr = parse_attrs(actstr, attrs_desc)
+            for attr in attrs:
+                self["attrs"].append(attr)
+
+            return actstr
+
     def dpstr(self, more=False):
         print_str = ""
 
@@ -803,6 +819,25 @@ class ovsactions(nla):
                 self["attrs"].append(["OVS_ACTION_ATTR_EMIT_SAMPLE", emitact])
                 parsed = True
 
+            elif parse_starts_block(actstr, "userspace(", False):
+                uact = self.userspace()
+                actstr = uact.parse(actstr[len("userspace(") : ])
+                self["attrs"].append(["OVS_ACTION_ATTR_USERSPACE", uact])
+                parsed = True
+
+            elif parse_starts_block(actstr, "trunc", False):
+                parencount += 1
+                actstr, val = parse_extract_field(
+                    actstr,
+                    "trunc(",
+                    r"([0-9]+)",
+                    int,
+                    False,
+                    None,
+                )
+                self["attrs"].append(["OVS_ACTION_ATTR_TRUNC", val])
+                parsed = True
+
             actstr = actstr[strspn(actstr, ", ") :]
             while parencount > 0:
                 parencount -= 1
@@ -2184,10 +2219,70 @@ class OvsFlow(GenericNetlinkSocket):
         print("MISS upcall[%d/%s]: %s" % (seq, pktpres, keystr), flush=True)
 
     def execute(self, packetmsg):
-        print("userspace execute command")
+        print("userspace execute command", flush=True)
 
     def action(self, packetmsg):
-        print("userspace action command")
+        print("userspace action command", flush=True)
+
+
+class psample_sample(genlmsg):
+    nla_map = (
+        ("PSAMPLE_ATTR_IIFINDEX", "none"),
+        ("PSAMPLE_ATTR_OIFINDEX", "none"),
+        ("PSAMPLE_ATTR_ORIGSIZE", "none"),
+        ("PSAMPLE_ATTR_SAMPLE_GROUP", "uint32"),
+        ("PSAMPLE_ATTR_GROUP_SEQ", "none"),
+        ("PSAMPLE_ATTR_SAMPLE_RATE", "uint32"),
+        ("PSAMPLE_ATTR_DATA", "array(uint8)"),
+        ("PSAMPLE_ATTR_GROUP_REFCOUNT", "none"),
+        ("PSAMPLE_ATTR_TUNNEL", "none"),
+        ("PSAMPLE_ATTR_PAD", "none"),
+        ("PSAMPLE_ATTR_OUT_TC", "none"),
+        ("PSAMPLE_ATTR_OUT_TC_OCC", "none"),
+        ("PSAMPLE_ATTR_LATENCY", "none"),
+        ("PSAMPLE_ATTR_TIMESTAMP", "none"),
+        ("PSAMPLE_ATTR_PROTO", "none"),
+        ("PSAMPLE_ATTR_USER_COOKIE", "array(uint8)"),
+    )
+
+    def dpstr(self):
+        fields = []
+        data = ""
+        for (attr, value) in self["attrs"]:
+            if attr == "PSAMPLE_ATTR_SAMPLE_GROUP":
+                fields.append("group:%d" % value)
+            if attr == "PSAMPLE_ATTR_SAMPLE_RATE":
+                fields.append("rate:%d" % value)
+            if attr == "PSAMPLE_ATTR_USER_COOKIE":
+                value = "".join(format(x, "02x") for x in value)
+                fields.append("cookie:%s" % value)
+            if attr == "PSAMPLE_ATTR_DATA" and len(value) > 0:
+                data = "data:%s" % "".join(format(x, "02x") for x in value)
+
+        return ("%s %s" % (",".join(fields), data)).strip()
+
+
+class psample_msg(Marshal):
+    PSAMPLE_CMD_SAMPLE = 0
+    PSAMPLE_CMD_GET_GROUP = 1
+    PSAMPLE_CMD_NEW_GROUP = 2
+    PSAMPLE_CMD_DEL_GROUP = 3
+    PSAMPLE_CMD_SET_FILTER = 4
+    msg_map = {PSAMPLE_CMD_SAMPLE: psample_sample}
+
+
+class Psample(EventSocket):
+    genl_family = "psample"
+    mcast_groups = ["packets"]
+    marshal_class = psample_msg
+
+    def read_samples(self):
+        while True:
+            try:
+                for msg in self.get():
+                    print(msg.dpstr(), flush=True)
+            except NetlinkError as ne:
+                raise ne
 
 
 def print_ovsdp_full(dp_lookup_rep, ifindex, ndb=NDB(), vpl=OvsVport()):
@@ -2247,7 +2342,7 @@ def main(argv):
         help="Increment 'verbose' output counter.",
         default=0,
     )
-    subparsers = parser.add_subparsers()
+    subparsers = parser.add_subparsers(dest="subcommand")
 
     showdpcmd = subparsers.add_parser("show")
     showdpcmd.add_argument(
@@ -2304,6 +2399,8 @@ def main(argv):
     delfscmd = subparsers.add_parser("del-flows")
     delfscmd.add_argument("flsbr", help="Datapath name")
 
+    subparsers.add_parser("psample")
+
     args = parser.parse_args()
 
     if args.verbose > 0:
@@ -2318,6 +2415,9 @@ def main(argv):
 
     sys.setrecursionlimit(100000)
 
+    if args.subcommand == "psample":
+        Psample().read_samples()
+
     if hasattr(args, "showdp"):
         found = False
         for iface in ndb.interfaces:
-- 
2.45.1


^ permalink raw reply related	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
  2024-06-03 18:56 ` [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb Adrian Moreno
@ 2024-06-04  6:09   ` kernel test robot
  2024-06-04  8:49   ` kernel test robot
  2024-06-14 16:55   ` Aaron Conole
  2 siblings, 0 replies; 57+ messages in thread
From: kernel test robot @ 2024-06-04  6:09 UTC (permalink / raw)
  To: Adrian Moreno, netdev
  Cc: oe-kbuild-all, aconole, echaudro, horms, i.maximets, dev,
	Adrian Moreno, Pravin B Shelar, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Hi Adrian,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
base:   net-next/main
patch link:    https://lore.kernel.org/r/20240603185647.2310748-7-amorenoz%40redhat.com
patch subject: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
config: m68k-allmodconfig (https://download.01.org/0day-ci/archive/20240604/202406041339.ytdRh41V-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240604/202406041339.ytdRh41V-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406041339.ytdRh41V-lkp@intel.com/

All errors (new ones prefixed by >>, old ones prefixed by <<):

WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-zydacron.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-viewsonic.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-waltop.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hid/hid-winwing.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/of/of_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/fbtft/fbtft.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-bootrom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-spilib.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-hid.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-light.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-log.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-loopback.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-power-supply.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-raw.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-vibrator.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-audio-manager.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-gbphy.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-gpio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-i2c.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-pwm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-sdio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-spi.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-uart.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/staging/greybus/gb-usb.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/platform/goldfish/goldfish_pipe.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/platform/chrome/cros_kunit_proto_test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mailbox/mtk-cmdq-mailbox.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/devfreq/governor_simpleondemand.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/devfreq/governor_performance.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/devfreq/governor_powersave.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/devfreq/governor_userspace.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hwtracing/intel_th/intel_th_msu_sink.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvmem/nvmem-apple-efuses.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvmem/nvmem_brcm_nvram.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/nvmem/nvmem_u-boot-env.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mm-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mq-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mn-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/interconnect/imx/imx8mp-interconnect.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hte/hte-tegra194-test.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/vdpa/vdpa.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/parport/parport.o
WARNING: modpost: drivers/parport/parport_amiga: section mismatch in reference: amiga_parallel_driver+0x8 (section: .data) -> amiga_parallel_remove (section: .exit.text)
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/parsers/brcm_u-boot.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/parsers/tplink_safeloader.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/chips/cfi_util.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/chips/cfi_cmdset_0020.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/mtd/maps/map_funcs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/spmi/hisi-spmi-controller.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/spmi/spmi-pmic-arb.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/uio/uio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pcmcia/pcmcia_rsrc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pcmcia/i82365.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hwmon/corsair-cpro.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/hwmon/mr75203.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/vhost/vringh.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/greybus/greybus.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/greybus/gb-es2.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/rpmsg/rpmsg_char.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/iio/adc/ingenic-adc.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/iio/adc/xilinx-ams.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/iio/buffer/kfifo_buf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-master-hub.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-master-aspeed.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-master-gpio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-master-ast-cf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/fsi/fsi-scom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/siox/siox-bus-gpio.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/counter/ftm-quaddec.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/oss/dmasound/dmasound_core.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/codecs/snd-soc-wm-adsp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/fsl/imx-pcm-dma.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/mxs/snd-soc-mxs-pcm.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/qcom/snd-soc-qcom-sdw.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/intel/snd-sof-intel-atom.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/intel/snd-sof-acpi-intel-byt.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/intel/snd-sof-acpi-intel-bdw.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/imx/snd-sof-imx8.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/imx/snd-sof-imx8m.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/imx/snd-sof-imx8ulp.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/imx/imx-common.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/mediatek/mtk-adsp-common.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/mediatek/mt8195/snd-sof-mt8195.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/mediatek/mt8186/snd-sof-mt8186.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/snd-sof-utils.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/snd-sof-acpi.o
WARNING: modpost: missing MODULE_DESCRIPTION() in sound/soc/sof/snd-sof-of.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/vfio-mdev/mtty.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/vfio-mdev/mdpy.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/vfio-mdev/mdpy-fb.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/vfio-mdev/mbochs.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/configfs/configfs_sample.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kfifo/bytestream-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kfifo/dma-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kfifo/inttype-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kfifo/record-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kobject/kobject-example.o
WARNING: modpost: missing MODULE_DESCRIPTION() in samples/kobject/kset-example.o
>> ERROR: modpost: "__udivdi3" [net/openvswitch/openvswitch.ko] undefined!

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
  2024-06-03 18:56 ` [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb Adrian Moreno
  2024-06-04  6:09   ` kernel test robot
@ 2024-06-04  8:49   ` kernel test robot
  2024-06-05 19:34     ` Adrián Moreno
  2024-06-14 16:55   ` Aaron Conole
  2 siblings, 1 reply; 57+ messages in thread
From: kernel test robot @ 2024-06-04  8:49 UTC (permalink / raw)
  To: Adrian Moreno, netdev
  Cc: oe-kbuild-all, aconole, echaudro, horms, i.maximets, dev,
	Adrian Moreno, Pravin B Shelar, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

Hi Adrian,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
base:   net-next/main
patch link:    https://lore.kernel.org/r/20240603185647.2310748-7-amorenoz%40redhat.com
patch subject: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20240604/202406041623.ycwsuP85-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240604/202406041623.ycwsuP85-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406041623.ycwsuP85-lkp@intel.com/

All errors (new ones prefixed by >>):

   m68k-linux-ld: net/openvswitch/actions.o: in function `do_execute_actions':
>> actions.c:(.text+0x214e): undefined reference to `__udivdi3'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-03 18:56 ` [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action Adrian Moreno
@ 2024-06-05  0:29   ` kernel test robot
  2024-06-05 19:31     ` Adrián Moreno
  2024-06-05 19:51   ` Simon Horman
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 57+ messages in thread
From: kernel test robot @ 2024-06-05  0:29 UTC (permalink / raw)
  To: Adrian Moreno, netdev
  Cc: llvm, oe-kbuild-all, aconole, echaudro, horms, i.maximets, dev,
	Adrian Moreno, Donald Hunter, Jakub Kicinski, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, linux-kernel

Hi Adrian,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
base:   net-next/main
patch link:    https://lore.kernel.org/r/20240603185647.2310748-6-amorenoz%40redhat.com
patch subject: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
config: s390-randconfig-002-20240605 (https://download.01.org/0day-ci/archive/20240605/202406050852.hDtfskO0-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project d7d2d4f53fc79b4b58e8d8d08151b577c3699d4a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240605/202406050852.hDtfskO0-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202406050852.hDtfskO0-lkp@intel.com/

All errors (new ones prefixed by >>):

   s390x-linux-ld: net/openvswitch/actions.o: in function `do_execute_actions':
>> actions.c:(.text+0x1d5c): undefined reference to `psample_sample_packet'

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-05  0:29   ` kernel test robot
@ 2024-06-05 19:31     ` Adrián Moreno
  2024-06-05 20:06       ` Simon Horman
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-05 19:31 UTC (permalink / raw)
  To: kernel test robot
  Cc: netdev, llvm, oe-kbuild-all, aconole, echaudro, horms, i.maximets,
	dev, Donald Hunter, Jakub Kicinski, Eric Dumazet, Paolo Abeni,
	Pravin B Shelar, linux-kernel

On Wed, Jun 05, 2024 at 08:29:22AM GMT, kernel test robot wrote:
> Hi Adrian,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on net-next/main]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
> base:   net-next/main
> patch link:    https://lore.kernel.org/r/20240603185647.2310748-6-amorenoz%40redhat.com
> patch subject: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
> config: s390-randconfig-002-20240605 (https://download.01.org/0day-ci/archive/20240605/202406050852.hDtfskO0-lkp@intel.com/config)
> compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project d7d2d4f53fc79b4b58e8d8d08151b577c3699d4a)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240605/202406050852.hDtfskO0-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202406050852.hDtfskO0-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
>    s390x-linux-ld: net/openvswitch/actions.o: in function `do_execute_actions':
> >> actions.c:(.text+0x1d5c): undefined reference to `psample_sample_packet'
>

Thanks robot!

OK, I think I know what's wrong. There is an optional dependency on
PSAMPLE. The openvswitch module compiles without PSAMPLE, but linking
fails if OPENVSWITCH=y and PSAMPLE=m.

Looking into how to express this in the Kconfig, I'm planning to add the
following to the next version of the series.

diff --git a/net/openvswitch/Kconfig b/net/openvswitch/Kconfig
index 29a7081858cd..2535f3f9f462 100644
--- a/net/openvswitch/Kconfig
+++ b/net/openvswitch/Kconfig
@@ -10,6 +10,7 @@ config OPENVSWITCH
 		   (NF_CONNTRACK && ((!NF_DEFRAG_IPV6 || NF_DEFRAG_IPV6) && \
 				     (!NF_NAT || NF_NAT) && \
 				     (!NETFILTER_CONNCOUNT || NETFILTER_CONNCOUNT)))
+	depends on PSAMPLE || !PSAMPLE
 	select LIBCRC32C
 	select MPLS
 	select NET_MPLS_GSO
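
To see why "depends on PSAMPLE || !PSAMPLE" rules out the failing
combination, it helps to evaluate the expression with Kconfig's tristate
rules (n < m < y, "!" maps a value to 2 minus itself, "||" takes the
maximum). The following is only a userspace sketch of that arithmetic;
the enum and the function names (tri_not, tri_or, ovs_limit) are made up
for illustration and are not part of Kconfig or of this series.

```c
#include <assert.h>

/* Kconfig tristate values, ordered n < m < y. */
enum tristate { TRI_N = 0, TRI_M = 1, TRI_Y = 2 };

/* Kconfig "!": n becomes y, y becomes n, m stays m (2 - value). */
static enum tristate tri_not(enum tristate a)
{
	assert(a >= TRI_N && a <= TRI_Y);
	return (enum tristate)(TRI_Y - a);
}

/* Kconfig "||": the larger of the two operands. */
static enum tristate tri_or(enum tristate a, enum tristate b)
{
	return a > b ? a : b;
}

/*
 * Upper bound that "depends on PSAMPLE || !PSAMPLE" places on the
 * dependent symbol (hypothetical helper, named for this example only).
 */
static enum tristate ovs_limit(enum tristate psample)
{
	return tri_or(psample, tri_not(psample));
}
```

With PSAMPLE=m the expression evaluates to m, so OPENVSWITCH can be at
most m and the built-in/module mismatch cannot be selected. With
PSAMPLE=n or PSAMPLE=y the expression evaluates to y and OPENVSWITCH is
not restricted.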


* Re: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
  2024-06-04  8:49   ` kernel test robot
@ 2024-06-05 19:34     ` Adrián Moreno
  0 siblings, 0 replies; 57+ messages in thread
From: Adrián Moreno @ 2024-06-05 19:34 UTC (permalink / raw)
  To: kernel test robot
  Cc: netdev, oe-kbuild-all, aconole, echaudro, horms, i.maximets, dev,
	Pravin B Shelar, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Tue, Jun 04, 2024 at 04:49:39PM GMT, kernel test robot wrote:
> Hi Adrian,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on net-next/main]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
> base:   net-next/main
> patch link:    https://lore.kernel.org/r/20240603185647.2310748-7-amorenoz%40redhat.com
> patch subject: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
> config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20240604/202406041623.ycwsuP85-lkp@intel.com/config)
> compiler: m68k-linux-gcc (GCC) 13.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240604/202406041623.ycwsuP85-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202406041623.ycwsuP85-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
>    m68k-linux-ld: net/openvswitch/actions.o: in function `do_execute_actions':
> >> actions.c:(.text+0x214e): undefined reference to `__udivdi3'
>

I forgot about architectures that don't have native u64 division. I will
use "do_div" in the next version of the series.
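
For reference, do_div(n, base) divides a 64-bit value in place by a
32-bit divisor and returns the remainder, which avoids the compiler
emitting a call to __udivdi3 on 32-bit targets. Below is a minimal
userspace sketch of those semantics only; sketch_do_div is a
hypothetical name, it takes a pointer where the kernel macro takes an
lvalue, and it does not reproduce the actual expression used in the
patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the semantics of the kernel's do_div():
 * divide the 64-bit value *n in place by a 32-bit base and return
 * the 32-bit remainder. Not a kernel symbol.
 */
static uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);	/* remainder always fits in 32 bits */
	*n /= base;			/* quotient replaces the dividend */
	return rem;
}
```

The in-place update is the part that is easy to get wrong when
converting from a plain "/": the quotient ends up in the first
argument, not in the return value.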

> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>


* Re: [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test
  2024-06-03 18:56 ` [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test Adrian Moreno
@ 2024-06-05 19:43   ` Simon Horman
  2024-06-10  9:20     ` Adrián Moreno
  2024-06-14 17:07   ` Aaron Conole
  1 sibling, 1 reply; 57+ messages in thread
From: Simon Horman @ 2024-06-05 19:43 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, aconole, echaudro, i.maximets, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, linux-kselftest, linux-kernel

On Mon, Jun 03, 2024 at 08:56:43PM +0200, Adrian Moreno wrote:
> Add a test to verify sampling packets via psample works.
> 
> In order to do that, create a subcommand in ovs-dpctl.py to listen
> on the psample multicast group and print samples.
> 
> In order to also test simultaneous sFlow and psample actions and
> packet truncation, add missing parsing support for "userspace" and
> "trunc" actions.
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

...

> @@ -803,6 +819,25 @@ class ovsactions(nla):
>                  self["attrs"].append(["OVS_ACTION_ATTR_EMIT_SAMPLE", emitact])
>                  parsed = True
>  
> +            elif parse_starts_block(actstr, "userspace(", False):
> +                uact = self.userspace()
> +                actstr = uact.parse(actstr[len("userpsace(") : ])

nit: userspace

     Flagged by checkpatch.pl --codespell

> +                self["attrs"].append(["OVS_ACTION_ATTR_USERSPACE", uact])
> +                parsed = True
> +
> +            elif parse_starts_block(actstr, "trunc", False):
> +                parencount += 1
> +                actstr, val = parse_extract_field(
> +                    actstr,
> +                    "trunc(",
> +                    r"([0-9]+)",
> +                    int,
> +                    False,
> +                    None,
> +                )
> +                self["attrs"].append(["OVS_ACTION_ATTR_TRUNC", val])
> +                parsed = True
> +
>              actstr = actstr[strspn(actstr, ", ") :]
>              while parencount > 0:
>                  parencount -= 1

* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-03 18:56 ` [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action Adrian Moreno
  2024-06-05  0:29   ` kernel test robot
@ 2024-06-05 19:51   ` Simon Horman
  2024-06-06  8:42     ` Adrián Moreno
  2024-06-10 15:46   ` [ovs-dev] " Aaron Conole
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 57+ messages in thread
From: Simon Horman @ 2024-06-05 19:51 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, aconole, echaudro, i.maximets, dev, Donald Hunter,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Pravin B Shelar, linux-kernel

On Mon, Jun 03, 2024 at 08:56:39PM +0200, Adrian Moreno wrote:
> Add support for a new action: emit_sample.
> 
> This action accepts a u32 group id and a variable-length cookie and uses
> the psample multicast group to make the packet available for
> observability.
> 
> The maximum length of the user-defined cookie is set to 16, same as
> tc_cookie, to discourage using cookies that will not be offloadable.
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Hi Adrian,

Some minor nits from my side.

...

> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> index efc82c318fa2..a0e9dde0584a 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
>  };
>  #endif
>  
> +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
> +/**
> + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
> + * action.
> + *
> + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
> + * sample.
> + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
> + * user-defined metadata. The maximum length is 16 bytes.
> + *
> + * Sends the packet to the psample multicast group with the specified group and
> + * cookie. It is possible to combine this action with the
> + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
> + */
> +enum ovs_emit_sample_attr {
> +	OVS_EMIT_SAMPLE_ATTR_UNPSEC,
> +	OVS_EMIT_SAMPLE_ATTR_GROUP,	/* u32 number. */
> +	OVS_EMIT_SAMPLE_ATTR_COOKIE,	/* Optional, user specified cookie. */
> +	__OVS_EMIT_SAMPLE_ATTR_MAX
> +};
> +
> +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
> +
> +

nit: One blank line is enough.

     Flagged by checkpatch.pl

>  /**
>   * enum ovs_action_attr - Action types.
>   *
> @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
>  	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
>  	OVS_ACTION_ATTR_DEC_TTL,      /* Nested OVS_DEC_TTL_ATTR_*. */
>  	OVS_ACTION_ATTR_DROP,         /* u32 error code. */
> +	OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */

nit: Please add OVS_ACTION_ATTR_EMIT_SAMPLE to the Kernel doc
     for this structure.

>  
>  	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
>  				       * from userspace. */

...

* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-05 19:31     ` Adrián Moreno
@ 2024-06-05 20:06       ` Simon Horman
  0 siblings, 0 replies; 57+ messages in thread
From: Simon Horman @ 2024-06-05 20:06 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: kernel test robot, netdev, llvm, oe-kbuild-all, aconole, echaudro,
	i.maximets, dev, Donald Hunter, Jakub Kicinski, Eric Dumazet,
	Paolo Abeni, Pravin B Shelar, linux-kernel

On Wed, Jun 05, 2024 at 07:31:55PM +0000, Adrián Moreno wrote:
> On Wed, Jun 05, 2024 at 08:29:22AM GMT, kernel test robot wrote:
> > Hi Adrian,
> >
> > kernel test robot noticed the following build errors:
> >
> > [auto build test ERROR on net-next/main]
> >
> > url:    https://github.com/intel-lab-lkp/linux/commits/Adrian-Moreno/net-psample-add-user-cookie/20240604-030055
> > base:   net-next/main
> > patch link:    https://lore.kernel.org/r/20240603185647.2310748-6-amorenoz%40redhat.com
> > patch subject: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
> > config: s390-randconfig-002-20240605 (https://download.01.org/0day-ci/archive/20240605/202406050852.hDtfskO0-lkp@intel.com/config)
> > compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project d7d2d4f53fc79b4b58e8d8d08151b577c3699d4a)
> > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240605/202406050852.hDtfskO0-lkp@intel.com/reproduce)
> >
> > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > the same patch/commit), kindly add following tags
> > | Reported-by: kernel test robot <lkp@intel.com>
> > | Closes: https://lore.kernel.org/oe-kbuild-all/202406050852.hDtfskO0-lkp@intel.com/
> >
> > All errors (new ones prefixed by >>):
> >
> >    s390x-linux-ld: net/openvswitch/actions.o: in function `do_execute_actions':
> > >> actions.c:(.text+0x1d5c): undefined reference to `psample_sample_packet'
> >
> 
> Thanks robot!
> 
> OK, I think I know what's wrong. There is an optional dependency on
> PSAMPLE. The openvswitch module compiles without PSAMPLE, but linking
> fails if OPENVSWITCH=y and PSAMPLE=m.
> 
> Looking into how to express this in the Kconfig, I'm planning to add the
> following to the next version of the series.
> 
> diff --git a/net/openvswitch/Kconfig b/net/openvswitch/Kconfig
> index 29a7081858cd..2535f3f9f462 100644
> --- a/net/openvswitch/Kconfig
> +++ b/net/openvswitch/Kconfig
> @@ -10,6 +10,7 @@ config OPENVSWITCH
>  		   (NF_CONNTRACK && ((!NF_DEFRAG_IPV6 || NF_DEFRAG_IPV6) && \
>  				     (!NF_NAT || NF_NAT) && \
>  				     (!NETFILTER_CONNCOUNT || NETFILTER_CONNCOUNT)))
> +	depends on PSAMPLE || !PSAMPLE
>  	select LIBCRC32C
>  	select MPLS
>  	select NET_MPLS_GSO
> 

Thanks Adrián,

I agree that this should work. I also tested it with the config at the
link above and confirmed that it does.

* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-05 19:51   ` Simon Horman
@ 2024-06-06  8:42     ` Adrián Moreno
  0 siblings, 0 replies; 57+ messages in thread
From: Adrián Moreno @ 2024-06-06  8:42 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, aconole, echaudro, i.maximets, dev, Donald Hunter,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Pravin B Shelar, linux-kernel

On Wed, Jun 05, 2024 at 08:51:17PM GMT, Simon Horman wrote:
> On Mon, Jun 03, 2024 at 08:56:39PM +0200, Adrian Moreno wrote:
> > Add support for a new action: emit_sample.
> >
> > This action accepts a u32 group id and a variable-length cookie and uses
> > the psample multicast group to make the packet available for
> > observability.
> >
> > The maximum length of the user-defined cookie is set to 16, same as
> > tc_cookie, to discourage using cookies that will not be offloadable.
> >
> > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>
> Hi Adrian,
>
> Some minor nits from my side.
>
> ...
>
> > diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> > index efc82c318fa2..a0e9dde0584a 100644
> > --- a/include/uapi/linux/openvswitch.h
> > +++ b/include/uapi/linux/openvswitch.h
> > @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
> >  };
> >  #endif
> >
> > +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
> > +/**
> > + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
> > + * action.
> > + *
> > + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
> > + * sample.
> > + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
> > + * user-defined metadata. The maximum length is 16 bytes.
> > + *
> > + * Sends the packet to the psample multicast group with the specified group and
> > + * cookie. It is possible to combine this action with the
> > + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
> > + */
> > +enum ovs_emit_sample_attr {
> > +	OVS_EMIT_SAMPLE_ATTR_UNPSEC,
> > +	OVS_EMIT_SAMPLE_ATTR_GROUP,	/* u32 number. */
> > +	OVS_EMIT_SAMPLE_ATTR_COOKIE,	/* Optional, user specified cookie. */
> > +	__OVS_EMIT_SAMPLE_ATTR_MAX
> > +};
> > +
> > +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
> > +
> > +
>
> nit: One blank line is enough.
>

Ack.

>      Flagged by checkpatch.pl
>
> >  /**
> >   * enum ovs_action_attr - Action types.
> >   *
> > @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
> >  	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
> >  	OVS_ACTION_ATTR_DEC_TTL,      /* Nested OVS_DEC_TTL_ATTR_*. */
> >  	OVS_ACTION_ATTR_DROP,         /* u32 error code. */
> > +	OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
>
> nit: Please add OVS_ACTION_ATTR_EMIT_SAMPLE to the Kernel doc
>      for this structure.
>

Thanks for spotting this. Will do.


> >
> >  	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
> >  				       * from userspace. */
>
> ...
>


* Re: [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test
  2024-06-05 19:43   ` Simon Horman
@ 2024-06-10  9:20     ` Adrián Moreno
  0 siblings, 0 replies; 57+ messages in thread
From: Adrián Moreno @ 2024-06-10  9:20 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, aconole, echaudro, i.maximets, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, linux-kselftest, linux-kernel

On Wed, Jun 05, 2024 at 08:43:14PM GMT, Simon Horman wrote:
> On Mon, Jun 03, 2024 at 08:56:43PM +0200, Adrian Moreno wrote:
> > Add a test to verify sampling packets via psample works.
> >
> > In order to do that, create a subcommand in ovs-dpctl.py to listen
> > on the psample multicast group and print samples.
> >
> > In order to also test simultaneous sFlow and psample actions and
> > packet truncation, add missing parsing support for "userspace" and
> > "trunc" actions.
> >
> > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>
> ...
>
> > @@ -803,6 +819,25 @@ class ovsactions(nla):
> >                  self["attrs"].append(["OVS_ACTION_ATTR_EMIT_SAMPLE", emitact])
> >                  parsed = True
> >
> > +            elif parse_starts_block(actstr, "userspace(", False):
> > +                uact = self.userspace()
> > +                actstr = uact.parse(actstr[len("userpsace(") : ])
>
> nit: userspace
>
>      Flagged by checkpatch.pl --codespell
>

Thanks. Will fix it.

> > +                self["attrs"].append(["OVS_ACTION_ATTR_USERSPACE", uact])
> > +                parsed = True
> > +
> > +            elif parse_starts_block(actstr, "trunc", False):
> > +                parencount += 1
> > +                actstr, val = parse_extract_field(
> > +                    actstr,
> > +                    "trunc(",
> > +                    r"([0-9]+)",
> > +                    int,
> > +                    False,
> > +                    None,
> > +                )
> > +                self["attrs"].append(["OVS_ACTION_ATTR_TRUNC", val])
> > +                parsed = True
> > +
> >              actstr = actstr[strspn(actstr, ", ") :]
> >              while parencount > 0:
> >                  parencount -= 1
>


* Re: [ovs-dev] [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-03 18:56 ` [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action Adrian Moreno
  2024-06-05  0:29   ` kernel test robot
  2024-06-05 19:51   ` Simon Horman
@ 2024-06-10 15:46   ` Aaron Conole
  2024-06-11  8:39     ` Adrián Moreno
  2024-06-14 16:13   ` Simon Horman
  2024-06-17 10:44   ` Ilya Maximets
  4 siblings, 1 reply; 57+ messages in thread
From: Aaron Conole @ 2024-06-10 15:46 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, dev, Paolo Abeni, Donald Hunter, linux-kernel, i.maximets,
	Eric Dumazet, horms, Jakub Kicinski, David S. Miller

Adrian Moreno <amorenoz@redhat.com> writes:

> Add support for a new action: emit_sample.
>
> This action accepts a u32 group id and a variable-length cookie and uses
> the psample multicast group to make the packet available for
> observability.
>
> The maximum length of the user-defined cookie is set to 16, same as
> tc_cookie, to discourage using cookies that will not be offloadable.
>
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---

I saw some of the nits Simon raised - I'll add one more below.

I haven't gone through the series thoroughly enough to make a detailed
review.

>  Documentation/netlink/specs/ovs_flow.yaml | 17 ++++++++
>  include/uapi/linux/openvswitch.h          | 25 ++++++++++++
>  net/openvswitch/actions.c                 | 50 +++++++++++++++++++++++
>  net/openvswitch/flow_netlink.c            | 33 ++++++++++++++-
>  4 files changed, 124 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/netlink/specs/ovs_flow.yaml b/Documentation/netlink/specs/ovs_flow.yaml
> index 4fdfc6b5cae9..a7ab5593a24f 100644
> --- a/Documentation/netlink/specs/ovs_flow.yaml
> +++ b/Documentation/netlink/specs/ovs_flow.yaml
> @@ -727,6 +727,12 @@ attribute-sets:
>          name: dec-ttl
>          type: nest
>          nested-attributes: dec-ttl-attrs
> +      -
> +        name: emit-sample
> +        type: nest
> +        nested-attributes: emit-sample-attrs
> +        doc: |
> +          Sends a packet sample to psample for external observation.
>    -
>      name: tunnel-key-attrs
>      enum-name: ovs-tunnel-key-attr
> @@ -938,6 +944,17 @@ attribute-sets:
>        -
>          name: gbp
>          type: u32
> +  -
> +    name: emit-sample-attrs
> +    enum-name: ovs-emit-sample-attr
> +    name-prefix: ovs-emit-sample-attr-
> +    attributes:
> +      -
> +        name: group
> +        type: u32
> +      -
> +        name: cookie
> +        type: binary
>  
>  operations:
>    name-prefix: ovs-flow-cmd-
> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> index efc82c318fa2..a0e9dde0584a 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
>  };
>  #endif
>  
> +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
> +/**
> + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
> + * action.
> + *
> + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
> + * sample.
> + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
> + * user-defined metadata. The maximum length is 16 bytes.
> + *
> + * Sends the packet to the psample multicast group with the specified group and
> + * cookie. It is possible to combine this action with the
> + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
> + */
> +enum ovs_emit_sample_attr {
> +	OVS_EMIT_SAMPLE_ATTR_UNPSEC,
> +	OVS_EMIT_SAMPLE_ATTR_GROUP,	/* u32 number. */
> +	OVS_EMIT_SAMPLE_ATTR_COOKIE,	/* Optional, user specified cookie. */
> +	__OVS_EMIT_SAMPLE_ATTR_MAX
> +};
> +
> +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
> +
> +
>  /**
>   * enum ovs_action_attr - Action types.
>   *
> @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
>  	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
>  	OVS_ACTION_ATTR_DEC_TTL,      /* Nested OVS_DEC_TTL_ATTR_*. */
>  	OVS_ACTION_ATTR_DROP,         /* u32 error code. */
> +	OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
>  
>  	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
>  				       * from userspace. */
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 964225580824..3b4dba0ded59 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -24,6 +24,11 @@
>  #include <net/checksum.h>
>  #include <net/dsfield.h>
>  #include <net/mpls.h>
> +
> +#if IS_ENABLED(CONFIG_PSAMPLE)
> +#include <net/psample.h>
> +#endif
> +
>  #include <net/sctp/checksum.h>
>  
>  #include "datapath.h"
> @@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct sw_flow_key *key)
>  	return 0;
>  }
>  
> +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> +			       const struct sw_flow_key *key,
> +			       const struct nlattr *attr)
> +{
> +#if IS_ENABLED(CONFIG_PSAMPLE)
> +	struct psample_group psample_group = {};
> +	struct psample_metadata md = {};
> +	struct vport *input_vport;
> +	const struct nlattr *a;
> +	int rem;
> +
> +	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> +	     a = nla_next(a, &rem)) {
> +		switch (nla_type(a)) {
> +		case OVS_EMIT_SAMPLE_ATTR_GROUP:
> +			psample_group.group_num = nla_get_u32(a);
> +			break;
> +
> +		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
> +			md.user_cookie = nla_data(a);
> +			md.user_cookie_len = nla_len(a);
> +			break;
> +		}
> +	}
> +
> +	psample_group.net = ovs_dp_get_net(dp);
> +
> +	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
> +	if (!input_vport)
> +		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
> +
> +	md.in_ifindex = input_vport->dev->ifindex;
> +	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> +
> +	psample_sample_packet(&psample_group, skb, 0, &md);
> +#endif
> +
> +	return 0;

Why this return here?  Doesn't seem used anywhere else.

> +}
> +
>  /* Execute a list of actions against 'skb'. */
>  static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>  			      struct sw_flow_key *key,
> @@ -1502,6 +1547,11 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>  			ovs_kfree_skb_reason(skb, reason);
>  			return 0;
>  		}
> +
> +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> +			err = execute_emit_sample(dp, skb, key, a);
> +			OVS_CB(skb)->cutlen = 0;
> +			break;
>  		}
>  
>  		if (unlikely(err)) {
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index f224d9bcea5e..eb59ff9c8154 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -64,6 +64,7 @@ static bool actions_may_change_flow(const struct nlattr *actions)
>  		case OVS_ACTION_ATTR_TRUNC:
>  		case OVS_ACTION_ATTR_USERSPACE:
>  		case OVS_ACTION_ATTR_DROP:
> +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
>  			break;
>  
>  		case OVS_ACTION_ATTR_CT:
> @@ -2409,7 +2410,7 @@ static void ovs_nla_free_nested_actions(const struct nlattr *actions, int len)
>  	/* Whenever new actions are added, the need to update this
>  	 * function should be considered.
>  	 */
> -	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 24);
> +	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 25);
>  
>  	if (!actions)
>  		return;
> @@ -3157,6 +3158,29 @@ static int validate_and_copy_check_pkt_len(struct net *net,
>  	return 0;
>  }
>  
> +static int validate_emit_sample(const struct nlattr *attr)
> +{
> +	static const struct nla_policy policy[OVS_EMIT_SAMPLE_ATTR_MAX + 1] = {
> +		[OVS_EMIT_SAMPLE_ATTR_GROUP] = { .type = NLA_U32 },
> +		[OVS_EMIT_SAMPLE_ATTR_COOKIE] = {
> +			.type = NLA_BINARY,
> +			.len = OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE
> +		},
> +	};
> +	struct nlattr *a[OVS_EMIT_SAMPLE_ATTR_MAX  + 1];
> +	int err;
> +
> +	if (!IS_ENABLED(CONFIG_PSAMPLE))
> +		return -EOPNOTSUPP;
> +
> +	err = nla_parse_nested(a, OVS_EMIT_SAMPLE_ATTR_MAX, attr, policy,
> +			       NULL);
> +	if (err)
> +		return err;
> +
> +	return a[OVS_EMIT_SAMPLE_ATTR_GROUP] ? 0 : -EINVAL;
> +}
> +
>  static int copy_action(const struct nlattr *from,
>  		       struct sw_flow_actions **sfa, bool log)
>  {
> @@ -3212,6 +3236,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
>  			[OVS_ACTION_ATTR_ADD_MPLS] = sizeof(struct ovs_action_add_mpls),
>  			[OVS_ACTION_ATTR_DEC_TTL] = (u32)-1,
>  			[OVS_ACTION_ATTR_DROP] = sizeof(u32),
> +			[OVS_ACTION_ATTR_EMIT_SAMPLE] = (u32)-1,
>  		};
>  		const struct ovs_action_push_vlan *vlan;
>  		int type = nla_type(a);
> @@ -3490,6 +3515,12 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
>  				return -EINVAL;
>  			break;
>  
> +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> +			err = validate_emit_sample(a);
> +			if (err)
> +				return err;
> +			break;
> +
>  		default:
>  			OVS_NLERR(log, "Unknown Action type %d", type);
>  			return -EINVAL;


* Re: [ovs-dev] [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-10 15:46   ` [ovs-dev] " Aaron Conole
@ 2024-06-11  8:39     ` Adrián Moreno
  2024-06-11 13:54       ` Aaron Conole
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-11  8:39 UTC (permalink / raw)
  To: Aaron Conole
  Cc: netdev, dev, Paolo Abeni, Donald Hunter, linux-kernel, i.maximets,
	Eric Dumazet, horms, Jakub Kicinski, David S. Miller

On Mon, Jun 10, 2024 at 11:46:14AM GMT, Aaron Conole wrote:
> Adrian Moreno <amorenoz@redhat.com> writes:
>
> > Add support for a new action: emit_sample.
> >
> > This action accepts a u32 group id and a variable-length cookie and uses
> > the psample multicast group to make the packet available for
> > observability.
> >
> > The maximum length of the user-defined cookie is set to 16, same as
> > tc_cookie, to discourage using cookies that will not be offloadable.
> >
> > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> > ---
>
> I saw some of the nits Simon raised - I'll add one more below.
>
> I haven't gone through the series thoroughly enough to make a detailed
> review.
>
> >  Documentation/netlink/specs/ovs_flow.yaml | 17 ++++++++
> >  include/uapi/linux/openvswitch.h          | 25 ++++++++++++
> >  net/openvswitch/actions.c                 | 50 +++++++++++++++++++++++
> >  net/openvswitch/flow_netlink.c            | 33 ++++++++++++++-
> >  4 files changed, 124 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/netlink/specs/ovs_flow.yaml b/Documentation/netlink/specs/ovs_flow.yaml
> > index 4fdfc6b5cae9..a7ab5593a24f 100644
> > --- a/Documentation/netlink/specs/ovs_flow.yaml
> > +++ b/Documentation/netlink/specs/ovs_flow.yaml
> > @@ -727,6 +727,12 @@ attribute-sets:
> >          name: dec-ttl
> >          type: nest
> >          nested-attributes: dec-ttl-attrs
> > +      -
> > +        name: emit-sample
> > +        type: nest
> > +        nested-attributes: emit-sample-attrs
> > +        doc: |
> > +          Sends a packet sample to psample for external observation.
> >    -
> >      name: tunnel-key-attrs
> >      enum-name: ovs-tunnel-key-attr
> > @@ -938,6 +944,17 @@ attribute-sets:
> >        -
> >          name: gbp
> >          type: u32
> > +  -
> > +    name: emit-sample-attrs
> > +    enum-name: ovs-emit-sample-attr
> > +    name-prefix: ovs-emit-sample-attr-
> > +    attributes:
> > +      -
> > +        name: group
> > +        type: u32
> > +      -
> > +        name: cookie
> > +        type: binary
> >
> >  operations:
> >    name-prefix: ovs-flow-cmd-
> > diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> > index efc82c318fa2..a0e9dde0584a 100644
> > --- a/include/uapi/linux/openvswitch.h
> > +++ b/include/uapi/linux/openvswitch.h
> > @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
> >  };
> >  #endif
> >
> > +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
> > +/**
> > + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
> > + * action.
> > + *
> > + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
> > + * sample.
> > + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
> > + * user-defined metadata. The maximum length is 16 bytes.
> > + *
> > + * Sends the packet to the psample multicast group with the specified group and
> > + * cookie. It is possible to combine this action with the
> > + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
> > + */
> > +enum ovs_emit_sample_attr {
> > +	OVS_EMIT_SAMPLE_ATTR_UNPSEC,
> > +	OVS_EMIT_SAMPLE_ATTR_GROUP,	/* u32 number. */
> > +	OVS_EMIT_SAMPLE_ATTR_COOKIE,	/* Optional, user specified cookie. */
> > +	__OVS_EMIT_SAMPLE_ATTR_MAX
> > +};
> > +
> > +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
> > +
> > +
> >  /**
> >   * enum ovs_action_attr - Action types.
> >   *
> > @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
> >  	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
> >  	OVS_ACTION_ATTR_DEC_TTL,      /* Nested OVS_DEC_TTL_ATTR_*. */
> >  	OVS_ACTION_ATTR_DROP,         /* u32 error code. */
> > +	OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
> >
> >  	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
> >  				       * from userspace. */
> > diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> > index 964225580824..3b4dba0ded59 100644
> > --- a/net/openvswitch/actions.c
> > +++ b/net/openvswitch/actions.c
> > @@ -24,6 +24,11 @@
> >  #include <net/checksum.h>
> >  #include <net/dsfield.h>
> >  #include <net/mpls.h>
> > +
> > +#if IS_ENABLED(CONFIG_PSAMPLE)
> > +#include <net/psample.h>
> > +#endif
> > +
> >  #include <net/sctp/checksum.h>
> >
> >  #include "datapath.h"
> > @@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct sw_flow_key *key)
> >  	return 0;
> >  }
> >
> > +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> > +			       const struct sw_flow_key *key,
> > +			       const struct nlattr *attr)
> > +{
> > +#if IS_ENABLED(CONFIG_PSAMPLE)
> > +	struct psample_group psample_group = {};
> > +	struct psample_metadata md = {};
> > +	struct vport *input_vport;
> > +	const struct nlattr *a;
> > +	int rem;
> > +
> > +	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> > +	     a = nla_next(a, &rem)) {
> > +		switch (nla_type(a)) {
> > +		case OVS_EMIT_SAMPLE_ATTR_GROUP:
> > +			psample_group.group_num = nla_get_u32(a);
> > +			break;
> > +
> > +		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
> > +			md.user_cookie = nla_data(a);
> > +			md.user_cookie_len = nla_len(a);
> > +			break;
> > +		}
> > +	}
> > +
> > +	psample_group.net = ovs_dp_get_net(dp);
> > +
> > +	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
> > +	if (!input_vport)
> > +		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
> > +
> > +	md.in_ifindex = input_vport->dev->ifindex;
> > +	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> > +
> > +	psample_sample_packet(&psample_group, skb, 0, &md);
> > +#endif
> > +
> > +	return 0;
>
> Why this return here?  Doesn't seem used anywhere else.
>

It is being used in "do_execute_actions", right?
All non-skb-consuming actions set the value of "err" and break from the
switch-case so that the packet is dropped with the OVS_DROP_ACTION_ERROR
reason if the action fails.

Am I missing something?

> > +}
> > +
> >  /* Execute a list of actions against 'skb'. */
> >  static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >  			      struct sw_flow_key *key,
> > @@ -1502,6 +1547,11 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >  			ovs_kfree_skb_reason(skb, reason);
> >  			return 0;
> >  		}
> > +
> > +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> > +			err = execute_emit_sample(dp, skb, key, a);
> > +			OVS_CB(skb)->cutlen = 0;
> > +			break;
> >  		}
> >
> >  		if (unlikely(err)) {
> > diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> > index f224d9bcea5e..eb59ff9c8154 100644
> > --- a/net/openvswitch/flow_netlink.c
> > +++ b/net/openvswitch/flow_netlink.c
> > @@ -64,6 +64,7 @@ static bool actions_may_change_flow(const struct nlattr *actions)
> >  		case OVS_ACTION_ATTR_TRUNC:
> >  		case OVS_ACTION_ATTR_USERSPACE:
> >  		case OVS_ACTION_ATTR_DROP:
> > +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> >  			break;
> >
> >  		case OVS_ACTION_ATTR_CT:
> > @@ -2409,7 +2410,7 @@ static void ovs_nla_free_nested_actions(const struct nlattr *actions, int len)
> >  	/* Whenever new actions are added, the need to update this
> >  	 * function should be considered.
> >  	 */
> > -	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 24);
> > +	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 25);
> >
> >  	if (!actions)
> >  		return;
> > @@ -3157,6 +3158,29 @@ static int validate_and_copy_check_pkt_len(struct net *net,
> >  	return 0;
> >  }
> >
> > +static int validate_emit_sample(const struct nlattr *attr)
> > +{
> > +	static const struct nla_policy policy[OVS_EMIT_SAMPLE_ATTR_MAX + 1] = {
> > +		[OVS_EMIT_SAMPLE_ATTR_GROUP] = { .type = NLA_U32 },
> > +		[OVS_EMIT_SAMPLE_ATTR_COOKIE] = {
> > +			.type = NLA_BINARY,
> > +			.len = OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE
> > +		},
> > +	};
> > +	struct nlattr *a[OVS_EMIT_SAMPLE_ATTR_MAX  + 1];
> > +	int err;
> > +
> > +	if (!IS_ENABLED(CONFIG_PSAMPLE))
> > +		return -EOPNOTSUPP;
> > +
> > +	err = nla_parse_nested(a, OVS_EMIT_SAMPLE_ATTR_MAX, attr, policy,
> > +			       NULL);
> > +	if (err)
> > +		return err;
> > +
> > +	return a[OVS_EMIT_SAMPLE_ATTR_GROUP] ? 0 : -EINVAL;
> > +}
> > +
> >  static int copy_action(const struct nlattr *from,
> >  		       struct sw_flow_actions **sfa, bool log)
> >  {
> > @@ -3212,6 +3236,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
> >  			[OVS_ACTION_ATTR_ADD_MPLS] = sizeof(struct ovs_action_add_mpls),
> >  			[OVS_ACTION_ATTR_DEC_TTL] = (u32)-1,
> >  			[OVS_ACTION_ATTR_DROP] = sizeof(u32),
> > +			[OVS_ACTION_ATTR_EMIT_SAMPLE] = (u32)-1,
> >  		};
> >  		const struct ovs_action_push_vlan *vlan;
> >  		int type = nla_type(a);
> > @@ -3490,6 +3515,12 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
> >  				return -EINVAL;
> >  			break;
> >
> > +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> > +			err = validate_emit_sample(a);
> > +			if (err)
> > +				return err;
> > +			break;
> > +
> >  		default:
> >  			OVS_NLERR(log, "Unknown Action type %d", type);
> >  			return -EINVAL;
>



* Re: [ovs-dev] [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-11  8:39     ` Adrián Moreno
@ 2024-06-11 13:54       ` Aaron Conole
  2024-06-11 15:42         ` Adrián Moreno
  0 siblings, 1 reply; 57+ messages in thread
From: Aaron Conole @ 2024-06-11 13:54 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: netdev, dev, Paolo Abeni, Donald Hunter, linux-kernel, i.maximets,
	Eric Dumazet, horms, Jakub Kicinski, David S. Miller

Adrián Moreno <amorenoz@redhat.com> writes:

> On Mon, Jun 10, 2024 at 11:46:14AM GMT, Aaron Conole wrote:
>> Adrian Moreno <amorenoz@redhat.com> writes:
>>
>> > +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>> > +			       const struct sw_flow_key *key,
>> > +			       const struct nlattr *attr)
>> > +{
>> > +#if IS_ENABLED(CONFIG_PSAMPLE)
>> > +	struct psample_group psample_group = {};
>> > +	struct psample_metadata md = {};
>> > +	struct vport *input_vport;
>> > +	const struct nlattr *a;
>> > +	int rem;
>> > +
>> > +	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
>> > +	     a = nla_next(a, &rem)) {
>> > +		switch (nla_type(a)) {
>> > +		case OVS_EMIT_SAMPLE_ATTR_GROUP:
>> > +			psample_group.group_num = nla_get_u32(a);
>> > +			break;
>> > +
>> > +		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
>> > +			md.user_cookie = nla_data(a);
>> > +			md.user_cookie_len = nla_len(a);
>> > +			break;
>> > +		}
>> > +	}
>> > +
>> > +	psample_group.net = ovs_dp_get_net(dp);
>> > +
>> > +	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
>> > +	if (!input_vport)
>> > +		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
>> > +
>> > +	md.in_ifindex = input_vport->dev->ifindex;
>> > +	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
>> > +
>> > +	psample_sample_packet(&psample_group, skb, 0, &md);
>> > +#endif
>> > +
>> > +	return 0;
>>
>> Why this return here?  Doesn't seem used anywhere else.
>>
>
> It is being used in "do_execute_actions", right?
> All non-skb-consuming actions set the value of "err" and break from the
> switch-case so that the packet is dropped with the OVS_DROP_ACTION_ERROR
> reason if the action fails.
>
> Am I missing something?

I think so.  For example, it isn't used when the function cannot
possibly error.

see the following cases:

OVS_ACTION_ATTR_HASH
OVS_ACTION_ATTR_TRUNC

As you note, these can consume SKB so also don't bother setting err,
because they will need to return anyway:

OVS_ACTION_ATTR_USERSPACE
OVS_ACTION_ATTR_OUTPUT
OVS_ACTION_ATTR_DROP

And even the following does a weird thing:

OVS_ACTION_ATTR_CT

because sometimes it will consume, and sometimes not.

I think if there isn't a possibility of error being generated (and I
guess from the code I see there isn't), then it shouldn't return a
useless code, since err will be 0 on each iteration of the loop.

>> > +}
>



* Re: [ovs-dev] [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-11 13:54       ` Aaron Conole
@ 2024-06-11 15:42         ` Adrián Moreno
  0 siblings, 0 replies; 57+ messages in thread
From: Adrián Moreno @ 2024-06-11 15:42 UTC (permalink / raw)
  To: Aaron Conole
  Cc: netdev, dev, Paolo Abeni, Donald Hunter, linux-kernel, i.maximets,
	Eric Dumazet, horms, Jakub Kicinski, David S. Miller

On Tue, Jun 11, 2024 at 09:54:49AM GMT, Aaron Conole wrote:
> Adrián Moreno <amorenoz@redhat.com> writes:
>
> > On Mon, Jun 10, 2024 at 11:46:14AM GMT, Aaron Conole wrote:
> >> Adrian Moreno <amorenoz@redhat.com> writes:
> >>
> >> > +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> >> > +			       const struct sw_flow_key *key,
> >> > +			       const struct nlattr *attr)
> >> > +{
> >> > +#if IS_ENABLED(CONFIG_PSAMPLE)
> >> > +	struct psample_group psample_group = {};
> >> > +	struct psample_metadata md = {};
> >> > +	struct vport *input_vport;
> >> > +	const struct nlattr *a;
> >> > +	int rem;
> >> > +
> >> > +	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> >> > +	     a = nla_next(a, &rem)) {
> >> > +		switch (nla_type(a)) {
> >> > +		case OVS_EMIT_SAMPLE_ATTR_GROUP:
> >> > +			psample_group.group_num = nla_get_u32(a);
> >> > +			break;
> >> > +
> >> > +		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
> >> > +			md.user_cookie = nla_data(a);
> >> > +			md.user_cookie_len = nla_len(a);
> >> > +			break;
> >> > +		}
> >> > +	}
> >> > +
> >> > +	psample_group.net = ovs_dp_get_net(dp);
> >> > +
> >> > +	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
> >> > +	if (!input_vport)
> >> > +		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
> >> > +
> >> > +	md.in_ifindex = input_vport->dev->ifindex;
> >> > +	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> >> > +
> >> > +	psample_sample_packet(&psample_group, skb, 0, &md);
> >> > +#endif
> >> > +
> >> > +	return 0;
> >>
> >> Why this return here?  Doesn't seem used anywhere else.
> >>
> >
> > It is being used in "do_execute_actions", right?
> > All non-skb-consuming actions set the value of "err" and break from the
> > switch-case so that the packet is dropped with the OVS_DROP_ACTION_ERROR
> > reason if the action fails.
> >
> > Am I missing something?
>
> I think so.  For example, it isn't used when the function cannot
> possibly error.
>
> see the following cases:
>
> OVS_ACTION_ATTR_HASH
> OVS_ACTION_ATTR_TRUNC
>
> As you note, these can consume SKB so also don't bother setting err,
> because they will need to return anyway:
>
> OVS_ACTION_ATTR_USERSPACE
> OVS_ACTION_ATTR_OUTPUT
> OVS_ACTION_ATTR_DROP
>
> And even the following does a weird thing:
>
> OVS_ACTION_ATTR_CT
>
> because sometimes it will consume, and sometimes not.
>
> I think if there isn't a possibility of error being generated (and I
> guess from the code I see there isn't), then it shouldn't return a
> useless code, since err will be 0 on each iteration of the loop.
>

Oh, so you meant it's actually not being set. Now I get you.
Yes. I figured that could change in the future, so I left the structure
of returning an error just in case, but it's true that currently the
function cannot fail.

I'll get rid of it.
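
For illustration only, here is a minimal userspace sketch of the pattern
agreed on above: an action that cannot fail returns void and the dispatcher
leaves "err" untouched, while an action that can fail keeps its int return.
All mock_* names are hypothetical stand-ins and do not exist in
net/openvswitch; this is not the code from the next revision of the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not part of net/openvswitch. */
enum mock_action { MOCK_ACT_EMIT_SAMPLE, MOCK_ACT_DEC_TTL };

struct mock_pkt {
	unsigned int ttl;
	unsigned int cutlen;
	unsigned int samples;	/* number of samples emitted so far */
};

/* Cannot fail, so it returns void instead of a constant 0. */
static void mock_emit_sample(struct mock_pkt *pkt)
{
	pkt->samples++;
}

/* Can fail, so it keeps an int return value. */
static int mock_dec_ttl(struct mock_pkt *pkt)
{
	if (pkt->ttl <= 1)
		return -EINVAL;
	pkt->ttl--;
	return 0;
}

/* Simplified model of the dispatch loop: err is only assigned by
 * actions that can actually report an error.
 */
static int mock_execute(struct mock_pkt *pkt, const enum mock_action *acts,
			size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int err = 0;

		switch (acts[i]) {
		case MOCK_ACT_EMIT_SAMPLE:
			mock_emit_sample(pkt);	/* no err assignment */
			pkt->cutlen = 0;
			break;
		case MOCK_ACT_DEC_TTL:
			err = mock_dec_ttl(pkt);
			break;
		}

		if (err)
			return err;	/* the real datapath drops the skb here */
	}
	return 0;
}
```

The trade-off is the one discussed in this thread: the void signature makes
it obvious the action cannot fail today, at the cost of touching the call
site again if a failure path is ever added.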

> >> > +}
> >
>



* Re: [PATCH net-next v2 4/9] net: psample: allow using rate as probability
  2024-06-03 18:56 ` [PATCH net-next v2 4/9] net: psample: allow using rate as probability Adrian Moreno
@ 2024-06-14 16:11   ` Simon Horman
  2024-06-17  6:32     ` Adrián Moreno
  0 siblings, 1 reply; 57+ messages in thread
From: Simon Horman @ 2024-06-14 16:11 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, aconole, echaudro, i.maximets, dev, Yotam Gigi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, linux-kernel

On Mon, Jun 03, 2024 at 08:56:38PM +0200, Adrian Moreno wrote:
> Although not explicitly documented in the psample module itself, the
> definition of PSAMPLE_ATTR_SAMPLE_RATE seems inherited from act_sample.
> 
> Quoting tc-sample(8):
> "RATE of 100 will lead to an average of one sampled packet out of every
> 100 observed."
> 
> With these semantics, the rates that we can express with an unsigned
> 32-bit number are very unevenly distributed and concentrated towards
> "sampling few packets".
> For example, we can express a probability of 2.32E-8% but we
> cannot express anything between 100% and 50%.
> 
> For sampling applications that are capable of sampling a decent
> amount of packets, this sampling rate semantics is not very useful.
> 
> Add a new flag to the uAPI that indicates that the sampling rate is
> expressed in scaled probability, that is:
> - 0 is 0% probability, no packets get sampled.
> - U32_MAX is 100% probability, all packets get sampled.
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Hi Adrian,

Would it be possible to add appropriate documentation for
rate - both the original ratio variant, and the new probability
variant - somewhere?

That aside, this looks good to me.
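
As a possible starting point for such documentation, the two
interpretations of the rate value described in the commit message can be
sketched in userspace C as follows. The function names are hypothetical and
the handling of a rate of 0 in the ratio variant is an assumption of this
sketch, not something taken from psample or act_sample.

```c
#include <stdint.h>

/* Original "ratio" semantics: a rate of N means one sampled packet out
 * of every N observed on average, i.e. probability 1/N.
 * Assumption of this sketch: a rate of 0 is treated as "never sample".
 */
static double ratio_to_probability(uint32_t rate)
{
	return rate ? 1.0 / (double)rate : 0.0;
}

/* New "scaled probability" semantics: 0 is 0% and UINT32_MAX is 100%. */
static double scaled_to_probability(uint32_t rate)
{
	return (double)rate / (double)UINT32_MAX;
}

/* Convert a probability in [0, 1] to the scaled representation,
 * clamping out-of-range input and truncating toward zero.
 */
static uint32_t probability_to_scaled(double p)
{
	if (p <= 0.0)
		return 0;
	if (p >= 1.0)
		return UINT32_MAX;
	return (uint32_t)(p * (double)UINT32_MAX);
}
```

With the ratio variant, the two largest expressible probabilities are 100%
(rate 1) and 50% (rate 2), which is the gap the commit message points out.
With the scaled variant, 75% is expressible, for example as
probability_to_scaled(0.75).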

...


* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-03 18:56 ` [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action Adrian Moreno
                     ` (2 preceding siblings ...)
  2024-06-10 15:46   ` [ovs-dev] " Aaron Conole
@ 2024-06-14 16:13   ` Simon Horman
  2024-06-17 10:44   ` Ilya Maximets
  4 siblings, 0 replies; 57+ messages in thread
From: Simon Horman @ 2024-06-14 16:13 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, aconole, echaudro, i.maximets, dev, Donald Hunter,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Pravin B Shelar, linux-kernel

On Mon, Jun 03, 2024 at 08:56:39PM +0200, Adrian Moreno wrote:
> Add support for a new action: emit_sample.
> 
> This action accepts a u32 group id and a variable-length cookie and uses
> the psample multicast group to make the packet available for
> observability.
> 
> The maximum length of the user-defined cookie is set to 16, same as
> tc_cookie, to discourage using cookies that will not be offloadable.
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

...

> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c

...

> @@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct sw_flow_key *key)
>  	return 0;
>  }
>  
> +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> +			       const struct sw_flow_key *key,
> +			       const struct nlattr *attr)
> +{
> +#if IS_ENABLED(CONFIG_PSAMPLE)
> +	struct psample_group psample_group = {};
> +	struct psample_metadata md = {};
> +	struct vport *input_vport;
> +	const struct nlattr *a;
> +	int rem;
> +
> +	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> +	     a = nla_next(a, &rem)) {
> +		switch (nla_type(a)) {
> +		case OVS_EMIT_SAMPLE_ATTR_GROUP:
> +			psample_group.group_num = nla_get_u32(a);
> +			break;
> +
> +		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
> +			md.user_cookie = nla_data(a);
> +			md.user_cookie_len = nla_len(a);
> +			break;
> +		}
> +	}
> +
> +	psample_group.net = ovs_dp_get_net(dp);
> +
> +	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
> +	if (!input_vport)
> +		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
> +
> +	md.in_ifindex = input_vport->dev->ifindex;
> +	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> +
> +	psample_sample_packet(&psample_group, skb, 0, &md);
> +#endif
> +
> +	return 0;
> +}
> +
>  /* Execute a list of actions against 'skb'. */
>  static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>  			      struct sw_flow_key *key,
> @@ -1502,6 +1547,11 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>  			ovs_kfree_skb_reason(skb, reason);
>  			return 0;
>  		}
> +
> +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> +			err = execute_emit_sample(dp, skb, key, a);
> +			OVS_CB(skb)->cutlen = 0;
> +			break;
>  		}

Hi Adrian,

execute_emit_sample always returns 0, and it seems that err will always
be 0 when the code above is executed. So perhaps the return type
of execute_emit_sample could be changed to void and the code above be
updated not to set err.

Other than that, which I don't feel particularly strongly about,
this looks good to me.

...

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 1/9] net: psample: add user cookie
  2024-06-03 18:56 ` [PATCH net-next v2 1/9] net: psample: add user cookie Adrian Moreno
@ 2024-06-14 16:13   ` Simon Horman
  0 siblings, 0 replies; 57+ messages in thread
From: Simon Horman @ 2024-06-14 16:13 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, aconole, echaudro, i.maximets, dev, Yotam Gigi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Mon, Jun 03, 2024 at 08:56:35PM +0200, Adrian Moreno wrote:
> Add a user cookie to the sample metadata so that sample emitters can
> provide more contextual information to samples.
> 
> If present, send the user cookie in a new attribute:
> PSAMPLE_ATTR_USER_COOKIE.
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>


Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample
  2024-06-03 18:56 ` [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample Adrian Moreno
@ 2024-06-14 16:14   ` Simon Horman
  2024-06-17 10:00   ` Ilya Maximets
  1 sibling, 0 replies; 57+ messages in thread
From: Simon Horman @ 2024-06-14 16:14 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, aconole, echaudro, i.maximets, dev, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-kernel

On Mon, Jun 03, 2024 at 08:56:36PM +0200, Adrian Moreno wrote:
> If the action has a user_cookie, pass it along to the sample so it can
> be easily identified.
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 3/9] net: psample: skip packet copy if no listeners
  2024-06-03 18:56 ` [PATCH net-next v2 3/9] net: psample: skip packet copy if no listeners Adrian Moreno
@ 2024-06-14 16:15   ` Simon Horman
  0 siblings, 0 replies; 57+ messages in thread
From: Simon Horman @ 2024-06-14 16:15 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, aconole, echaudro, i.maximets, dev, Yotam Gigi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Mon, Jun 03, 2024 at 08:56:37PM +0200, Adrian Moreno wrote:
> If nobody is listening on the multicast group, generating the sample,
> which involves copying packet data, seems completely unnecessary.
> 
> Return fast in this case.
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-03 18:56 ` [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample Adrian Moreno
@ 2024-06-14 16:17   ` Simon Horman
  2024-06-17 11:55   ` Ilya Maximets
  1 sibling, 0 replies; 57+ messages in thread
From: Simon Horman @ 2024-06-14 16:17 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, aconole, echaudro, i.maximets, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Mon, Jun 03, 2024 at 08:56:41PM +0200, Adrian Moreno wrote:
> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> observability-oriented.
> 
> Apart from some corner case in which it's used as a replacement for
> clone() on old kernels, it's really only used for sFlow, IPFIX and now,
> local emit_sample.
> 
> With this in mind, it doesn't make much sense to report
> OVS_DROP_LAST_ACTION inside sample actions.
> 
> For instance, if the flow:
> 
>   actions:sample(..,emit_sample(..)),2
> 
> triggers an OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> confusing for users since the packet did reach its destination.
> 
> This patch makes internal action execution silently consume the skb
> instead of notifying a drop for this case.
> 
> Unfortunately, this patch does not remove all potential sources of
> confusion since, if the sample action itself is the last action, e.g:
> 
>     actions:sample(..,emit_sample(..))
> 
> we actually _should_ generate an OVS_DROP_LAST_ACTION event, but we aren't.
> 
> Sadly, this case is difficult to solve without breaking the
> optimization by which the skb is not cloned on last sample actions.
> But, given explicit drop actions are now supported, OVS can just add one
> after the last sample() and rewrite the flow as:
> 
>     actions:sample(..,emit_sample(..)),drop
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
  2024-06-03 18:56 ` [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb Adrian Moreno
  2024-06-04  6:09   ` kernel test robot
  2024-06-04  8:49   ` kernel test robot
@ 2024-06-14 16:55   ` Aaron Conole
  2024-06-17  7:08     ` Adrián Moreno
  2 siblings, 1 reply; 57+ messages in thread
From: Aaron Conole @ 2024-06-14 16:55 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, echaudro, horms, i.maximets, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

Adrian Moreno <amorenoz@redhat.com> writes:

> The behavior of actions might not be the exact same if they are being
> executed inside a nested sample action. Store the probability of the
> parent sample action in the skb's cb area.

What does that mean?

> Use the probability in emit_sample to pass it down to psample.
>
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  include/uapi/linux/openvswitch.h |  3 ++-
>  net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
>  net/openvswitch/datapath.h       |  3 +++
>  net/openvswitch/vport.c          |  1 +
>  4 files changed, 28 insertions(+), 4 deletions(-)
>
> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> index a0e9dde0584a..9d675725fa2b 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -649,7 +649,8 @@ enum ovs_flow_attr {
>   * Actions are passed as nested attributes.
>   *
>   * Executes the specified actions with the given probability on a per-packet
> - * basis.
> + * basis. Nested actions will be able to access the probability value of the
> + * parent @OVS_ACTION_ATTR_SAMPLE.
>   */
>  enum ovs_sample_attr {
>  	OVS_SAMPLE_ATTR_UNSPEC,
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 3b4dba0ded59..33f6d93ba5e4 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>  	struct nlattr *sample_arg;
>  	int rem = nla_len(attr);
>  	const struct sample_arg *arg;
> +	u32 init_probability;
>  	bool clone_flow_key;
> +	int err;
>  
>  	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
>  	sample_arg = nla_data(attr);
>  	arg = nla_data(sample_arg);
>  	actions = nla_next(sample_arg, &rem);
> +	init_probability = OVS_CB(skb)->probability;
>  
>  	if ((arg->probability != U32_MAX) &&
>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> @@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>  		return 0;
>  	}
>  
> +	if (init_probability) {
> +		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
> +					    arg->probability / U32_MAX);
> +	} else {
> +		OVS_CB(skb)->probability = arg->probability;
> +	}
> +

I'm confused by this.  Eventually, integer arithmetic will practically
guarantee that nested sample() calls will go to 0.  So eventually, the
test above will be impossible to meet mathematically.

OTOH, you could argue that a 1% of 50% is low anyway, but it still would
have a positive probability count, and still be possible for
get_random_u32() call to match.

I'm not sure about this particular change.  Why do we need it?

>  	clone_flow_key = !arg->exec;
> -	return clone_execute(dp, skb, key, 0, actions, rem, last,
> -			     clone_flow_key);
> +	err = clone_execute(dp, skb, key, 0, actions, rem, last,
> +			    clone_flow_key);
> +
> +	if (!last)

Is this right?  Don't we only want to set the probability on the last
action?  Should the test be 'if (last)'?

> +		OVS_CB(skb)->probability = init_probability;
> +
> +	return err;
>  }
>  
>  /* When 'last' is true, clone() should always consume the 'skb'.
> @@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>  	struct psample_metadata md = {};
>  	struct vport *input_vport;
>  	const struct nlattr *a;
> +	u32 rate;
>  	int rem;
>  
>  	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> @@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>  
>  	md.in_ifindex = input_vport->dev->ifindex;
>  	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> +	md.rate_as_probability = 1;
> +
> +	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
>  
> -	psample_sample_packet(&psample_group, skb, 0, &md);
> +	psample_sample_packet(&psample_group, skb, rate, &md);
>  #endif
>  
>  	return 0;
> diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
> index 0cd29971a907..9ca6231ea647 100644
> --- a/net/openvswitch/datapath.h
> +++ b/net/openvswitch/datapath.h
> @@ -115,12 +115,15 @@ struct datapath {
>   * fragmented.
>   * @acts_origlen: The netlink size of the flow actions applied to this skb.
>   * @cutlen: The number of bytes from the packet end to be removed.
> + * @probability: The sampling probability that was applied to this skb; 0 means
> + * no sampling has occurred; U32_MAX means 100% probability.
>   */
>  struct ovs_skb_cb {
>  	struct vport		*input_vport;
>  	u16			mru;
>  	u16			acts_origlen;
>  	u32			cutlen;
> +	u32			probability;
>  };
>  #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
>  
> diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> index 972ae01a70f7..8732f6e51ae5 100644
> --- a/net/openvswitch/vport.c
> +++ b/net/openvswitch/vport.c
> @@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
>  	OVS_CB(skb)->input_vport = vport;
>  	OVS_CB(skb)->mru = 0;
>  	OVS_CB(skb)->cutlen = 0;
> +	OVS_CB(skb)->probability = 0;
>  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
>  		u32 mark;


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test
  2024-06-03 18:56 ` [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test Adrian Moreno
  2024-06-05 19:43   ` Simon Horman
@ 2024-06-14 17:07   ` Aaron Conole
  2024-06-17  7:18     ` Adrián Moreno
  1 sibling, 1 reply; 57+ messages in thread
From: Aaron Conole @ 2024-06-14 17:07 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: netdev, echaudro, horms, i.maximets, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, linux-kselftest, linux-kernel

Adrian Moreno <amorenoz@redhat.com> writes:

> Add a test to verify sampling packets via psample works.
>
> In order to do that, create a subcommand in ovs-dpctl.py to listen
> on the psample multicast group and print samples.
>
> In order to also test simultaneous sFlow and psample actions and
> packet truncation, add missing parsing support for "userspace" and
> "trunc" actions.

Maybe split that into a separate patch.  This has a bugfix and 3
features being pushed in.  I know it's already getting long as a series,
so maybe it's okay to fold the userspace attribute bugfix with the parse
support (since it wasn't really usable before).

> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  .../selftests/net/openvswitch/openvswitch.sh  |  99 +++++++++++++++-
>  .../selftests/net/openvswitch/ovs-dpctl.py    | 112 +++++++++++++++++-
>  2 files changed, 204 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh
> index 5cae53543849..f6e0ae3f6424 100755
> --- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
> +++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
> @@ -20,7 +20,8 @@ tests="
>  	nat_related_v4				ip4-nat-related: ICMP related matches work with SNAT
>  	netlink_checks				ovsnl: validate netlink attrs and settings
>  	upcall_interfaces			ovs: test the upcall interfaces
> -	drop_reason				drop: test drop reasons are emitted"
> +	drop_reason				drop: test drop reasons are emitted
> +	emit_sample 				emit_sample: Sampling packets with psample"
>  
>  info() {
>      [ $VERBOSE = 0 ] || echo $*
> @@ -170,6 +171,19 @@ ovs_drop_reason_count()
>  	return `echo "$perf_output" | grep "$pattern" | wc -l`
>  }
>  
> +ovs_test_flow_fails () {
> +	ERR_MSG="Flow actions may not be safe on all matching packets"
> +
> +	PRE_TEST=$(dmesg | grep -c "${ERR_MSG}")
> +	ovs_add_flow $@ &> /dev/null $@ && return 1
> +	POST_TEST=$(dmesg | grep -c "${ERR_MSG}")
> +
> +	if [ "$PRE_TEST" == "$POST_TEST" ]; then
> +		return 1
> +	fi
> +	return 0
> +}
> +
>  usage() {
>  	echo
>  	echo "$0 [OPTIONS] [TEST]..."
> @@ -184,6 +198,89 @@ usage() {
>  	exit 1
>  }
>  
> +
> +# emit_sample test
> +# - use emit_sample to observe packets
> +test_emit_sample() {
> +	sbx_add "test_emit_sample" || return $?
> +
> +	# Add a datapath with per-vport dispatching.
> +	ovs_add_dp "test_emit_sample" emit_sample -V 2:1 || return 1
> +
> +	info "create namespaces"
> +	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
> +		client c0 c1 172.31.110.10/24 -u || return 1
> +	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
> +		server s0 s1 172.31.110.20/24 -u || return 1
> +
> +	# Check if emit_sample actions can be configured.
> +	ovs_add_flow "test_emit_sample" emit_sample \
> +	'in_port(1),eth(),eth_type(0x0806),arp()' 'emit_sample(group=1)'
> +	if [ $? == 1 ]; then
> +		info "no support for emit_sample - skipping"
> +		ovs_exit_sig
> +		return $ksft_skip
> +	fi
> +
> +	ovs_del_flows "test_emit_sample" emit_sample
> +
> +	# Allow ARP
> +	ovs_add_flow "test_emit_sample" emit_sample \
> +		'in_port(1),eth(),eth_type(0x0806),arp()' '2' || return 1
> +	ovs_add_flow "test_emit_sample" emit_sample \
> +		'in_port(2),eth(),eth_type(0x0806),arp()' '1' || return 1
> +
> +	# Test action verification.
> +	OLDIFS=$IFS
> +	IFS='*'
> +	min_key='in_port(1),eth(),eth_type(0x0800),ipv4()'
> +	for testcase in \
> +		"cookie too large"*"emit_sample(group=1,cookie=1615141312111009080706050403020100)" \
> +		"no group with cookie"*"emit_sample(cookie=abcd)" \
> +		"no group"*"sample()";
> +	do
> +		set -- $testcase;
> +		ovs_test_flow_fails "test_emit_sample" emit_sample $min_key $2
> +		if [ $? == 1 ]; then
> +			info "failed - $1"
> +			return 1
> +		fi
> +	done
> +	IFS=$OLDIFS
> +
> +	# Sample first 14 bytes of all traffic.
> +	ovs_add_flow "test_emit_sample" emit_sample \
> +	"in_port(1),eth(),eth_type(0x0800),ipv4(src=172.31.110.10,proto=1),icmp()" "trunc(14),emit_sample(group=1,cookie=c0ffee),2"
> +
> +	# Sample all traffic. In this case, use a sample() action with both
> +	# emit_sample and an upcall emulating simultaneous local sampling and
> +	# sFlow / IPFIX.
> +	nlpid=$(grep -E "listening on upcall packet handler" $ovs_dir/s0.out | cut -d ":" -f 2 | tr -d ' ')
> +	ovs_add_flow "test_emit_sample" emit_sample \
> +	"in_port(2),eth(),eth_type(0x0800),ipv4(src=172.31.110.20,proto=1),icmp()" "sample(sample=100%,actions(emit_sample(group=2,cookie=eeff0c),userspace(pid=${nlpid},userdata=eeff0c))),1"
> +
> +	# Record emit_sample data.
> +	python3 $ovs_base/ovs-dpctl.py psample >$ovs_dir/psample.out 2>$ovs_dir/psample.err &
> +	pid=$!
> +	on_exit "ovs_sbx test_emit_sample kill -TERM $pid 2>/dev/null"

  Maybe ovs_netns_spawn_daemon ?

> +
> +	# Send a single ping.
> +	sleep 1
> +	ovs_sbx "test_emit_sample" ip netns exec client ping -I c1 172.31.110.20 -c 1 || return 1
> +	sleep 1
> +
> +	# We should have received one userspace action upcall and 2 psample packets.
> +	grep -E "userspace action command" $ovs_dir/s0.out >/dev/null 2>&1 || return 1
> +
> +	# client -> server samples should only contain the first 14 bytes of the packet.
> +	grep -E "rate:4294967295,group:1,cookie:c0ffee data:[0-9a-f]{28}$" \
> +			 $ovs_dir/psample.out >/dev/null 2>&1 || return 1
> +	grep -E "rate:4294967295,group:2,cookie:eeff0c" \
> +			 $ovs_dir/psample.out >/dev/null 2>&1 || return 1
> +
> +	return 0
> +}
> +
>  # drop_reason test
>  # - drop packets and verify the right drop reason is reported
>  test_drop_reason() {
> diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> index f8b5362aac8c..44fdeb9491a2 100644
> --- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> +++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> @@ -27,8 +27,10 @@ try:
>      from pyroute2.netlink import genlmsg
>      from pyroute2.netlink import nla
>      from pyroute2.netlink import nlmsg_atoms
> -    from pyroute2.netlink.exceptions import NetlinkError
> +    from pyroute2.netlink.event import EventSocket
>      from pyroute2.netlink.generic import GenericNetlinkSocket
> +    from pyroute2.netlink.nlsocket import Marshal
> +    from pyroute2.netlink.exceptions import NetlinkError

Why did this get moved?

>      import pyroute2
>  
>  except ModuleNotFoundError:
> @@ -575,13 +577,27 @@ class ovsactions(nla):
>                  print_str += "userdata="
>                  for f in self.get_attr("OVS_USERSPACE_ATTR_USERDATA"):
>                      print_str += "%x." % f
> -            if self.get_attr("OVS_USERSPACE_ATTR_TUN_PORT") is not None:
> +            if self.get_attr("OVS_USERSPACE_ATTR_EGRESS_TUN_PORT") is not None:
>                  print_str += "egress_tun_port=%d" % self.get_attr(
> -                    "OVS_USERSPACE_ATTR_TUN_PORT"
> +                    "OVS_USERSPACE_ATTR_EGRESS_TUN_PORT"

Looks like a bugfix here.

>                  )
>              print_str += ")"
>              return print_str
>  
> +        def parse(self, actstr):
> +            attrs_desc = (
> +                ("pid", "OVS_USERSPACE_ATTR_PID", int),
> +                ("userdata", "OVS_USERSPACE_ATTR_USERDATA",
> +                    lambda x: list(bytearray.fromhex(x))),
> +                ("egress_tun_port", "OVS_USERSPACE_ATTR_EGRESS_TUN_PORT", int)
> +            )
> +
> +            attrs, actstr = parse_attrs(actstr, attrs_desc)
> +            for attr in attrs:
> +                self["attrs"].append(attr)
> +
> +            return actstr
> +
>      def dpstr(self, more=False):
>          print_str = ""
>  
> @@ -803,6 +819,25 @@ class ovsactions(nla):
>                  self["attrs"].append(["OVS_ACTION_ATTR_EMIT_SAMPLE", emitact])
>                  parsed = True
>  
> +            elif parse_starts_block(actstr, "userspace(", False):
> +                uact = self.userspace()
> +                actstr = uact.parse(actstr[len("userspace(") : ])
> +                self["attrs"].append(["OVS_ACTION_ATTR_USERSPACE", uact])
> +                parsed = True
> +
> +            elif parse_starts_block(actstr, "trunc", False):

This should be "trunc("

> +                parencount += 1
> +                actstr, val = parse_extract_field(
> +                    actstr,
> +                    "trunc(",
> +                    r"([0-9]+)",
> +                    int,
> +                    False,
> +                    None,
> +                )
> +                self["attrs"].append(["OVS_ACTION_ATTR_TRUNC", val])
> +                parsed = True
> +
>              actstr = actstr[strspn(actstr, ", ") :]
>              while parencount > 0:
>                  parencount -= 1
> @@ -2184,10 +2219,70 @@ class OvsFlow(GenericNetlinkSocket):
>          print("MISS upcall[%d/%s]: %s" % (seq, pktpres, keystr), flush=True)
>  
>      def execute(self, packetmsg):
> -        print("userspace execute command")
> +        print("userspace execute command", flush=True)
>  
>      def action(self, packetmsg):
> -        print("userspace action command")
> +        print("userspace action command", flush=True)
> +
> +
> +class psample_sample(genlmsg):
> +    nla_map = (
> +        ("PSAMPLE_ATTR_IIFINDEX", "none"),
> +        ("PSAMPLE_ATTR_OIFINDEX", "none"),
> +        ("PSAMPLE_ATTR_ORIGSIZE", "none"),
> +        ("PSAMPLE_ATTR_SAMPLE_GROUP", "uint32"),
> +        ("PSAMPLE_ATTR_GROUP_SEQ", "none"),
> +        ("PSAMPLE_ATTR_SAMPLE_RATE", "uint32"),
> +        ("PSAMPLE_ATTR_DATA", "array(uint8)"),
> +        ("PSAMPLE_ATTR_GROUP_REFCOUNT", "none"),
> +        ("PSAMPLE_ATTR_TUNNEL", "none"),
> +        ("PSAMPLE_ATTR_PAD", "none"),
> +        ("PSAMPLE_ATTR_OUT_TC", "none"),
> +        ("PSAMPLE_ATTR_OUT_TC_OCC", "none"),
> +        ("PSAMPLE_ATTR_LATENCY", "none"),
> +        ("PSAMPLE_ATTR_TIMESTAMP", "none"),
> +        ("PSAMPLE_ATTR_PROTO", "none"),
> +        ("PSAMPLE_ATTR_USER_COOKIE", "array(uint8)"),
> +    )
> +
> +    def dpstr(self):
> +        fields = []
> +        data = ""
> +        for (attr, value) in self["attrs"]:
> +            if attr == "PSAMPLE_ATTR_SAMPLE_GROUP":
> +                fields.append("group:%d" % value)
> +            if attr == "PSAMPLE_ATTR_SAMPLE_RATE":
> +                fields.append("rate:%d" % value)
> +            if attr == "PSAMPLE_ATTR_USER_COOKIE":
> +                value = "".join(format(x, "02x") for x in value)
> +                fields.append("cookie:%s" % value)
> +            if attr == "PSAMPLE_ATTR_DATA" and len(value) > 0:
> +                data = "data:%s" % "".join(format(x, "02x") for x in value)
> +
> +        return ("%s %s" % (",".join(fields), data)).strip()
> +
> +
> +class psample_msg(Marshal):
> +    PSAMPLE_CMD_SAMPLE = 0
> +    PSAMPLE_CMD_GET_GROUP = 1
> +    PSAMPLE_CMD_NEW_GROUP = 2
> +    PSAMPLE_CMD_DEL_GROUP = 3
> +    PSAMPLE_CMD_SET_FILTER = 4
> +    msg_map = {PSAMPLE_CMD_SAMPLE: psample_sample}
> +
> +
> +class Psample(EventSocket):
> +    genl_family = "psample"
> +    mcast_groups = ["packets"]
> +    marshal_class = psample_msg
> +
> +    def read_samples(self):
> +        while True:
> +            try:
> +                for msg in self.get():
> +                    print(msg.dpstr(), flush=True)
> +            except NetlinkError as ne:
> +                raise ne
>  
>  
>  def print_ovsdp_full(dp_lookup_rep, ifindex, ndb=NDB(), vpl=OvsVport()):
> @@ -2247,7 +2342,7 @@ def main(argv):
>          help="Increment 'verbose' output counter.",
>          default=0,
>      )
> -    subparsers = parser.add_subparsers()
> +    subparsers = parser.add_subparsers(dest="subcommand")
>  
>      showdpcmd = subparsers.add_parser("show")
>      showdpcmd.add_argument(
> @@ -2304,6 +2399,8 @@ def main(argv):
>      delfscmd = subparsers.add_parser("del-flows")
>      delfscmd.add_argument("flsbr", help="Datapath name")
>  
> +    subparsers.add_parser("psample")
> +
>      args = parser.parse_args()
>  
>      if args.verbose > 0:
> @@ -2318,6 +2415,9 @@ def main(argv):
>  
>      sys.setrecursionlimit(100000)
>  
> +    if args.subcommand == "psample":
> +        Psample().read_samples()
> +
>      if hasattr(args, "showdp"):
>          found = False
>          for iface in ndb.interfaces:


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 4/9] net: psample: allow using rate as probability
  2024-06-14 16:11   ` Simon Horman
@ 2024-06-17  6:32     ` Adrián Moreno
  2024-06-17 10:30       ` Simon Horman
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-17  6:32 UTC (permalink / raw)
  To: Simon Horman
  Cc: netdev, aconole, echaudro, i.maximets, dev, Yotam Gigi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, linux-kernel

On Fri, Jun 14, 2024 at 05:11:30PM GMT, Simon Horman wrote:
> On Mon, Jun 03, 2024 at 08:56:38PM +0200, Adrian Moreno wrote:
> > Although not explicitly documented in the psample module itself, the
> > definition of PSAMPLE_ATTR_SAMPLE_RATE seems inherited from act_sample.
> >
> > Quoting tc-sample(8):
> > "RATE of 100 will lead to an average of one sampled packet out of every
> > 100 observed."
> >
> > With this semantics, the rates that we can express with an unsigned
> > 32-bits number are very unevenly distributed and concentrated towards
> > "sampling few packets".
> > For example, we can express a probability of 2.32E-8% but we
> > cannot express anything between 100% and 50%.
> >
> > For sampling applications that are capable of sampling a decent
> > amount of packets, this sampling rate semantics is not very useful.
> >
> > Add a new flag to the uAPI that indicates that the sampling rate is
> > expressed in scaled probability, this is:
> > - 0 is 0% probability, no packets get sampled.
> > - U32_MAX is 100% probability, all packets get sampled.
> >
> > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>
> Hi Adrian,
>
> Would it be possible to add appropriate documentation for
> rate - both the original ratio variant, and the new probability
> variant - somewhere?
>

Hi Simon, thanks for the suggestion. Would the uapi header be a good
place for such documentation?

> That aside, this looks good to me.
>
> ...
>


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
  2024-06-14 16:55   ` Aaron Conole
@ 2024-06-17  7:08     ` Adrián Moreno
  2024-06-17 11:26       ` Ilya Maximets
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-17  7:08 UTC (permalink / raw)
  To: Aaron Conole
  Cc: netdev, echaudro, horms, i.maximets, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Fri, Jun 14, 2024 at 12:55:59PM GMT, Aaron Conole wrote:
> Adrian Moreno <amorenoz@redhat.com> writes:
>
> > The behavior of actions might not be the exact same if they are being
> > executed inside a nested sample action. Store the probability of the
> > parent sample action in the skb's cb area.
>
> What does that mean?
>

The emit_sample action, for instance, needs the probability so that
psample consumers know what sampling rate was applied. Also, the way we
should inform about packet drops (via kfree_skb_reason) changes (see
patch 7/9).

> > Use the probability in emit_sample to pass it down to psample.
> >
> > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> > ---
> >  include/uapi/linux/openvswitch.h |  3 ++-
> >  net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
> >  net/openvswitch/datapath.h       |  3 +++
> >  net/openvswitch/vport.c          |  1 +
> >  4 files changed, 28 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> > index a0e9dde0584a..9d675725fa2b 100644
> > --- a/include/uapi/linux/openvswitch.h
> > +++ b/include/uapi/linux/openvswitch.h
> > @@ -649,7 +649,8 @@ enum ovs_flow_attr {
> >   * Actions are passed as nested attributes.
> >   *
> >   * Executes the specified actions with the given probability on a per-packet
> > - * basis.
> > + * basis. Nested actions will be able to access the probability value of the
> > + * parent @OVS_ACTION_ATTR_SAMPLE.
> >   */
> >  enum ovs_sample_attr {
> >  	OVS_SAMPLE_ATTR_UNSPEC,
> > diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> > index 3b4dba0ded59..33f6d93ba5e4 100644
> > --- a/net/openvswitch/actions.c
> > +++ b/net/openvswitch/actions.c
> > @@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >  	struct nlattr *sample_arg;
> >  	int rem = nla_len(attr);
> >  	const struct sample_arg *arg;
> > +	u32 init_probability;
> >  	bool clone_flow_key;
> > +	int err;
> >
> >  	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
> >  	sample_arg = nla_data(attr);
> >  	arg = nla_data(sample_arg);
> >  	actions = nla_next(sample_arg, &rem);
> > +	init_probability = OVS_CB(skb)->probability;
> >
> >  	if ((arg->probability != U32_MAX) &&
> >  	    (!arg->probability || get_random_u32() > arg->probability)) {
> > @@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >  		return 0;
> >  	}
> >
> > +	if (init_probability) {
> > +		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
> > +					    arg->probability / U32_MAX);
> > +	} else {
> > +		OVS_CB(skb)->probability = arg->probability;
> > +	}
> > +
>
> I'm confused by this.  Eventually, integer arithmetic will practically
> guarantee that nested sample() calls will go to 0.  So eventually, the
> test above will be impossible to meet mathematically.
>
> OTOH, you could argue that a 1% of 50% is low anyway, but it still would
> have a positive probability count, and still be possible for
> get_random_u32() call to match.
>

Using OVS's probability semantics, we can express probabilities as low
as (100/U32_MAX)%, which is pretty low indeed. However, I don't think a
low probability of executing the action is a reason not to report it.

Rethinking the integer arithmetic, it's true that we should avoid
hitting zero on the division, e.g. nesting 6x 1% sampling rates will make
the result zero, which will make probability restoration fail on the
way back. Therefore, the new probability should be at least 1.


> I'm not sure about this particular change.  Why do we need it?
>

Why do we need to propagate the probability down to nested "sample"
actions? Or why do we need to store the probability in the cb area in
the first place?

The former: Just for correctness, as only storing the last one would be
incorrect, although I don't know of any use for nested "sample" actions.
The latter: To pass it down to psample so that sample receivers know
the sampling rate that was applied (and can, e.g., do throughput
estimations like OVS does with IPFIX).


> >  	clone_flow_key = !arg->exec;
> > -	return clone_execute(dp, skb, key, 0, actions, rem, last,
> > -			     clone_flow_key);
> > +	err = clone_execute(dp, skb, key, 0, actions, rem, last,
> > +			    clone_flow_key);
> > +
> > +	if (!last)
>
> Is this right?  Don't we only want to set the probability on the last
> action?  Should the test be 'if (last)'?
>

This is restoring the parent's probability after the actions in the
current sample action have been executed.

If it was the last action, there is no need to restore the probability
back to the parent's (or zero if there's only one level) since no
further action will require it. And more importantly, if it's the last
action, the packet gets freed inside that "branch", so we must not
access its memory.
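
The save/restore pattern described above can be modelled in a few lines
of Python. This is a toy sketch under stated assumptions: Packet and
run_sample() are hypothetical names, the probability scaling for nested
levels is left out, and freeing the skb is only described in a comment.

```python
class Packet:
    """Toy stand-in for an skb: it only tracks the cb probability field."""

    def __init__(self):
        self.probability = 0  # 0 means "no sampling has occurred"


def run_sample(pkt, probability, nested_actions, last):
    """Model the save/restore around a sample action's nested actions."""
    saved = pkt.probability        # the parent's value (init_probability)
    pkt.probability = probability  # what the nested actions will observe
    for action in nested_actions:
        action(pkt)
    if not last:
        # More actions follow in the parent list, so put the parent's
        # value back. When 'last' is true the real skb has been consumed
        # by the nested branch and must not be touched again.
        pkt.probability = saved


seen = []
pkt = Packet()
run_sample(pkt, 1000, [lambda p: seen.append(p.probability)], last=False)
print(seen, pkt.probability)
```

The nested action sees the sample's probability, while the action that
follows the sample in the parent list sees the original value again.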


> > +		OVS_CB(skb)->probability = init_probability;
> > +
> > +	return err;
> >  }
> >
> >  /* When 'last' is true, clone() should always consume the 'skb'.
> > @@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> >  	struct psample_metadata md = {};
> >  	struct vport *input_vport;
> >  	const struct nlattr *a;
> > +	u32 rate;
> >  	int rem;
> >
> >  	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> > @@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> >
> >  	md.in_ifindex = input_vport->dev->ifindex;
> >  	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> > +	md.rate_as_probability = 1;
> > +
> > +	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
> >
> > -	psample_sample_packet(&psample_group, skb, 0, &md);
> > +	psample_sample_packet(&psample_group, skb, rate, &md);
> >  #endif
> >
> >  	return 0;
> > diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
> > index 0cd29971a907..9ca6231ea647 100644
> > --- a/net/openvswitch/datapath.h
> > +++ b/net/openvswitch/datapath.h
> > @@ -115,12 +115,15 @@ struct datapath {
> >   * fragmented.
> >   * @acts_origlen: The netlink size of the flow actions applied to this skb.
> >   * @cutlen: The number of bytes from the packet end to be removed.
> > + * @probability: The sampling probability that was applied to this skb; 0 means
> > + * no sampling has occurred; U32_MAX means 100% probability.
> >   */
> >  struct ovs_skb_cb {
> >  	struct vport		*input_vport;
> >  	u16			mru;
> >  	u16			acts_origlen;
> >  	u32			cutlen;
> > +	u32			probability;
> >  };
> >  #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
> >
> > diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> > index 972ae01a70f7..8732f6e51ae5 100644
> > --- a/net/openvswitch/vport.c
> > +++ b/net/openvswitch/vport.c
> > @@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
> >  	OVS_CB(skb)->input_vport = vport;
> >  	OVS_CB(skb)->mru = 0;
> >  	OVS_CB(skb)->cutlen = 0;
> > +	OVS_CB(skb)->probability = 0;
> >  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
> >  		u32 mark;
>



* Re: [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test
  2024-06-14 17:07   ` Aaron Conole
@ 2024-06-17  7:18     ` Adrián Moreno
  2024-06-18  9:08       ` Adrián Moreno
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-17  7:18 UTC (permalink / raw)
  To: Aaron Conole
  Cc: netdev, echaudro, horms, i.maximets, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, linux-kselftest, linux-kernel

On Fri, Jun 14, 2024 at 01:07:33PM GMT, Aaron Conole wrote:
> Adrian Moreno <amorenoz@redhat.com> writes:
>
> > Add a test to verify sampling packets via psample works.
> >
> > In order to do that, create a subcommand in ovs-dpctl.py to listen to
> > on the psample multicast group and print samples.
> >
> > In order to also test simultaneous sFlow and psample actions and
> > packet truncation, add missing parsing support for "userspace" and
> > "trunc" actions.
>
> Maybe split that into a separate patch.  This has a bugfix and 3
> features being pushed in.  I know it's already getting long as a series,
> so maybe it's okay to fold the userspace attribute bugfix with the parse
> support (since it wasn't really usable before).
>

OK. Sounds reasonable.

> > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> > ---
> >  .../selftests/net/openvswitch/openvswitch.sh  |  99 +++++++++++++++-
> >  .../selftests/net/openvswitch/ovs-dpctl.py    | 112 +++++++++++++++++-
> >  2 files changed, 204 insertions(+), 7 deletions(-)
> >
> > diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh
> > index 5cae53543849..f6e0ae3f6424 100755
> > --- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
> > +++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
> > @@ -20,7 +20,8 @@ tests="
> >  	nat_related_v4				ip4-nat-related: ICMP related matches work with SNAT
> >  	netlink_checks				ovsnl: validate netlink attrs and settings
> >  	upcall_interfaces			ovs: test the upcall interfaces
> > -	drop_reason				drop: test drop reasons are emitted"
> > +	drop_reason				drop: test drop reasons are emitted
> > +	emit_sample 				emit_sample: Sampling packets with psample"
> >
> >  info() {
> >      [ $VERBOSE = 0 ] || echo $*
> > @@ -170,6 +171,19 @@ ovs_drop_reason_count()
> >  	return `echo "$perf_output" | grep "$pattern" | wc -l`
> >  }
> >
> > +ovs_test_flow_fails () {
> > +	ERR_MSG="Flow actions may not be safe on all matching packets"
> > +
> > +	PRE_TEST=$(dmesg | grep -c "${ERR_MSG}")
> > +	ovs_add_flow $@ &> /dev/null $@ && return 1
> > +	POST_TEST=$(dmesg | grep -c "${ERR_MSG}")
> > +
> > +	if [ "$PRE_TEST" == "$POST_TEST" ]; then
> > +		return 1
> > +	fi
> > +	return 0
> > +}
> > +
> >  usage() {
> >  	echo
> >  	echo "$0 [OPTIONS] [TEST]..."
> > @@ -184,6 +198,89 @@ usage() {
> >  	exit 1
> >  }
> >
> > +
> > +# emit_sample test
> > +# - use emit_sample to observe packets
> > +test_emit_sample() {
> > +	sbx_add "test_emit_sample" || return $?
> > +
> > +	# Add a datapath with per-vport dispatching.
> > +	ovs_add_dp "test_emit_sample" emit_sample -V 2:1 || return 1
> > +
> > +	info "create namespaces"
> > +	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
> > +		client c0 c1 172.31.110.10/24 -u || return 1
> > +	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
> > +		server s0 s1 172.31.110.20/24 -u || return 1
> > +
> > +	# Check if emit_sample actions can be configured.
> > +	ovs_add_flow "test_emit_sample" emit_sample \
> > +	'in_port(1),eth(),eth_type(0x0806),arp()' 'emit_sample(group=1)'
> > +	if [ $? == 1 ]; then
> > +		info "no support for emit_sample - skipping"
> > +		ovs_exit_sig
> > +		return $ksft_skip
> > +	fi
> > +
> > +	ovs_del_flows "test_emit_sample" emit_sample
> > +
> > +	# Allow ARP
> > +	ovs_add_flow "test_emit_sample" emit_sample \
> > +		'in_port(1),eth(),eth_type(0x0806),arp()' '2' || return 1
> > +	ovs_add_flow "test_emit_sample" emit_sample \
> > +		'in_port(2),eth(),eth_type(0x0806),arp()' '1' || return 1
> > +
> > +	# Test action verification.
> > +	OLDIFS=$IFS
> > +	IFS='*'
> > +	min_key='in_port(1),eth(),eth_type(0x0800),ipv4()'
> > +	for testcase in \
> > +		"cookie to large"*"emit_sample(group=1,cookie=1615141312111009080706050403020100)" \
> > +		"no group with cookie"*"emit_sample(cookie=abcd)" \
> > +		"no group"*"sample()";
> > +	do
> > +		set -- $testcase;
> > +		ovs_test_flow_fails "test_emit_sample" emit_sample $min_key $2
> > +		if [ $? == 1 ]; then
> > +			info "failed - $1"
> > +			return 1
> > +		fi
> > +	done
> > +	IFS=$OLDIFS
> > +
> > +	# Sample first 14 bytes of all traffic.
> > +	ovs_add_flow "test_emit_sample" emit_sample \
> > +	"in_port(1),eth(),eth_type(0x0800),ipv4(src=172.31.110.10,proto=1),icmp()" "trunc(14),emit_sample(group=1,cookie=c0ffee),2"
> > +
> > +	# Sample all traffic. In this case, use a sample() action with both
> > +	# emit_sample and an upcall emulating simultaneous local sampling and
> > +	# sFlow / IPFIX.
> > +	nlpid=$(grep -E "listening on upcall packet handler" $ovs_dir/s0.out | cut -d ":" -f 2 | tr -d ' ')
> > +	ovs_add_flow "test_emit_sample" emit_sample \
> > +	"in_port(2),eth(),eth_type(0x0800),ipv4(src=172.31.110.20,proto=1),icmp()" "sample(sample=100%,actions(emit_sample(group=2,cookie=eeff0c),userspace(pid=${nlpid},userdata=eeff0c))),1"
> > +
> > +	# Record emit_sample data.
> > +	python3 $ovs_base/ovs-dpctl.py psample >$ovs_dir/psample.out 2>$ovs_dir/psample.err &
> > +	pid=$!
> > +	on_exit "ovs_sbx test_emit_sample kill -TERM $pid 2>/dev/null"
>
>   Maybe ovs_netns_spawn_daemon ?
>

I'll take a look at it, thanks.

> > +
> > +	# Send a single ping.
> > +	sleep 1
> > +	ovs_sbx "test_emit_sample" ip netns exec client ping -I c1 172.31.110.20 -c 1 || return 1
> > +	sleep 1
> > +
> > +	# We should have received one userspace action upcall and 2 psample packets.
> > +	grep -E "userspace action command" $ovs_dir/s0.out >/dev/null 2>&1 || return 1
> > +
> > +	# client -> server samples should only contain the first 14 bytes of the packet.
> > +	grep -E "rate:4294967295,group:1,cookie:c0ffee data:[0-9a-f]{28}$" \
> > +			 $ovs_dir/psample.out >/dev/null 2>&1 || return 1
> > +	grep -E "rate:4294967295,group:2,cookie:eeff0c" \
> > +			 $ovs_dir/psample.out >/dev/null 2>&1 || return 1
> > +
> > +	return 0
> > +}
> > +
> >  # drop_reason test
> >  # - drop packets and verify the right drop reason is reported
> >  test_drop_reason() {
> > diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> > index f8b5362aac8c..44fdeb9491a2 100644
> > --- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> > +++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py
> > @@ -27,8 +27,10 @@ try:
> >      from pyroute2.netlink import genlmsg
> >      from pyroute2.netlink import nla
> >      from pyroute2.netlink import nlmsg_atoms
> > -    from pyroute2.netlink.exceptions import NetlinkError
> > +    from pyroute2.netlink.event import EventSocket
> >      from pyroute2.netlink.generic import GenericNetlinkSocket
> > +    from pyroute2.netlink.nlsocket import Marshal
> > +    from pyroute2.netlink.exceptions import NetlinkError
>
> Why did this get moved?
>

I guess I first removed it and then added it in the wrong order. I'll
restore it, thanks.

> >      import pyroute2
> >
> >  except ModuleNotFoundError:
> > @@ -575,13 +577,27 @@ class ovsactions(nla):
> >                  print_str += "userdata="
> >                  for f in self.get_attr("OVS_USERSPACE_ATTR_USERDATA"):
> >                      print_str += "%x." % f
> > -            if self.get_attr("OVS_USERSPACE_ATTR_TUN_PORT") is not None:
> > +            if self.get_attr("OVS_USERSPACE_ATTR_EGRESS_TUN_PORT") is not None:
> >                  print_str += "egress_tun_port=%d" % self.get_attr(
> > -                    "OVS_USERSPACE_ATTR_TUN_PORT"
> > +                    "OVS_USERSPACE_ATTR_EGRESS_TUN_PORT"
>
> Looks like a bugfix here.
>

Yep. I'll fold it in with the rest of the userspace action support into
an independent patch.

> >                  )
> >              print_str += ")"
> >              return print_str
> >
> > +        def parse(self, actstr):
> > +            attrs_desc = (
> > +                ("pid", "OVS_USERSPACE_ATTR_PID", int),
> > +                ("userdata", "OVS_USERSPACE_ATTR_USERDATA",
> > +                    lambda x: list(bytearray.fromhex(x))),
> > +                ("egress_tun_port", "OVS_USERSPACE_ATTR_EGRESS_TUN_PORT", int)
> > +            )
> > +
> > +            attrs, actstr = parse_attrs(actstr, attrs_desc)
> > +            for attr in attrs:
> > +                self["attrs"].append(attr)
> > +
> > +            return actstr
> > +
> >      def dpstr(self, more=False):
> >          print_str = ""
> >
> > @@ -803,6 +819,25 @@ class ovsactions(nla):
> >                  self["attrs"].append(["OVS_ACTION_ATTR_EMIT_SAMPLE", emitact])
> >                  parsed = True
> >
> > +            elif parse_starts_block(actstr, "userspace(", False):
> > +                uact = self.userspace()
> > +                actstr = uact.parse(actstr[len("userpsace(") : ])
> > +                self["attrs"].append(["OVS_ACTION_ATTR_USERSPACE", uact])
> > +                parsed = True
> > +
> > +            elif parse_starts_block(actstr, "trunc", False):
>
> This should be "trunc("
>

Probably, yes. The rest of the actions do look for the initial "(".
Thinking about generalizing the action parsing, could we just look
for the action name, i.e. "trunc", and let errors be raised inside each
action's parsing logic? In this case, check if "val is not None" after
"parse_extract_field".

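The generalization floated above (match on the bare action name, then let
the action's own parser report malformed arguments) could look roughly
like the sketch below. parse_trunc() is a hypothetical stand-alone helper
written for this illustration; it does not use parse_extract_field() or
any other ovs-dpctl.py internals.

```python
import re


def parse_trunc(actstr):
    """Parse 'trunc(<n>)' at the start of actstr.

    Returns (remaining_string, value). The value is None when the string
    does not start with this action at all. Raises ValueError when the
    action name matches but the argument is malformed.
    """
    if not actstr.startswith("trunc"):
        return actstr, None  # not this action; let other parsers try
    m = re.match(r"trunc\(([0-9]+)\)", actstr)
    if m is None:
        raise ValueError("malformed trunc action: %r" % actstr)
    return actstr[m.end():], int(m.group(1))


print(parse_trunc("trunc(14),2"))
```

With this shape, a string such as "trunc14" produces an error that names
the offending action instead of falling through to a generic
unknown-action failure.
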
> > +                parencount += 1
> > +                actstr, val = parse_extract_field(
> > +                    actstr,
> > +                    "trunc(",
> > +                    r"([0-9]+)",
> > +                    int,
> > +                    False,
> > +                    None,
> > +                )
> > +                self["attrs"].append(["OVS_ACTION_ATTR_TRUNC", val])
> > +                parsed = True
> > +
> >              actstr = actstr[strspn(actstr, ", ") :]
> >              while parencount > 0:
> >                  parencount -= 1
> > @@ -2184,10 +2219,70 @@ class OvsFlow(GenericNetlinkSocket):
> >          print("MISS upcall[%d/%s]: %s" % (seq, pktpres, keystr), flush=True)
> >
> >      def execute(self, packetmsg):
> > -        print("userspace execute command")
> > +        print("userspace execute command", flush=True)
> >
> >      def action(self, packetmsg):
> > -        print("userspace action command")
> > +        print("userspace action command", flush=True)
> > +
> > +
> > +class psample_sample(genlmsg):
> > +    nla_map = (
> > +        ("PSAMPLE_ATTR_IIFINDEX", "none"),
> > +        ("PSAMPLE_ATTR_OIFINDEX", "none"),
> > +        ("PSAMPLE_ATTR_ORIGSIZE", "none"),
> > +        ("PSAMPLE_ATTR_SAMPLE_GROUP", "uint32"),
> > +        ("PSAMPLE_ATTR_GROUP_SEQ", "none"),
> > +        ("PSAMPLE_ATTR_SAMPLE_RATE", "uint32"),
> > +        ("PSAMPLE_ATTR_DATA", "array(uint8)"),
> > +        ("PSAMPLE_ATTR_GROUP_REFCOUNT", "none"),
> > +        ("PSAMPLE_ATTR_TUNNEL", "none"),
> > +        ("PSAMPLE_ATTR_PAD", "none"),
> > +        ("PSAMPLE_ATTR_OUT_TC", "none"),
> > +        ("PSAMPLE_ATTR_OUT_TC_OCC", "none"),
> > +        ("PSAMPLE_ATTR_LATENCY", "none"),
> > +        ("PSAMPLE_ATTR_TIMESTAMP", "none"),
> > +        ("PSAMPLE_ATTR_PROTO", "none"),
> > +        ("PSAMPLE_ATTR_USER_COOKIE", "array(uint8)"),
> > +    )
> > +
> > +    def dpstr(self):
> > +        fields = []
> > +        data = ""
> > +        for (attr, value) in self["attrs"]:
> > +            if attr == "PSAMPLE_ATTR_SAMPLE_GROUP":
> > +                fields.append("group:%d" % value)
> > +            if attr == "PSAMPLE_ATTR_SAMPLE_RATE":
> > +                fields.append("rate:%d" % value)
> > +            if attr == "PSAMPLE_ATTR_USER_COOKIE":
> > +                value = "".join(format(x, "02x") for x in value)
> > +                fields.append("cookie:%s" % value)
> > +            if attr == "PSAMPLE_ATTR_DATA" and len(value) > 0:
> > +                data = "data:%s" % "".join(format(x, "02x") for x in value)
> > +
> > +        return ("%s %s" % (",".join(fields), data)).strip()
> > +
> > +
> > +class psample_msg(Marshal):
> > +    PSAMPLE_CMD_SAMPLE = 0
> > +    PSAMPLE_CMD_GET_GROUP = 1
> > +    PSAMPLE_CMD_NEW_GROUP = 2
> > +    PSAMPLE_CMD_DEL_GROUP = 3
> > +    PSAMPLE_CMD_SET_FILTER = 4
> > +    msg_map = {PSAMPLE_CMD_SAMPLE: psample_sample}
> > +
> > +
> > +class Psample(EventSocket):
> > +    genl_family = "psample"
> > +    mcast_groups = ["packets"]
> > +    marshal_class = psample_msg
> > +
> > +    def read_samples(self):
> > +        while True:
> > +            try:
> > +                for msg in self.get():
> > +                    print(msg.dpstr(), flush=True)
> > +            except NetlinkError as ne:
> > +                raise ne
> >
> >
> >  def print_ovsdp_full(dp_lookup_rep, ifindex, ndb=NDB(), vpl=OvsVport()):
> > @@ -2247,7 +2342,7 @@ def main(argv):
> >          help="Increment 'verbose' output counter.",
> >          default=0,
> >      )
> > -    subparsers = parser.add_subparsers()
> > +    subparsers = parser.add_subparsers(dest="subcommand")
> >
> >      showdpcmd = subparsers.add_parser("show")
> >      showdpcmd.add_argument(
> > @@ -2304,6 +2399,8 @@ def main(argv):
> >      delfscmd = subparsers.add_parser("del-flows")
> >      delfscmd.add_argument("flsbr", help="Datapath name")
> >
> > +    subparsers.add_parser("psample")
> > +
> >      args = parser.parse_args()
> >
> >      if args.verbose > 0:
> > @@ -2318,6 +2415,9 @@ def main(argv):
> >
> >      sys.setrecursionlimit(100000)
> >
> > +    if args.subcommand == "psample":
> > +        Psample().read_samples()
> > +
> >      if hasattr(args, "showdp"):
> >          found = False
> >          for iface in ndb.interfaces:
>



* Re: [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample
  2024-06-03 18:56 ` [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample Adrian Moreno
  2024-06-14 16:14   ` Simon Horman
@ 2024-06-17 10:00   ` Ilya Maximets
  2024-06-18  7:38     ` Adrián Moreno
  1 sibling, 1 reply; 57+ messages in thread
From: Ilya Maximets @ 2024-06-17 10:00 UTC (permalink / raw)
  To: Adrian Moreno, netdev
  Cc: i.maximets, aconole, echaudro, horms, dev, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-kernel

On 6/3/24 20:56, Adrian Moreno wrote:
> If the action has a user_cookie, pass it along to the sample so it can
> be easily identified.
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  net/sched/act_sample.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
> 
> diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
> index a69b53d54039..5c3f86ec964a 100644
> --- a/net/sched/act_sample.c
> +++ b/net/sched/act_sample.c
> @@ -165,9 +165,11 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
>  				     const struct tc_action *a,
>  				     struct tcf_result *res)
>  {
> +	u8 cookie_data[TC_COOKIE_MAX_SIZE] = {};

Is it necessary to initialize these 16 bytes on every call?
Might be expensive.  We're passing the data length around,
so the uninitialized parts should not be accessed.

Best regards, Ilya Maximets.

>  	struct tcf_sample *s = to_sample(a);
>  	struct psample_group *psample_group;
>  	struct psample_metadata md = {};
> +	struct tc_cookie *user_cookie;
>  	int retval;
>  
>  	tcf_lastuse_update(&s->tcf_tm);
> @@ -189,6 +191,16 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
>  		if (skb_at_tc_ingress(skb) && tcf_sample_dev_ok_push(skb->dev))
>  			skb_push(skb, skb->mac_len);
>  
> +		rcu_read_lock();
> +		user_cookie = rcu_dereference(a->user_cookie);
> +		if (user_cookie) {
> +			memcpy(cookie_data, user_cookie->data,
> +			       user_cookie->len);
> +			md.user_cookie = cookie_data;
> +			md.user_cookie_len = user_cookie->len;
> +		}
> +		rcu_read_unlock();
> +
>  		md.trunc_size = s->truncate ? s->trunc_size : skb->len;
>  		psample_sample_packet(psample_group, skb, s->rate, &md);
>  



* Re: [PATCH net-next v2 4/9] net: psample: allow using rate as probability
  2024-06-17  6:32     ` Adrián Moreno
@ 2024-06-17 10:30       ` Simon Horman
  0 siblings, 0 replies; 57+ messages in thread
From: Simon Horman @ 2024-06-17 10:30 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: netdev, aconole, echaudro, i.maximets, dev, Yotam Gigi,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, linux-kernel

On Mon, Jun 17, 2024 at 06:32:14AM +0000, Adrián Moreno wrote:
> On Fri, Jun 14, 2024 at 05:11:30PM GMT, Simon Horman wrote:
> > On Mon, Jun 03, 2024 at 08:56:38PM +0200, Adrian Moreno wrote:
> > > Although not explicitly documented in the psample module itself, the
> > > definition of PSAMPLE_ATTR_SAMPLE_RATE seems inherited from act_sample.
> > >
> > > Quoting tc-sample(8):
> > > "RATE of 100 will lead to an average of one sampled packet out of every
> > > 100 observed."
> > >
> > > With this semantics, the rates that we can express with an unsigned
> > > 32-bits number are very unevenly distributed and concentrated towards
> > > "sampling few packets".
> > > For example, we can express a probability of 2.32E-8% but we
> > > cannot express anything between 100% and 50%.
> > >
> > > For sampling applications that are capable of sampling a decent
> > > amount of packets, this sampling rate semantics is not very useful.
> > >
> > > Add a new flag to the uAPI that indicates that the sampling rate is
> > > expressed in scaled probability, this is:
> > > - 0 is 0% probability, no packets get sampled.
> > > - U32_MAX is 100% probability, all packets get sampled.
> > >
> > > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >
> > Hi Adrian,
> >
> > Would it be possible to add appropriate documentation for
> > rate - both the original ratio variant, and the new probability
> > variant - somewhere?
> >
> 
> Hi Simon, thanks for the suggestion. Would the uapi header be a good
> place for such documentation?

Hi Adrian,

I didn't look closely, but that does sound like a good place to me.
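
As input for that documentation, the two interpretations of the rate
value quoted above can be compared with a short Python sketch. The helper
names are made up for the illustration and are not part of psample or its
uAPI.

```python
U32_MAX = 0xFFFFFFFF


def ratio_to_fraction(rate: int) -> float:
    """Original semantics: on average one packet out of every `rate`
    observed is sampled. `rate` must be greater than 0 in this sketch."""
    return 1.0 / rate


def probability_to_fraction(rate: int) -> float:
    """New semantics: 0 is 0% probability, U32_MAX is 100% probability."""
    return rate / U32_MAX


# Ratio semantics: the two largest expressible values are 100% and 50%.
print([ratio_to_fraction(n) for n in (1, 2, 3)])

# Probability semantics: values in between, such as roughly 75%, exist.
three_quarters = U32_MAX // 4 * 3
print(probability_to_fraction(three_quarters))

# Smallest non-zero ratio, as a percentage (the 2.32E-8% quoted above).
print(ratio_to_fraction(U32_MAX) * 100)
```

The ratio form spends almost all of its range on very low sampling
rates, while the scaled form spreads the 32-bit range evenly between 0%
and 100%.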


* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-03 18:56 ` [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action Adrian Moreno
                     ` (3 preceding siblings ...)
  2024-06-14 16:13   ` Simon Horman
@ 2024-06-17 10:44   ` Ilya Maximets
  2024-06-18  7:33     ` Adrián Moreno
  4 siblings, 1 reply; 57+ messages in thread
From: Ilya Maximets @ 2024-06-17 10:44 UTC (permalink / raw)
  To: Adrian Moreno, netdev
  Cc: i.maximets, aconole, echaudro, horms, dev, Donald Hunter,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Pravin B Shelar, linux-kernel

On 6/3/24 20:56, Adrian Moreno wrote:
> Add support for a new action: emit_sample.
> 
> This action accepts a u32 group id and a variable-length cookie and uses
> the psample multicast group to make the packet available for
> observability.
> 
> The maximum length of the user-defined cookie is set to 16, same as
> tc_cookie, to discourage using cookies that will not be offloadable.
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  Documentation/netlink/specs/ovs_flow.yaml | 17 ++++++++
>  include/uapi/linux/openvswitch.h          | 25 ++++++++++++
>  net/openvswitch/actions.c                 | 50 +++++++++++++++++++++++
>  net/openvswitch/flow_netlink.c            | 33 ++++++++++++++-
>  4 files changed, 124 insertions(+), 1 deletion(-)

Some nits below, besides the ones already mentioned.

> 
> diff --git a/Documentation/netlink/specs/ovs_flow.yaml b/Documentation/netlink/specs/ovs_flow.yaml
> index 4fdfc6b5cae9..a7ab5593a24f 100644
> --- a/Documentation/netlink/specs/ovs_flow.yaml
> +++ b/Documentation/netlink/specs/ovs_flow.yaml
> @@ -727,6 +727,12 @@ attribute-sets:
>          name: dec-ttl
>          type: nest
>          nested-attributes: dec-ttl-attrs
> +      -
> +        name: emit-sample
> +        type: nest
> +        nested-attributes: emit-sample-attrs
> +        doc: |
> +          Sends a packet sample to psample for external observation.
>    -
>      name: tunnel-key-attrs
>      enum-name: ovs-tunnel-key-attr
> @@ -938,6 +944,17 @@ attribute-sets:
>        -
>          name: gbp
>          type: u32
> +  -
> +    name: emit-sample-attrs
> +    enum-name: ovs-emit-sample-attr
> +    name-prefix: ovs-emit-sample-attr-
> +    attributes:
> +      -
> +        name: group
> +        type: u32
> +      -
> +        name: cookie
> +        type: binary
>  
>  operations:
>    name-prefix: ovs-flow-cmd-
> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> index efc82c318fa2..a0e9dde0584a 100644
> --- a/include/uapi/linux/openvswitch.h
> +++ b/include/uapi/linux/openvswitch.h
> @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
>  };
>  #endif
>  
> +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
> +/**
> + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
> + * action.
> + *
> + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
> + * sample.
> + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
> + * user-defined metadata. The maximum length is 16 bytes.

s/16/OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE/

> + *
> + * Sends the packet to the psample multicast group with the specified group and
> + * cookie. It is possible to combine this action with the
> + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
> + */
> +enum ovs_emit_sample_attr {
> +	OVS_EMIT_SAMPLE_ATTR_UNPSEC,
> +	OVS_EMIT_SAMPLE_ATTR_GROUP,	/* u32 number. */
> +	OVS_EMIT_SAMPLE_ATTR_COOKIE,	/* Optional, user specified cookie. */
> +	__OVS_EMIT_SAMPLE_ATTR_MAX
> +};
> +
> +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
> +
> +
>  /**
>   * enum ovs_action_attr - Action types.
>   *
> @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
>  	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
>  	OVS_ACTION_ATTR_DEC_TTL,      /* Nested OVS_DEC_TTL_ATTR_*. */
>  	OVS_ACTION_ATTR_DROP,         /* u32 error code. */
> +	OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
>  
>  	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
>  				       * from userspace. */
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 964225580824..3b4dba0ded59 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -24,6 +24,11 @@
>  #include <net/checksum.h>
>  #include <net/dsfield.h>
>  #include <net/mpls.h>
> +
> +#if IS_ENABLED(CONFIG_PSAMPLE)
> +#include <net/psample.h>
> +#endif
> +
>  #include <net/sctp/checksum.h>
>  
>  #include "datapath.h"
> @@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct sw_flow_key *key)
>  	return 0;
>  }
>  
> +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> +			       const struct sw_flow_key *key,
> +			       const struct nlattr *attr)
> +{
> +#if IS_ENABLED(CONFIG_PSAMPLE)
> +	struct psample_group psample_group = {};
> +	struct psample_metadata md = {};
> +	struct vport *input_vport;
> +	const struct nlattr *a;
> +	int rem;
> +
> +	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> +	     a = nla_next(a, &rem)) {

Since the action is strictly validated, can we use nla_for_each_attr()
or nla_for_each_nested()?
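
For readers less familiar with the netlink attribute layout, the walk
that such a helper performs can be modelled in userspace Python. This is
a sketch, not the kernel macro: pack_attr() and iter_nested() are
hypothetical names, and the type values 1 and 2 are taken from the
position of GROUP and COOKIE in the quoted enum.

```python
import struct

NLA_HDRLEN = 4  # u16 nla_len followed by u16 nla_type


def nla_align(n):
    """Round up to the 4-byte netlink attribute alignment."""
    return (n + 3) & ~3


def pack_attr(nla_type, payload):
    """Build one attribute: header, payload, then padding."""
    length = NLA_HDRLEN + len(payload)
    pad = nla_align(length) - length
    return struct.pack("=HH", length, nla_type) + payload + b"\0" * pad


def iter_nested(buf):
    """Yield (type, payload) for each attribute in a nested payload."""
    off = 0
    while off + NLA_HDRLEN <= len(buf):
        length, nla_type = struct.unpack_from("=HH", buf, off)
        if length < NLA_HDRLEN or off + length > len(buf):
            break  # malformed attribute; stop walking
        yield nla_type, buf[off + NLA_HDRLEN:off + length]
        off += nla_align(length)


GROUP, COOKIE = 1, 2  # order of the quoted enum ovs_emit_sample_attr
nested = pack_attr(GROUP, struct.pack("=I", 1)) + pack_attr(
    COOKIE, bytes.fromhex("c0ffee")
)
attrs = list(iter_nested(nested))
print(attrs)
```

The open-coded loop in the patch and the iteration helpers perform the
same walk; the helpers just hide the length and alignment bookkeeping.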

> +		switch (nla_type(a)) {
> +		case OVS_EMIT_SAMPLE_ATTR_GROUP:
> +			psample_group.group_num = nla_get_u32(a);
> +			break;
> +
> +		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
> +			md.user_cookie = nla_data(a);
> +			md.user_cookie_len = nla_len(a);
> +			break;
> +		}
> +	}
> +
> +	psample_group.net = ovs_dp_get_net(dp);
> +
> +	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
> +	if (!input_vport)
> +		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);

We may need to check that we actually found the local port.

> +
> +	md.in_ifindex = input_vport->dev->ifindex;
> +	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> +
> +	psample_sample_packet(&psample_group, skb, 0, &md);
> +#endif
> +
> +	return 0;
> +}
> +
>  /* Execute a list of actions against 'skb'. */
>  static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>  			      struct sw_flow_key *key,
> @@ -1502,6 +1547,11 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>  			ovs_kfree_skb_reason(skb, reason);
>  			return 0;
>  		}
> +
> +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> +			err = execute_emit_sample(dp, skb, key, a);
> +			OVS_CB(skb)->cutlen = 0;
> +			break;
>  		}
>  
>  		if (unlikely(err)) {
> diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> index f224d9bcea5e..eb59ff9c8154 100644
> --- a/net/openvswitch/flow_netlink.c
> +++ b/net/openvswitch/flow_netlink.c
> @@ -64,6 +64,7 @@ static bool actions_may_change_flow(const struct nlattr *actions)
>  		case OVS_ACTION_ATTR_TRUNC:
>  		case OVS_ACTION_ATTR_USERSPACE:
>  		case OVS_ACTION_ATTR_DROP:
> +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
>  			break;
>  
>  		case OVS_ACTION_ATTR_CT:
> @@ -2409,7 +2410,7 @@ static void ovs_nla_free_nested_actions(const struct nlattr *actions, int len)
>  	/* Whenever new actions are added, the need to update this
>  	 * function should be considered.
>  	 */
> -	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 24);
> +	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 25);
>  
>  	if (!actions)
>  		return;
> @@ -3157,6 +3158,29 @@ static int validate_and_copy_check_pkt_len(struct net *net,
>  	return 0;
>  }
>  
> +static int validate_emit_sample(const struct nlattr *attr)
> +{
> +	static const struct nla_policy policy[OVS_EMIT_SAMPLE_ATTR_MAX + 1] = {
> +		[OVS_EMIT_SAMPLE_ATTR_GROUP] = { .type = NLA_U32 },
> +		[OVS_EMIT_SAMPLE_ATTR_COOKIE] = {
> +			.type = NLA_BINARY,
> +			.len = OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE

Maybe add a trailing comma here as well, since it's not a one-line definition.
Just in case.

> +		},
> +	};
> +	struct nlattr *a[OVS_EMIT_SAMPLE_ATTR_MAX  + 1];

One too many spaces                              ^^

> +	int err;
> +
> +	if (!IS_ENABLED(CONFIG_PSAMPLE))
> +		return -EOPNOTSUPP;
> +
> +	err = nla_parse_nested(a, OVS_EMIT_SAMPLE_ATTR_MAX, attr, policy,
> +			       NULL);
> +	if (err)
> +		return err;
> +
> +	return a[OVS_EMIT_SAMPLE_ATTR_GROUP] ? 0 : -EINVAL;
> +}
> +
>  static int copy_action(const struct nlattr *from,
>  		       struct sw_flow_actions **sfa, bool log)
>  {
> @@ -3212,6 +3236,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
>  			[OVS_ACTION_ATTR_ADD_MPLS] = sizeof(struct ovs_action_add_mpls),
>  			[OVS_ACTION_ATTR_DEC_TTL] = (u32)-1,
>  			[OVS_ACTION_ATTR_DROP] = sizeof(u32),
> +			[OVS_ACTION_ATTR_EMIT_SAMPLE] = (u32)-1,
>  		};
>  		const struct ovs_action_push_vlan *vlan;
>  		int type = nla_type(a);
> @@ -3490,6 +3515,12 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
>  				return -EINVAL;
>  			break;
>  
> +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> +			err = validate_emit_sample(a);
> +			if (err)
> +				return err;
> +			break;
> +
>  		default:
>  			OVS_NLERR(log, "Unknown Action type %d", type);
>  			return -EINVAL;



* Re: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
  2024-06-17  7:08     ` Adrián Moreno
@ 2024-06-17 11:26       ` Ilya Maximets
  2024-06-18  7:36         ` Adrián Moreno
  0 siblings, 1 reply; 57+ messages in thread
From: Ilya Maximets @ 2024-06-17 11:26 UTC (permalink / raw)
  To: Adrián Moreno, Aaron Conole
  Cc: i.maximets, netdev, echaudro, horms, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On 6/17/24 09:08, Adrián Moreno wrote:
> On Fri, Jun 14, 2024 at 12:55:59PM GMT, Aaron Conole wrote:
>> Adrian Moreno <amorenoz@redhat.com> writes:
>>
>>> The behavior of actions might not be the exact same if they are being
>>> executed inside a nested sample action. Store the probability of the
>>> parent sample action in the skb's cb area.
>>
>> What does that mean?
>>
> 
> Emit action, for instance, needs the probability so that psample
> consumers know what was the sampling rate applied. Also, the way we
> should inform about packet drops (via kfree_skb_reason) changes (see
> patch 7/9).
> 
>>> Use the probability in emit_sample to pass it down to psample.
>>>
>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>> ---
>>>  include/uapi/linux/openvswitch.h |  3 ++-
>>>  net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
>>>  net/openvswitch/datapath.h       |  3 +++
>>>  net/openvswitch/vport.c          |  1 +
>>>  4 files changed, 28 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
>>> index a0e9dde0584a..9d675725fa2b 100644
>>> --- a/include/uapi/linux/openvswitch.h
>>> +++ b/include/uapi/linux/openvswitch.h
>>> @@ -649,7 +649,8 @@ enum ovs_flow_attr {
>>>   * Actions are passed as nested attributes.
>>>   *
>>>   * Executes the specified actions with the given probability on a per-packet
>>> - * basis.
>>> + * basis. Nested actions will be able to access the probability value of the
>>> + * parent @OVS_ACTION_ATTR_SAMPLE.
>>>   */
>>>  enum ovs_sample_attr {
>>>  	OVS_SAMPLE_ATTR_UNSPEC,
>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>> index 3b4dba0ded59..33f6d93ba5e4 100644
>>> --- a/net/openvswitch/actions.c
>>> +++ b/net/openvswitch/actions.c
>>> @@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>  	struct nlattr *sample_arg;
>>>  	int rem = nla_len(attr);
>>>  	const struct sample_arg *arg;
>>> +	u32 init_probability;
>>>  	bool clone_flow_key;
>>> +	int err;
>>>
>>>  	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
>>>  	sample_arg = nla_data(attr);
>>>  	arg = nla_data(sample_arg);
>>>  	actions = nla_next(sample_arg, &rem);
>>> +	init_probability = OVS_CB(skb)->probability;
>>>
>>>  	if ((arg->probability != U32_MAX) &&
>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>> @@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>  		return 0;
>>>  	}
>>>
>>> +	if (init_probability) {
>>> +		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
>>> +					    arg->probability / U32_MAX);
>>> +	} else {
>>> +		OVS_CB(skb)->probability = arg->probability;
>>> +	}
>>> +
>>
>> I'm confused by this.  Eventually, integer arithmetic will practically
>> guarantee that nested sample() calls will go to 0.  So eventually, the
>> test above will be impossible to meet mathematically.
>>
>> OTOH, you could argue that a 1% of 50% is low anyway, but it still would
>> have a positive probability count, and still be possible for
>> get_random_u32() call to match.
>>
> 
> Using OVS's probability semantics, we can express probabilities as low
> as (100/U32_MAX)% which is pretty low indeed. However, just because the
> probability of executing the action is low I don't think we should not
> report it.
> 
> Rethinking the integer arithmetic, it's true that we should avoid
> hitting zero on the division, e.g. nesting 6x 1% sampling rates will make
> the result zero, which will make probability restoration fail on the
> way back. Therefore, the new probability should be at least 1.
> 
> 
>> I'm not sure about this particular change.  Why do we need it?
>>
> 
> Why do we need to propagate the probability down to nested "sample"
> actions? or why do we need to store the probability in the cb area in
> the first place?
> 
> The former: Just for correctness as only storing the last one would be
> incorrect. Although I don't know of any use for nested "sample" actions.

I think, we can drop this for now.  All the user interfaces specify
the probability per action.  So, it should be fine to report the
probability of the action that emitted the sample without taking into
account the whole timeline of that packet.  Besides, a packet can leave
OVS and come back, losing the metadata, so it will not actually be a
full solution anyway.  Single-action metadata is easier to define.
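
As a side note on the integer arithmetic discussed above, the collapse to
zero is easy to reproduce in userspace. The sketch below is not the kernel
code: toy_scale() copies the expression from the quoted hunk, while
toy_scale_clamped() and toy_nest() are hypothetical helpers added only to
show the "at least 1" idea.

```c
#include <stdbool.h>
#include <stdint.h>

/* Combine a parent probability with a nested one, as in the quoted hunk.
 * 0 means "no sampling so far"; UINT32_MAX means 100%.
 */
static uint32_t toy_scale(uint32_t parent, uint32_t child)
{
	if (!parent)
		return child;
	return (uint32_t)((uint64_t)parent * child / UINT32_MAX);
}

/* Variant that never lets a non-zero product collapse to 0. */
static uint32_t toy_scale_clamped(uint32_t parent, uint32_t child)
{
	uint32_t p = toy_scale(parent, child);

	if (parent && child && !p)
		p = 1;
	return p;
}

/* Probability seen by the innermost action after 'depth' nested levels. */
static uint32_t toy_nest(uint32_t rate, int depth, bool clamp)
{
	uint32_t p = 0;
	int i;

	for (i = 0; i < depth; i++)
		p = clamp ? toy_scale_clamped(p, rate) : toy_scale(p, rate);
	return p;
}
```

With rate = UINT32_MAX / 100 the unclamped value shrinks by roughly two
decimal digits per level until it truncates to 0, at which point the next
level would read it as "no sampling so far"; the clamped variant stays at 1.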

> The latter: To pass it down to psample so that sample receivers know what
> sampling rate was applied (and, e.g., do throughput estimations like OVS
> does with IPFIX).
> 
> 
>>>  	clone_flow_key = !arg->exec;
>>> -	return clone_execute(dp, skb, key, 0, actions, rem, last,
>>> -			     clone_flow_key);
>>> +	err = clone_execute(dp, skb, key, 0, actions, rem, last,
>>> +			    clone_flow_key);
>>> +
>>> +	if (!last)
>>
>> Is this right?  Don't we only want to set the probability on the last
>> action?  Should the test be 'if (last)'?
>>
> 
> This is restoring the parent's probability after the actions in the
> current sample action have been executed.
> 
> If it was the last action there is no need to restore the probability
> back to the parent's (or zero if there's only one level) since no
> further action will require it. And more importantly, if it's the last
> action, the packet gets free'ed inside that "branch" so we must not
> access its memory.
> 
> 
>>> +		OVS_CB(skb)->probability = init_probability;
>>> +
>>> +	return err;
>>>  }
>>>
>>>  /* When 'last' is true, clone() should always consume the 'skb'.
>>> @@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>>>  	struct psample_metadata md = {};
>>>  	struct vport *input_vport;
>>>  	const struct nlattr *a;
>>> +	u32 rate;
>>>  	int rem;
>>>
>>>  	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
>>> @@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>>>
>>>  	md.in_ifindex = input_vport->dev->ifindex;
>>>  	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
>>> +	md.rate_as_probability = 1;
>>> +
>>> +	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
>>>
>>> -	psample_sample_packet(&psample_group, skb, 0, &md);
>>> +	psample_sample_packet(&psample_group, skb, rate, &md);
>>>  #endif
>>>
>>>  	return 0;
>>> diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
>>> index 0cd29971a907..9ca6231ea647 100644
>>> --- a/net/openvswitch/datapath.h
>>> +++ b/net/openvswitch/datapath.h
>>> @@ -115,12 +115,15 @@ struct datapath {
>>>   * fragmented.
>>>   * @acts_origlen: The netlink size of the flow actions applied to this skb.
>>>   * @cutlen: The number of bytes from the packet end to be removed.
>>> + * @probability: The sampling probability that was applied to this skb; 0 means
>>> + * no sampling has occurred; U32_MAX means 100% probability.
>>>   */
>>>  struct ovs_skb_cb {
>>>  	struct vport		*input_vport;
>>>  	u16			mru;
>>>  	u16			acts_origlen;
>>>  	u32			cutlen;
>>> +	u32			probability;
>>>  };
>>>  #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
>>>
>>> diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
>>> index 972ae01a70f7..8732f6e51ae5 100644
>>> --- a/net/openvswitch/vport.c
>>> +++ b/net/openvswitch/vport.c
>>> @@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
>>>  	OVS_CB(skb)->input_vport = vport;
>>>  	OVS_CB(skb)->mru = 0;
>>>  	OVS_CB(skb)->cutlen = 0;
>>> +	OVS_CB(skb)->probability = 0;
>>>  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
>>>  		u32 mark;
>>
> 


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-03 18:56 ` [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample Adrian Moreno
  2024-06-14 16:17   ` Simon Horman
@ 2024-06-17 11:55   ` Ilya Maximets
  2024-06-17 12:10     ` Ilya Maximets
  1 sibling, 1 reply; 57+ messages in thread
From: Ilya Maximets @ 2024-06-17 11:55 UTC (permalink / raw)
  To: Adrian Moreno, netdev
  Cc: i.maximets, aconole, echaudro, horms, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On 6/3/24 20:56, Adrian Moreno wrote:
> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> observability-oriented.
> 
> Apart from some corner case in which it's used as a replacement for clone()
> for old kernels, it's really only used for sFlow, IPFIX and now,
> local emit_sample.
> 
> With this in mind, it doesn't make much sense to report
> OVS_DROP_LAST_ACTION inside sample actions.
> 
> For instance, if the flow:
> 
>   actions:sample(..,emit_sample(..)),2
> 
> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> confusing for users since the packet did reach its destination.
> 
> This patch makes internal action execution silently consume the skb
> instead of notifying a drop for this case.
> 
> Unfortunately, this patch does not remove all potential sources of
> confusion since, if the sample action itself is the last action, e.g:
> 
>     actions:sample(..,emit_sample(..))
> 
> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> 
> Sadly, this case is difficult to solve without breaking the
> optimization by which the skb is not cloned on last sample actions.
> But, given explicit drop actions are now supported, OVS can just add one
> after the last sample() and rewrite the flow as:
> 
>     actions:sample(..,emit_sample(..)),drop
> 
> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  net/openvswitch/actions.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index 33f6d93ba5e4..54fc1abcff95 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>  static struct action_flow_keys __percpu *flow_keys;
>  static DEFINE_PER_CPU(int, exec_actions_level);
>  
> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> +{
> +	/* Do not emit packet drops inside sample(). */
> +	if (OVS_CB(skb)->probability)
> +		consume_skb(skb);
> +	else
> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> +}
> +
>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>   * space. Return NULL if out of key spaces.
>   */
> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>  	if ((arg->probability != U32_MAX) &&
>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>  		if (last)
> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> +			ovs_drop_skb_last_action(skb);
>  		return 0;
>  	}
>  
> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>  		}
>  	}
>  
> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> +	ovs_drop_skb_last_action(skb);

I don't think I agree with this one.  If we have a sample() action with
a lot of different actions inside and we reached the end while the last
action didn't consume the skb, then we should report that.  E.g.
"sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.

The only actions that are actually consuming the skb are "output",
"userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
consuming the skb "naturally" by stealing it when it is the last action.
"userspace" has an explicit check to consume the skb if it is the last
action.  "emit_sample" should have a similar check.  It should likely
be added at the point of action introduction instead of having a separate
patch.
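
To make the rule above concrete, here is a toy userspace model of it. It is
only a sketch: the enum and both functions are hypothetical, it looks at a
single flat action list (the list inside one sample(), applied to the cloned
skb), and it encodes nothing beyond "the last action decides whether the skb
was consumed".

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_action {
	TOY_OUTPUT,
	TOY_USERSPACE,
	TOY_RECIRC,
	TOY_EMIT_SAMPLE,
	TOY_PUSH_VLAN,
	TOY_SET,
};

/* Actions that take ownership of the skb when they come last. */
static bool toy_consumes_when_last(enum toy_action a)
{
	switch (a) {
	case TOY_OUTPUT:
	case TOY_USERSPACE:
	case TOY_RECIRC:
	case TOY_EMIT_SAMPLE:
		return true;
	default:
		return false;
	}
}

/* True if reaching the end of the list should be reported as a drop. */
static bool toy_reports_last_action_drop(const enum toy_action *acts, size_t n)
{
	if (n == 0)
		return true;	/* empty list: nothing consumed the packet */
	return !toy_consumes_when_last(acts[n - 1]);
}
```

Under this model the list "emit_sample(),push_vlan(),set(eth())" reports a
drop, while "push_vlan(),emit_sample()" does not.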

Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-17 11:55   ` Ilya Maximets
@ 2024-06-17 12:10     ` Ilya Maximets
  2024-06-18  7:00       ` Adrián Moreno
  0 siblings, 1 reply; 57+ messages in thread
From: Ilya Maximets @ 2024-06-17 12:10 UTC (permalink / raw)
  To: Adrian Moreno, netdev
  Cc: i.maximets, aconole, echaudro, horms, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On 6/17/24 13:55, Ilya Maximets wrote:
> On 6/3/24 20:56, Adrian Moreno wrote:
>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>> observability-oriented.
>>
>> Apart from some corner case in which it's used as a replacement for clone()
>> for old kernels, it's really only used for sFlow, IPFIX and now,
>> local emit_sample.
>>
>> With this in mind, it doesn't make much sense to report
>> OVS_DROP_LAST_ACTION inside sample actions.
>>
>> For instance, if the flow:
>>
>>   actions:sample(..,emit_sample(..)),2
>>
>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>> confusing for users since the packet did reach its destination.
>>
>> This patch makes internal action execution silently consume the skb
>> instead of notifying a drop for this case.
>>
>> Unfortunately, this patch does not remove all potential sources of
>> confusion since, if the sample action itself is the last action, e.g:
>>
>>     actions:sample(..,emit_sample(..))
>>
>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>
>> Sadly, this case is difficult to solve without breaking the
>> optimization by which the skb is not cloned on last sample actions.
>> But, given explicit drop actions are now supported, OVS can just add one
>> after the last sample() and rewrite the flow as:
>>
>>     actions:sample(..,emit_sample(..)),drop
>>
>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>> ---
>>  net/openvswitch/actions.c | 13 +++++++++++--
>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>> index 33f6d93ba5e4..54fc1abcff95 100644
>> --- a/net/openvswitch/actions.c
>> +++ b/net/openvswitch/actions.c
>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>  static struct action_flow_keys __percpu *flow_keys;
>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>  
>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>> +{
>> +	/* Do not emit packet drops inside sample(). */
>> +	if (OVS_CB(skb)->probability)
>> +		consume_skb(skb);
>> +	else
>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>> +}
>> +
>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>   * space. Return NULL if out of key spaces.
>>   */
>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>  	if ((arg->probability != U32_MAX) &&
>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>  		if (last)
>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>> +			ovs_drop_skb_last_action(skb);

Always consuming the skb at this point makes sense, since having sample()
as a last action is a reasonable thing to have.  But this looks more like
a fix for the original drop reason patch set.

>>  		return 0;
>>  	}
>>  
>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>  		}
>>  	}
>>  
>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>> +	ovs_drop_skb_last_action(skb);
> 
> I don't think I agree with this one.  If we have a sample() action with
> a lot of different actions inside and we reached the end while the last
> action didn't consume the skb, then we should report that.  E.g.
> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> 
> The only actions that are actually consuming the skb are "output",
> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> consuming the skb "naturally" by stealing it when it is the last action.
> "userspace" has an explicit check to consume the skb if it is the last
> action.  "emit_sample" should have a similar check.  It should likely
> be added at the point of action introduction instead of having a separate
> patch.
> 
> Best regards, Ilya Maximets.


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-17 12:10     ` Ilya Maximets
@ 2024-06-18  7:00       ` Adrián Moreno
  2024-06-18 10:22         ` Ilya Maximets
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-18  7:00 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: netdev, aconole, echaudro, horms, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
> On 6/17/24 13:55, Ilya Maximets wrote:
> > On 6/3/24 20:56, Adrian Moreno wrote:
> >> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> >> observability-oriented.
> >>
> >> Apart from some corner case in which it's used as a replacement for clone()
> >> for old kernels, it's really only used for sFlow, IPFIX and now,
> >> local emit_sample.
> >>
> >> With this in mind, it doesn't make much sense to report
> >> OVS_DROP_LAST_ACTION inside sample actions.
> >>
> >> For instance, if the flow:
> >>
> >>   actions:sample(..,emit_sample(..)),2
> >>
> >> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> >> confusing for users since the packet did reach its destination.
> >>
> >> This patch makes internal action execution silently consume the skb
> >> instead of notifying a drop for this case.
> >>
> >> Unfortunately, this patch does not remove all potential sources of
> >> confusion since, if the sample action itself is the last action, e.g:
> >>
> >>     actions:sample(..,emit_sample(..))
> >>
> >> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> >>
> >> Sadly, this case is difficult to solve without breaking the
> >> optimization by which the skb is not cloned on last sample actions.
> >> But, given explicit drop actions are now supported, OVS can just add one
> >> after the last sample() and rewrite the flow as:
> >>
> >>     actions:sample(..,emit_sample(..)),drop
> >>
> >> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >> ---
> >>  net/openvswitch/actions.c | 13 +++++++++++--
> >>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >> index 33f6d93ba5e4..54fc1abcff95 100644
> >> --- a/net/openvswitch/actions.c
> >> +++ b/net/openvswitch/actions.c
> >> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
> >>  static struct action_flow_keys __percpu *flow_keys;
> >>  static DEFINE_PER_CPU(int, exec_actions_level);
> >>
> >> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> >> +{
> >> +	/* Do not emit packet drops inside sample(). */
> >> +	if (OVS_CB(skb)->probability)
> >> +		consume_skb(skb);
> >> +	else
> >> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >> +}
> >> +
> >>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
> >>   * space. Return NULL if out of key spaces.
> >>   */
> >> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>  	if ((arg->probability != U32_MAX) &&
> >>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>  		if (last)
> >> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >> +			ovs_drop_skb_last_action(skb);
>
> Always consuming the skb at this point makes sense, since having sample()
> as a last action is a reasonable thing to have.  But this looks more like
> a fix for the original drop reason patch set.
>

I don't think consuming the skb at this point makes sense. It was very
intentionally changed to a drop since a very common use-case for
sampling is drop-sampling, i.e: replacing an empty action list (that
triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
that replacement should not have any effect on the number of
OVS_DROP_LAST_ACTION being reported as the packets are being treated in
the same way (only observed in one case).


> >>  		return 0;
> >>  	}
> >>
> >> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>  		}
> >>  	}
> >>
> >> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >> +	ovs_drop_skb_last_action(skb);
> >
> > I don't think I agree with this one.  If we have a sample() action with
> > a lot of different actions inside and we reached the end while the last
> > action didn't consume the skb, then we should report that.  E.g.
> > "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> > cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> >

What is the use case for such action list? Having an action branch
executed randomly doesn't make sense to me if it's not some
observability thing (which IMHO should not trigger drops).

> > The only actions that are actually consuming the skb are "output",
> > "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> > consuming the skb "naturally" by stealing it when it is the last action.
> > "userspace" has an explicit check to consume the skb if it is the last
> > action.  "emit_sample" should have a similar check.  It should likely
> > be added at the point of action introduction instead of having a separate
> > patch.
> >

Unlike "output", "recirc", "userspace", etc., with emit_sample the
packet does not continue its way through the datapath.

It would be very confusing if OVS starts monitoring drops and adds a bunch
of flows such as "actions:emit_sample()" and suddenly it stops reporting such
drops via standard kfree_skb_reason. Packets _are_ being dropped here,
we are just observing them.

And if we change emit_sample to trigger a drop if it's the last action,
then "sample(50%, emit_sample()),2" will trigger a drop half of the time,
which is also terribly confusing.

I think we should try to be clear and informative with what we
_actually_ drop and not require the user that is just running
"dropwatch" to understand the internals of the OVS module.

So if you don't want to accept the "observational" nature of sample(),
the only other solution that does not bring even more confusion to OVS
drops would be to have userspace add explicit drop actions. WDYT?
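
For reference, the decision made by the helper in this patch can be modeled
in userspace as below. This is just a sketch of the quoted
ovs_drop_skb_last_action(): the enum and function names are hypothetical,
and the only input is the probability value stored in the cb.

```c
#include <stdint.h>

enum toy_event { TOY_CONSUMED_SILENTLY, TOY_DROP_LAST_ACTION };

/* Userspace model of ovs_drop_skb_last_action() from the quoted patch:
 * a non-zero probability in the cb means "inside sample()".
 */
static enum toy_event toy_last_action_event(uint32_t cb_probability)
{
	return cb_probability ? TOY_CONSUMED_SILENTLY : TOY_DROP_LAST_ACTION;
}
```

With explicit drop actions added by userspace, this distinction would not be
needed, since the drop would be reported by the drop action itself.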


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-17 10:44   ` Ilya Maximets
@ 2024-06-18  7:33     ` Adrián Moreno
  2024-06-18  9:47       ` Ilya Maximets
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-18  7:33 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: netdev, aconole, echaudro, horms, dev, Donald Hunter,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Pravin B Shelar, linux-kernel

On Mon, Jun 17, 2024 at 12:44:45PM GMT, Ilya Maximets wrote:
> On 6/3/24 20:56, Adrian Moreno wrote:
> > Add support for a new action: emit_sample.
> >
> > This action accepts a u32 group id and a variable-length cookie and uses
> > the psample multicast group to make the packet available for
> > observability.
> >
> > The maximum length of the user-defined cookie is set to 16, same as
> > tc_cookie, to discourage using cookies that will not be offloadable.
> >
> > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> > ---
> >  Documentation/netlink/specs/ovs_flow.yaml | 17 ++++++++
> >  include/uapi/linux/openvswitch.h          | 25 ++++++++++++
> >  net/openvswitch/actions.c                 | 50 +++++++++++++++++++++++
> >  net/openvswitch/flow_netlink.c            | 33 ++++++++++++++-
> >  4 files changed, 124 insertions(+), 1 deletion(-)
>
> Some nits below, besides the ones already mentioned.
>

Thanks, Ilya.

> >
> > diff --git a/Documentation/netlink/specs/ovs_flow.yaml b/Documentation/netlink/specs/ovs_flow.yaml
> > index 4fdfc6b5cae9..a7ab5593a24f 100644
> > --- a/Documentation/netlink/specs/ovs_flow.yaml
> > +++ b/Documentation/netlink/specs/ovs_flow.yaml
> > @@ -727,6 +727,12 @@ attribute-sets:
> >          name: dec-ttl
> >          type: nest
> >          nested-attributes: dec-ttl-attrs
> > +      -
> > +        name: emit-sample
> > +        type: nest
> > +        nested-attributes: emit-sample-attrs
> > +        doc: |
> > +          Sends a packet sample to psample for external observation.
> >    -
> >      name: tunnel-key-attrs
> >      enum-name: ovs-tunnel-key-attr
> > @@ -938,6 +944,17 @@ attribute-sets:
> >        -
> >          name: gbp
> >          type: u32
> > +  -
> > +    name: emit-sample-attrs
> > +    enum-name: ovs-emit-sample-attr
> > +    name-prefix: ovs-emit-sample-attr-
> > +    attributes:
> > +      -
> > +        name: group
> > +        type: u32
> > +      -
> > +        name: cookie
> > +        type: binary
> >
> >  operations:
> >    name-prefix: ovs-flow-cmd-
> > diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> > index efc82c318fa2..a0e9dde0584a 100644
> > --- a/include/uapi/linux/openvswitch.h
> > +++ b/include/uapi/linux/openvswitch.h
> > @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
> >  };
> >  #endif
> >
> > +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
> > +/**
> > + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
> > + * action.
> > + *
> > + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
> > + * sample.
> > + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
> > + * user-defined metadata. The maximum length is 16 bytes.
>
> s/16/OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE/
>
> > + *
> > + * Sends the packet to the psample multicast group with the specified group and
> > + * cookie. It is possible to combine this action with the
> > + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
> > + */
> > +enum ovs_emit_sample_attr {
> > +	OVS_EMIT_SAMPLE_ATTR_UNPSEC,
> > +	OVS_EMIT_SAMPLE_ATTR_GROUP,	/* u32 number. */
> > +	OVS_EMIT_SAMPLE_ATTR_COOKIE,	/* Optional, user specified cookie. */
> > +	__OVS_EMIT_SAMPLE_ATTR_MAX
> > +};
> > +
> > +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
> > +
> > +
> >  /**
> >   * enum ovs_action_attr - Action types.
> >   *
> > @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
> >  	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
> >  	OVS_ACTION_ATTR_DEC_TTL,      /* Nested OVS_DEC_TTL_ATTR_*. */
> >  	OVS_ACTION_ATTR_DROP,         /* u32 error code. */
> > +	OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
> >
> >  	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
> >  				       * from userspace. */
> > diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> > index 964225580824..3b4dba0ded59 100644
> > --- a/net/openvswitch/actions.c
> > +++ b/net/openvswitch/actions.c
> > @@ -24,6 +24,11 @@
> >  #include <net/checksum.h>
> >  #include <net/dsfield.h>
> >  #include <net/mpls.h>
> > +
> > +#if IS_ENABLED(CONFIG_PSAMPLE)
> > +#include <net/psample.h>
> > +#endif
> > +
> >  #include <net/sctp/checksum.h>
> >
> >  #include "datapath.h"
> > @@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct sw_flow_key *key)
> >  	return 0;
> >  }
> >
> > +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> > +			       const struct sw_flow_key *key,
> > +			       const struct nlattr *attr)
> > +{
> > +#if IS_ENABLED(CONFIG_PSAMPLE)
> > +	struct psample_group psample_group = {};
> > +	struct psample_metadata md = {};
> > +	struct vport *input_vport;
> > +	const struct nlattr *a;
> > +	int rem;
> > +
> > +	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> > +	     a = nla_next(a, &rem)) {
>
> Since the action is strictly validated, can we use nla_for_each_attr()
> or nla_for_each_nested() ?
>

Probably, yes.
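
To illustrate the iteration pattern being suggested, here is a small
userspace model. It is a sketch only: struct toy_nlattr, toy_for_each_attr()
and the other toy_* names are hypothetical stand-ins written for this
example, shaped after the kernel's nla_for_each_attr(pos, head, len, rem)
but not taken from it. The kernel helpers themselves would be used in the
real action.

```c
#include <stdint.h>
#include <string.h>

/* Userspace stand-in for the netlink attribute header. */
struct toy_nlattr {
	uint16_t nla_len;	/* header plus payload, without padding */
	uint16_t nla_type;
};

#define TOY_ALIGN(len)	(((len) + 3) & ~3)
#define TOY_HDRLEN	((int)sizeof(struct toy_nlattr))

enum { TOY_ATTR_UNSPEC, TOY_ATTR_GROUP, TOY_ATTR_COOKIE };

struct toy_sample {
	uint32_t group;
	uint8_t cookie[16];
	int cookie_len;
};

/* Read a header by copy so the byte buffer is never type-punned. */
static struct toy_nlattr toy_hdr(const unsigned char *p)
{
	struct toy_nlattr h;

	memcpy(&h, p, sizeof(h));
	return h;
}

static int toy_ok(const unsigned char *p, int rem)
{
	return rem >= TOY_HDRLEN && toy_hdr(p).nla_len >= TOY_HDRLEN &&
	       toy_hdr(p).nla_len <= rem;
}

static const unsigned char *toy_next(const unsigned char *p, int *rem)
{
	int total = TOY_ALIGN(toy_hdr(p).nla_len);

	*rem -= total;
	return p + total;
}

#define toy_for_each_attr(pos, head, len, rem) \
	for (pos = head, rem = len; toy_ok(pos, rem); pos = toy_next(pos, &(rem)))

/* Append one attribute at 'off' and return the new, aligned offset. */
static int toy_put(unsigned char *buf, int off, uint16_t type,
		   const void *data, uint16_t len)
{
	struct toy_nlattr hdr = {
		.nla_len = (uint16_t)(TOY_HDRLEN + len),
		.nla_type = type,
	};

	memcpy(buf + off, &hdr, sizeof(hdr));
	memcpy(buf + off + TOY_HDRLEN, data, len);
	return off + TOY_ALIGN(hdr.nla_len);
}

/* Build a group + cookie attribute stream, then walk it with the macro. */
static struct toy_sample toy_roundtrip(uint32_t group, const void *cookie,
				       uint16_t cookie_len)
{
	unsigned char buf[64] = { 0 };
	struct toy_sample out = { 0 };
	const unsigned char *a;
	int len = 0, rem;

	if (cookie_len > sizeof(out.cookie))
		cookie_len = sizeof(out.cookie);

	len = toy_put(buf, len, TOY_ATTR_GROUP, &group, sizeof(group));
	len = toy_put(buf, len, TOY_ATTR_COOKIE, cookie, cookie_len);

	toy_for_each_attr(a, buf, len, rem) {
		const unsigned char *payload = a + TOY_HDRLEN;
		int payload_len = toy_hdr(a).nla_len - TOY_HDRLEN;

		switch (toy_hdr(a).nla_type) {
		case TOY_ATTR_GROUP:
			memcpy(&out.group, payload, sizeof(out.group));
			break;
		case TOY_ATTR_COOKIE:
			out.cookie_len = payload_len;
			memcpy(out.cookie, payload, payload_len);
			break;
		}
	}
	return out;
}
```

The loop body is the same switch as in the hunk above; only the open-coded
for (...; rem > 0; ...) header is replaced by the for-each macro.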

> > +		switch (nla_type(a)) {
> > +		case OVS_EMIT_SAMPLE_ATTR_GROUP:
> > +			psample_group.group_num = nla_get_u32(a);
> > +			break;
> > +
> > +		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
> > +			md.user_cookie = nla_data(a);
> > +			md.user_cookie_len = nla_len(a);
> > +			break;
> > +		}
> > +	}
> > +
> > +	psample_group.net = ovs_dp_get_net(dp);
> > +
> > +	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
> > +	if (!input_vport)
> > +		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
>
> We may need to check that we actually found the local port.
>

Sure. What can cause the local port not to exist?

> > +
> > +	md.in_ifindex = input_vport->dev->ifindex;
> > +	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> > +
> > +	psample_sample_packet(&psample_group, skb, 0, &md);
> > +#endif
> > +
> > +	return 0;
> > +}
> > +
> >  /* Execute a list of actions against 'skb'. */
> >  static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >  			      struct sw_flow_key *key,
> > @@ -1502,6 +1547,11 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >  			ovs_kfree_skb_reason(skb, reason);
> >  			return 0;
> >  		}
> > +
> > +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> > +			err = execute_emit_sample(dp, skb, key, a);
> > +			OVS_CB(skb)->cutlen = 0;
> > +			break;
> >  		}
> >
> >  		if (unlikely(err)) {
> > diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
> > index f224d9bcea5e..eb59ff9c8154 100644
> > --- a/net/openvswitch/flow_netlink.c
> > +++ b/net/openvswitch/flow_netlink.c
> > @@ -64,6 +64,7 @@ static bool actions_may_change_flow(const struct nlattr *actions)
> >  		case OVS_ACTION_ATTR_TRUNC:
> >  		case OVS_ACTION_ATTR_USERSPACE:
> >  		case OVS_ACTION_ATTR_DROP:
> > +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> >  			break;
> >
> >  		case OVS_ACTION_ATTR_CT:
> > @@ -2409,7 +2410,7 @@ static void ovs_nla_free_nested_actions(const struct nlattr *actions, int len)
> >  	/* Whenever new actions are added, the need to update this
> >  	 * function should be considered.
> >  	 */
> > -	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 24);
> > +	BUILD_BUG_ON(OVS_ACTION_ATTR_MAX != 25);
> >
> >  	if (!actions)
> >  		return;
> > @@ -3157,6 +3158,29 @@ static int validate_and_copy_check_pkt_len(struct net *net,
> >  	return 0;
> >  }
> >
> > +static int validate_emit_sample(const struct nlattr *attr)
> > +{
> > +	static const struct nla_policy policy[OVS_EMIT_SAMPLE_ATTR_MAX + 1] = {
> > +		[OVS_EMIT_SAMPLE_ATTR_GROUP] = { .type = NLA_U32 },
> > +		[OVS_EMIT_SAMPLE_ATTR_COOKIE] = {
> > +			.type = NLA_BINARY,
> > +			.len = OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE
>
> Maybe add a trailing comma here as well, since it's not a one-line definition.
> Just in case.
>

Sure.

> > +		},
> > +	};
> > +	struct nlattr *a[OVS_EMIT_SAMPLE_ATTR_MAX  + 1];
>
> One too many spaces                              ^^
>

Thanks.

> > +	int err;
> > +
> > +	if (!IS_ENABLED(CONFIG_PSAMPLE))
> > +		return -EOPNOTSUPP;
> > +
> > +	err = nla_parse_nested(a, OVS_EMIT_SAMPLE_ATTR_MAX, attr, policy,
> > +			       NULL);
> > +	if (err)
> > +		return err;
> > +
> > +	return a[OVS_EMIT_SAMPLE_ATTR_GROUP] ? 0 : -EINVAL;
> > +}
> > +
> >  static int copy_action(const struct nlattr *from,
> >  		       struct sw_flow_actions **sfa, bool log)
> >  {
> > @@ -3212,6 +3236,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
> >  			[OVS_ACTION_ATTR_ADD_MPLS] = sizeof(struct ovs_action_add_mpls),
> >  			[OVS_ACTION_ATTR_DEC_TTL] = (u32)-1,
> >  			[OVS_ACTION_ATTR_DROP] = sizeof(u32),
> > +			[OVS_ACTION_ATTR_EMIT_SAMPLE] = (u32)-1,
> >  		};
> >  		const struct ovs_action_push_vlan *vlan;
> >  		int type = nla_type(a);
> > @@ -3490,6 +3515,12 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
> >  				return -EINVAL;
> >  			break;
> >
> > +		case OVS_ACTION_ATTR_EMIT_SAMPLE:
> > +			err = validate_emit_sample(a);
> > +			if (err)
> > +				return err;
> > +			break;
> > +
> >  		default:
> >  			OVS_NLERR(log, "Unknown Action type %d", type);
> >  			return -EINVAL;
>
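
For reference, the checks that validate_emit_sample() gets from the policy
table can be modelled in userspace. This is only a sketch under stated
assumptions: "struct emit_sample_attrs" and check_emit_sample() are made-up
names, the real code obtains these checks from nla_parse_nested(), and the
-ERANGE return for an oversized cookie is an assumption about how the
policy validation reports it.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE from the patch. */
#define EMIT_SAMPLE_COOKIE_MAX_SIZE 16

/* Hypothetical, already-parsed view of the nested attributes. */
struct emit_sample_attrs {
	bool has_group;		/* OVS_EMIT_SAMPLE_ATTR_GROUP present */
	bool has_cookie;	/* OVS_EMIT_SAMPLE_ATTR_COOKIE present */
	size_t cookie_len;	/* cookie payload length, in bytes */
};

/* Returns 0 if the action would be accepted, a negative errno otherwise. */
static int check_emit_sample(const struct emit_sample_attrs *a)
{
	/* Policy check: the binary cookie is bounded by .len. */
	if (a->has_cookie && a->cookie_len > EMIT_SAMPLE_COOKIE_MAX_SIZE)
		return -ERANGE;	/* assumed errno for a policy violation */

	/* The group is mandatory, the cookie is optional. */
	return a->has_group ? 0 : -EINVAL;
}
```

A 17-byte cookie and a cookie without a group are both rejected here, which
is what the "Test action verification" cases in patch 9/9 expect.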



* Re: [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb.
  2024-06-17 11:26       ` Ilya Maximets
@ 2024-06-18  7:36         ` Adrián Moreno
  0 siblings, 0 replies; 57+ messages in thread
From: Adrián Moreno @ 2024-06-18  7:36 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: Aaron Conole, netdev, echaudro, horms, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Mon, Jun 17, 2024 at 01:26:39PM GMT, Ilya Maximets wrote:
> On 6/17/24 09:08, Adrián Moreno wrote:
> > On Fri, Jun 14, 2024 at 12:55:59PM GMT, Aaron Conole wrote:
> >> Adrian Moreno <amorenoz@redhat.com> writes:
> >>
> >>> The behavior of actions might not be the exact same if they are being
> >>> executed inside a nested sample action. Store the probability of the
> >>> parent sample action in the skb's cb area.
> >>
> >> What does that mean?
> >>
> >
> > The emit action, for instance, needs the probability so that psample
> > consumers know what sampling rate was applied. Also, the way we
> > should report packet drops (via kfree_skb_reason) changes (see
> > patch 7/9).
> >
> >>> Use the probability in emit_sample to pass it down to psample.
> >>>
> >>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >>> ---
> >>>  include/uapi/linux/openvswitch.h |  3 ++-
> >>>  net/openvswitch/actions.c        | 25 ++++++++++++++++++++++---
> >>>  net/openvswitch/datapath.h       |  3 +++
> >>>  net/openvswitch/vport.c          |  1 +
> >>>  4 files changed, 28 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
> >>> index a0e9dde0584a..9d675725fa2b 100644
> >>> --- a/include/uapi/linux/openvswitch.h
> >>> +++ b/include/uapi/linux/openvswitch.h
> >>> @@ -649,7 +649,8 @@ enum ovs_flow_attr {
> >>>   * Actions are passed as nested attributes.
> >>>   *
> >>>   * Executes the specified actions with the given probability on a per-packet
> >>> - * basis.
> >>> + * basis. Nested actions will be able to access the probability value of the
> >>> + * parent @OVS_ACTION_ATTR_SAMPLE.
> >>>   */
> >>>  enum ovs_sample_attr {
> >>>  	OVS_SAMPLE_ATTR_UNSPEC,
> >>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >>> index 3b4dba0ded59..33f6d93ba5e4 100644
> >>> --- a/net/openvswitch/actions.c
> >>> +++ b/net/openvswitch/actions.c
> >>> @@ -1048,12 +1048,15 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>  	struct nlattr *sample_arg;
> >>>  	int rem = nla_len(attr);
> >>>  	const struct sample_arg *arg;
> >>> +	u32 init_probability;
> >>>  	bool clone_flow_key;
> >>> +	int err;
> >>>
> >>>  	/* The first action is always 'OVS_SAMPLE_ATTR_ARG'. */
> >>>  	sample_arg = nla_data(attr);
> >>>  	arg = nla_data(sample_arg);
> >>>  	actions = nla_next(sample_arg, &rem);
> >>> +	init_probability = OVS_CB(skb)->probability;
> >>>
> >>>  	if ((arg->probability != U32_MAX) &&
> >>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>> @@ -1062,9 +1065,21 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>  		return 0;
> >>>  	}
> >>>
> >>> +	if (init_probability) {
> >>> +		OVS_CB(skb)->probability = ((u64)OVS_CB(skb)->probability *
> >>> +					    arg->probability / U32_MAX);
> >>> +	} else {
> >>> +		OVS_CB(skb)->probability = arg->probability;
> >>> +	}
> >>> +
> >>
> >> I'm confused by this.  Eventually, integer arithmetic will practically
> >> guarantee that nested sample() calls will go to 0.  So eventually, the
> >> test above will be impossible to meet mathematically.
> >>
> >> OTOH, you could argue that a 1% of 50% is low anyway, but it still would
> >> have a positive probability count, and still be possible for
> >> get_random_u32() call to match.
> >>
> >
> > Using OVS's probability semantics, we can express probabilities as low
> > as (100/U32_MAX)%, which is pretty low indeed. However, just because the
> > probability of executing the action is low, I don't think we should
> > skip reporting it.
> >
> > Rethinking the integer arithmetic, it's true that we should avoid
> > hitting zero on the division, e.g. nesting 6x 1% sampling rates will make
> > the result zero, which will make probability restoration fail on the
> > way back. Therefore, the new probability should be at least 1.
> >
> >
> >> I'm not sure about this particular change.  Why do we need it?
> >>
> >
> > Why do we need to propagate the probability down to nested "sample"
> > actions? or why do we need to store the probability in the cb area in
> > the first place?
> >
> > The former: Just for correctness as only storing the last one would be
> > incorrect. Although I don't know of any use for nested "sample" actions.
>
> I think, we can drop this for now.  All the user interfaces specify
> the probability per action.  So, it should be fine to report the
> probability of the action that emitted the sample without taking into
> account the whole timeline of that packet.  Besides, a packet can leave
> OVS and come back, losing the metadata, so it will not actually be a
> full solution anyway.  Single-action metadata is easier to define.
>

Sure, I guess we can drop it, I don't think there is a use case for nested
samples anyway.
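
For the record, the arithmetic behind the "hits zero" concern can be
reproduced in userspace. A minimal sketch, assuming the scaling used in
the hunk above; scale_probability(), scale_probability_min1() and
nested_probability() are illustrative names, and the clamp to 1 is just
one possible reading of "at least 1", not something the patch does.

```c
#include <stdint.h>

#define U32_MAX UINT32_MAX

/* Scale a parent probability by a nested one, as the hunk above does:
 * both are fractions of U32_MAX and the division truncates.
 */
static uint32_t scale_probability(uint32_t parent, uint32_t nested)
{
	return (uint32_t)((uint64_t)parent * nested / U32_MAX);
}

/* Hypothetical variant that never returns 0 for non-zero inputs, so
 * that 0 can keep meaning "no sampling has occurred".
 */
static uint32_t scale_probability_min1(uint32_t parent, uint32_t nested)
{
	uint32_t p = scale_probability(parent, nested);

	return (!p && parent && nested) ? 1 : p;
}

/* Probability after nesting 'depth' sample() actions of the same rate. */
static uint32_t nested_probability(uint32_t rate, unsigned int depth,
				   int clamp)
{
	uint32_t p = rate;	/* the outermost level stores the rate as-is */
	unsigned int i;

	for (i = 1; i < depth; i++)
		p = clamp ? scale_probability_min1(p, rate)
			  : scale_probability(p, rate);
	return p;
}
```

With 1% expressed as U32_MAX / 100, the truncated value is 42 after four
levels and already 0 at the fifth, so six levels give 0 as well. The
clamped variant stays at 1 instead.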

> > The latter: To pass it down to psample so that sample receivers know
> > the sampling rate that was applied (and can, e.g., do throughput
> > estimations like OVS does with IPFIX).
> >
> >
> >>>  	clone_flow_key = !arg->exec;
> >>> -	return clone_execute(dp, skb, key, 0, actions, rem, last,
> >>> -			     clone_flow_key);
> >>> +	err = clone_execute(dp, skb, key, 0, actions, rem, last,
> >>> +			    clone_flow_key);
> >>> +
> >>> +	if (!last)
> >>
> >> Is this right?  Don't we only want to set the probability on the last
> >> action?  Should the test be 'if (last)'?
> >>
> >
> > This is restoring the parent's probability after the actions in the
> > current sample action have been executed.
> >
> > If it was the last action there is no need to restore the probability
> > back to the parent's (or zero if there's only one level) since no
> > further action will require it. And more importantly, if it's the last
> > action, the packet gets freed inside that "branch" so we must not
> > access its memory.
> >
> >
> >>> +		OVS_CB(skb)->probability = init_probability;
> >>> +
> >>> +	return err;
> >>>  }
> >>>
> >>>  /* When 'last' is true, clone() should always consume the 'skb'.
> >>> @@ -1313,6 +1328,7 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> >>>  	struct psample_metadata md = {};
> >>>  	struct vport *input_vport;
> >>>  	const struct nlattr *a;
> >>> +	u32 rate;
> >>>  	int rem;
> >>>
> >>>  	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
> >>> @@ -1337,8 +1353,11 @@ static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
> >>>
> >>>  	md.in_ifindex = input_vport->dev->ifindex;
> >>>  	md.trunc_size = skb->len - OVS_CB(skb)->cutlen;
> >>> +	md.rate_as_probability = 1;
> >>> +
> >>> +	rate = OVS_CB(skb)->probability ? OVS_CB(skb)->probability : U32_MAX;
> >>>
> >>> -	psample_sample_packet(&psample_group, skb, 0, &md);
> >>> +	psample_sample_packet(&psample_group, skb, rate, &md);
> >>>  #endif
> >>>
> >>>  	return 0;
> >>> diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
> >>> index 0cd29971a907..9ca6231ea647 100644
> >>> --- a/net/openvswitch/datapath.h
> >>> +++ b/net/openvswitch/datapath.h
> >>> @@ -115,12 +115,15 @@ struct datapath {
> >>>   * fragmented.
> >>>   * @acts_origlen: The netlink size of the flow actions applied to this skb.
> >>>   * @cutlen: The number of bytes from the packet end to be removed.
> >>> + * @probability: The sampling probability that was applied to this skb; 0 means
> >>> + * no sampling has occurred; U32_MAX means 100% probability.
> >>>   */
> >>>  struct ovs_skb_cb {
> >>>  	struct vport		*input_vport;
> >>>  	u16			mru;
> >>>  	u16			acts_origlen;
> >>>  	u32			cutlen;
> >>> +	u32			probability;
> >>>  };
> >>>  #define OVS_CB(skb) ((struct ovs_skb_cb *)(skb)->cb)
> >>>
> >>> diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c
> >>> index 972ae01a70f7..8732f6e51ae5 100644
> >>> --- a/net/openvswitch/vport.c
> >>> +++ b/net/openvswitch/vport.c
> >>> @@ -500,6 +500,7 @@ int ovs_vport_receive(struct vport *vport, struct sk_buff *skb,
> >>>  	OVS_CB(skb)->input_vport = vport;
> >>>  	OVS_CB(skb)->mru = 0;
> >>>  	OVS_CB(skb)->cutlen = 0;
> >>> +	OVS_CB(skb)->probability = 0;
> >>>  	if (unlikely(dev_net(skb->dev) != ovs_dp_get_net(vport->dp))) {
> >>>  		u32 mark;
> >>
> >
>
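
Regarding the throughput estimation mentioned earlier in this message: a
psample consumer that receives the rate as a probability (a fraction of
U32_MAX, as in the execute_emit_sample() hunk) could scale its sample
count back up roughly like this. A sketch only; estimate_packets() is a
hypothetical helper, and limiting the sample count to 32 bits is an
assumption made here to keep the multiplication within 64 bits.

```c
#include <stdint.h>

/* Estimate how many packets 'samples' observed samples stand for, given
 * the sampling probability expressed as a fraction of UINT32_MAX
 * (UINT32_MAX means every packet was sampled).
 */
static uint64_t estimate_packets(uint32_t samples, uint32_t probability)
{
	if (!probability)
		return 0;	/* nothing can be sampled at probability 0 */

	/* samples / (probability / U32_MAX), without floating point */
	return (uint64_t)samples * UINT32_MAX / probability;
}
```

For instance, 10 samples taken at 1% (UINT32_MAX / 100) stand for about
1000 packets, while at UINT32_MAX the estimate equals the sample count.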



* Re: [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample
  2024-06-17 10:00   ` Ilya Maximets
@ 2024-06-18  7:38     ` Adrián Moreno
  2024-06-18  9:42       ` Ilya Maximets
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-18  7:38 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: netdev, aconole, echaudro, horms, dev, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, linux-kernel

On Mon, Jun 17, 2024 at 12:00:04PM GMT, Ilya Maximets wrote:
> On 6/3/24 20:56, Adrian Moreno wrote:
> > If the action has a user_cookie, pass it along to the sample so it can
> > be easily identified.
> >
> > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> > ---
> >  net/sched/act_sample.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
> > index a69b53d54039..5c3f86ec964a 100644
> > --- a/net/sched/act_sample.c
> > +++ b/net/sched/act_sample.c
> > @@ -165,9 +165,11 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
> >  				     const struct tc_action *a,
> >  				     struct tcf_result *res)
> >  {
> > +	u8 cookie_data[TC_COOKIE_MAX_SIZE] = {};
>
> Is it necessary to initialize these 16 bytes on every call?
> Might be expensive.  We're passing the data length around,
> so the uninitialized parts should not be accessed.
>

They "should" not, indeed. I was just trying to be extra careful.
Are you worried that TC_COOKIE_MAX_SIZE could grow, or about the cycles
needed to clear the current 16 bytes?

> Best regards, Ilya Maximets.
>
> >  	struct tcf_sample *s = to_sample(a);
> >  	struct psample_group *psample_group;
> >  	struct psample_metadata md = {};
> > +	struct tc_cookie *user_cookie;
> >  	int retval;
> >
> >  	tcf_lastuse_update(&s->tcf_tm);
> > @@ -189,6 +191,16 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
> >  		if (skb_at_tc_ingress(skb) && tcf_sample_dev_ok_push(skb->dev))
> >  			skb_push(skb, skb->mac_len);
> >
> > +		rcu_read_lock();
> > +		user_cookie = rcu_dereference(a->user_cookie);
> > +		if (user_cookie) {
> > +			memcpy(cookie_data, user_cookie->data,
> > +			       user_cookie->len);
> > +			md.user_cookie = cookie_data;
> > +			md.user_cookie_len = user_cookie->len;
> > +		}
> > +		rcu_read_unlock();
> > +
> >  		md.trunc_size = s->truncate ? s->trunc_size : skb->len;
> >  		psample_sample_packet(psample_group, skb, s->rate, &md);
> >
>



* Re: [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test
  2024-06-17  7:18     ` Adrián Moreno
@ 2024-06-18  9:08       ` Adrián Moreno
  2024-06-18 13:27         ` Aaron Conole
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-18  9:08 UTC (permalink / raw)
  To: Aaron Conole
  Cc: netdev, echaudro, horms, i.maximets, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, linux-kselftest, linux-kernel

On Mon, Jun 17, 2024 at 07:18:05AM GMT, Adrián Moreno wrote:
> On Fri, Jun 14, 2024 at 01:07:33PM GMT, Aaron Conole wrote:
> > Adrian Moreno <amorenoz@redhat.com> writes:
> >
> > > Add a test to verify sampling packets via psample works.
> > >
> > > In order to do that, create a subcommand in ovs-dpctl.py to listen
> > > on the psample multicast group and print samples.
> > >
> > > In order to also test simultaneous sFlow and psample actions and
> > > packet truncation, add missing parsing support for "userspace" and
> > > "trunc" actions.
> >
> > Maybe split that into a separate patch.  This has a bugfix and 3
> > features being pushed in.  I know it's already getting long as a series,
> > so maybe it's okay to fold the userspace attribute bugfix with the parse
> > support (since it wasn't really usable before).
> >
>
> OK. Sounds reasonable.
>
> > > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> > > ---
> > >  .../selftests/net/openvswitch/openvswitch.sh  |  99 +++++++++++++++-
> > >  .../selftests/net/openvswitch/ovs-dpctl.py    | 112 +++++++++++++++++-
> > >  2 files changed, 204 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh
> > > index 5cae53543849..f6e0ae3f6424 100755
> > > --- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
> > > +++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
> > > @@ -20,7 +20,8 @@ tests="
> > >  	nat_related_v4				ip4-nat-related: ICMP related matches work with SNAT
> > >  	netlink_checks				ovsnl: validate netlink attrs and settings
> > >  	upcall_interfaces			ovs: test the upcall interfaces
> > > -	drop_reason				drop: test drop reasons are emitted"
> > > +	drop_reason				drop: test drop reasons are emitted
> > > +	emit_sample 				emit_sample: Sampling packets with psample"
> > >
> > >  info() {
> > >      [ $VERBOSE = 0 ] || echo $*
> > > @@ -170,6 +171,19 @@ ovs_drop_reason_count()
> > >  	return `echo "$perf_output" | grep "$pattern" | wc -l`
> > >  }
> > >
> > > +ovs_test_flow_fails () {
> > > +	ERR_MSG="Flow actions may not be safe on all matching packets"
> > > +
> > > +	PRE_TEST=$(dmesg | grep -c "${ERR_MSG}")
> > > +	ovs_add_flow $@ &> /dev/null $@ && return 1
> > > +	POST_TEST=$(dmesg | grep -c "${ERR_MSG}")
> > > +
> > > +	if [ "$PRE_TEST" == "$POST_TEST" ]; then
> > > +		return 1
> > > +	fi
> > > +	return 0
> > > +}
> > > +
> > >  usage() {
> > >  	echo
> > >  	echo "$0 [OPTIONS] [TEST]..."
> > > @@ -184,6 +198,89 @@ usage() {
> > >  	exit 1
> > >  }
> > >
> > > +
> > > +# emit_sample test
> > > +# - use emit_sample to observe packets
> > > +test_emit_sample() {
> > > +	sbx_add "test_emit_sample" || return $?
> > > +
> > > +	# Add a datapath with per-vport dispatching.
> > > +	ovs_add_dp "test_emit_sample" emit_sample -V 2:1 || return 1
> > > +
> > > +	info "create namespaces"
> > > +	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
> > > +		client c0 c1 172.31.110.10/24 -u || return 1
> > > +	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
> > > +		server s0 s1 172.31.110.20/24 -u || return 1
> > > +
> > > +	# Check if emit_sample actions can be configured.
> > > +	ovs_add_flow "test_emit_sample" emit_sample \
> > > +	'in_port(1),eth(),eth_type(0x0806),arp()' 'emit_sample(group=1)'
> > > +	if [ $? == 1 ]; then
> > > +		info "no support for emit_sample - skipping"
> > > +		ovs_exit_sig
> > > +		return $ksft_skip
> > > +	fi
> > > +
> > > +	ovs_del_flows "test_emit_sample" emit_sample
> > > +
> > > +	# Allow ARP
> > > +	ovs_add_flow "test_emit_sample" emit_sample \
> > > +		'in_port(1),eth(),eth_type(0x0806),arp()' '2' || return 1
> > > +	ovs_add_flow "test_emit_sample" emit_sample \
> > > +		'in_port(2),eth(),eth_type(0x0806),arp()' '1' || return 1
> > > +
> > > +	# Test action verification.
> > > +	OLDIFS=$IFS
> > > +	IFS='*'
> > > +	min_key='in_port(1),eth(),eth_type(0x0800),ipv4()'
> > > +	for testcase in \
> > > +		"cookie to large"*"emit_sample(group=1,cookie=1615141312111009080706050403020100)" \
> > > +		"no group with cookie"*"emit_sample(cookie=abcd)" \
> > > +		"no group"*"sample()";
> > > +	do
> > > +		set -- $testcase;
> > > +		ovs_test_flow_fails "test_emit_sample" emit_sample $min_key $2
> > > +		if [ $? == 1 ]; then
> > > +			info "failed - $1"
> > > +			return 1
> > > +		fi
> > > +	done
> > > +	IFS=$OLDIFS
> > > +
> > > +	# Sample first 14 bytes of all traffic.
> > > +	ovs_add_flow "test_emit_sample" emit_sample \
> > > +	"in_port(1),eth(),eth_type(0x0800),ipv4(src=172.31.110.10,proto=1),icmp()" "trunc(14),emit_sample(group=1,cookie=c0ffee),2"
> > > +
> > > +	# Sample all traffic. In this case, use a sample() action with both
> > > +	# emit_sample and an upcall emulating simultaneous local sampling and
> > > +	# sFlow / IPFIX.
> > > +	nlpid=$(grep -E "listening on upcall packet handler" $ovs_dir/s0.out | cut -d ":" -f 2 | tr -d ' ')
> > > +	ovs_add_flow "test_emit_sample" emit_sample \
> > > +	"in_port(2),eth(),eth_type(0x0800),ipv4(src=172.31.110.20,proto=1),icmp()" "sample(sample=100%,actions(emit_sample(group=2,cookie=eeff0c),userspace(pid=${nlpid},userdata=eeff0c))),1"
> > > +
> > > +	# Record emit_sample data.
> > > +	python3 $ovs_base/ovs-dpctl.py psample >$ovs_dir/psample.out 2>$ovs_dir/psample.err &
> > > +	pid=$!
> > > +	on_exit "ovs_sbx test_emit_sample kill -TERM $pid 2>/dev/null"
> >
> >   Maybe ovs_netns_spawn_daemon ?
> >
>
> I'll take a look at it, thanks.
>

I've looked into ovs_netns_spawn_daemon and I don't think it'll be useful
for this command, since it needs to run in the default namespace. I can
add a new "ovs_spawn_daemon" so it's reusable. WDYT?

> [...]



* Re: [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample
  2024-06-18  7:38     ` Adrián Moreno
@ 2024-06-18  9:42       ` Ilya Maximets
  0 siblings, 0 replies; 57+ messages in thread
From: Ilya Maximets @ 2024-06-18  9:42 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: i.maximets, netdev, aconole, echaudro, horms, dev,
	Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, linux-kernel

On 6/18/24 09:38, Adrián Moreno wrote:
> On Mon, Jun 17, 2024 at 12:00:04PM GMT, Ilya Maximets wrote:
>> On 6/3/24 20:56, Adrian Moreno wrote:
>>> If the action has a user_cookie, pass it along to the sample so it can
>>> be easily identified.
>>>
>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>> ---
>>>  net/sched/act_sample.c | 12 ++++++++++++
>>>  1 file changed, 12 insertions(+)
>>>
>>> diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
>>> index a69b53d54039..5c3f86ec964a 100644
>>> --- a/net/sched/act_sample.c
>>> +++ b/net/sched/act_sample.c
>>> @@ -165,9 +165,11 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
>>>  				     const struct tc_action *a,
>>>  				     struct tcf_result *res)
>>>  {
>>> +	u8 cookie_data[TC_COOKIE_MAX_SIZE] = {};
>>
>> Is it necessary to initialize these 16 bytes on every call?
>> Might be expensive.  We're passing the data length around,
>> so the uninitialized parts should not be accessed.
>>
> 
> They "should" not, indeed. I was just trying to be extra careful.
> Are you worried that TC_COOKIE_MAX_SIZE could grow, or about the cycles
> needed to clear the current 16 bytes?

I'm assuming that any extra cycles spent per packet are undesirable,
so should be avoided, if possible.  Even if we save 1-2 cycles per
packet, it's a lot when we talk about millions of packets per second.

In this particular case, it seems, we do not sacrifice anything, so
it's just a couple of cycles back for free.
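
To illustrate why the initializer can go: as long as every consumer reads
only the advertised number of bytes, the tail of the buffer never needs a
defined value. A minimal userspace sketch with hypothetical names
("struct sample_md", md_set_cookie(), md_cookie_sum()); it is not the
act_sample code itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COOKIE_MAX_SIZE 16	/* mirrors TC_COOKIE_MAX_SIZE */

/* Hypothetical stand-in for the cookie fields of the sample metadata. */
struct sample_md {
	const uint8_t *user_cookie;
	size_t user_cookie_len;
};

/* Copy 'len' bytes into the caller's scratch buffer and point the
 * metadata at it. Bytes past 'len' are deliberately left untouched.
 * Returns 0 on success, -1 if the cookie does not fit.
 */
static int md_set_cookie(struct sample_md *md, uint8_t *scratch,
			 const uint8_t *data, size_t len)
{
	if (len > COOKIE_MAX_SIZE)
		return -1;

	memcpy(scratch, data, len);
	md->user_cookie = scratch;
	md->user_cookie_len = len;
	return 0;
}

/* A consumer that honours the length never looks at the stale tail. */
static unsigned int md_cookie_sum(const struct sample_md *md)
{
	unsigned int sum = 0;
	size_t i;

	for (i = 0; i < md->user_cookie_len; i++)
		sum += md->user_cookie[i];
	return sum;
}
```

The trade-off is that correctness now depends on the length being passed
along everywhere the cookie pointer goes, which is the case in this patch.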

Best regards, Ilya Maximets.


* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-18  7:33     ` Adrián Moreno
@ 2024-06-18  9:47       ` Ilya Maximets
  2024-06-18 10:08         ` Ilya Maximets
  0 siblings, 1 reply; 57+ messages in thread
From: Ilya Maximets @ 2024-06-18  9:47 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: i.maximets, netdev, aconole, echaudro, horms, dev, Donald Hunter,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Pravin B Shelar, linux-kernel

On 6/18/24 09:33, Adrián Moreno wrote:
> On Mon, Jun 17, 2024 at 12:44:45PM GMT, Ilya Maximets wrote:
>> On 6/3/24 20:56, Adrian Moreno wrote:
>>> Add support for a new action: emit_sample.
>>>
>>> This action accepts a u32 group id and a variable-length cookie and uses
>>> the psample multicast group to make the packet available for
>>> observability.
>>>
>>> The maximum length of the user-defined cookie is set to 16, same as
>>> tc_cookie, to discourage using cookies that will not be offloadable.
>>>
>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>> ---
>>>  Documentation/netlink/specs/ovs_flow.yaml | 17 ++++++++
>>>  include/uapi/linux/openvswitch.h          | 25 ++++++++++++
>>>  net/openvswitch/actions.c                 | 50 +++++++++++++++++++++++
>>>  net/openvswitch/flow_netlink.c            | 33 ++++++++++++++-
>>>  4 files changed, 124 insertions(+), 1 deletion(-)
>>
>> Some nits below, besides the ones already mentioned.
>>
> 
> Thanks, Ilya.
> 
>>>
>>> diff --git a/Documentation/netlink/specs/ovs_flow.yaml b/Documentation/netlink/specs/ovs_flow.yaml
>>> index 4fdfc6b5cae9..a7ab5593a24f 100644
>>> --- a/Documentation/netlink/specs/ovs_flow.yaml
>>> +++ b/Documentation/netlink/specs/ovs_flow.yaml
>>> @@ -727,6 +727,12 @@ attribute-sets:
>>>          name: dec-ttl
>>>          type: nest
>>>          nested-attributes: dec-ttl-attrs
>>> +      -
>>> +        name: emit-sample
>>> +        type: nest
>>> +        nested-attributes: emit-sample-attrs
>>> +        doc: |
>>> +          Sends a packet sample to psample for external observation.
>>>    -
>>>      name: tunnel-key-attrs
>>>      enum-name: ovs-tunnel-key-attr
>>> @@ -938,6 +944,17 @@ attribute-sets:
>>>        -
>>>          name: gbp
>>>          type: u32
>>> +  -
>>> +    name: emit-sample-attrs
>>> +    enum-name: ovs-emit-sample-attr
>>> +    name-prefix: ovs-emit-sample-attr-
>>> +    attributes:
>>> +      -
>>> +        name: group
>>> +        type: u32
>>> +      -
>>> +        name: cookie
>>> +        type: binary
>>>
>>>  operations:
>>>    name-prefix: ovs-flow-cmd-
>>> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
>>> index efc82c318fa2..a0e9dde0584a 100644
>>> --- a/include/uapi/linux/openvswitch.h
>>> +++ b/include/uapi/linux/openvswitch.h
>>> @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
>>>  };
>>>  #endif
>>>
>>> +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
>>> +/**
>>> + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
>>> + * action.
>>> + *
>>> + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
>>> + * sample.
>>> + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
>>> + * user-defined metadata. The maximum length is 16 bytes.
>>
>> s/16/OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE/
>>
>>> + *
>>> + * Sends the packet to the psample multicast group with the specified group and
>>> + * cookie. It is possible to combine this action with the
>>> + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
>>> + */
>>> +enum ovs_emit_sample_attr {
>>> +	OVS_EMIT_SAMPLE_ATTR_UNPSEC,
>>> +	OVS_EMIT_SAMPLE_ATTR_GROUP,	/* u32 number. */
>>> +	OVS_EMIT_SAMPLE_ATTR_COOKIE,	/* Optional, user specified cookie. */
>>> +	__OVS_EMIT_SAMPLE_ATTR_MAX
>>> +};
>>> +
>>> +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
>>> +
>>> +
>>>  /**
>>>   * enum ovs_action_attr - Action types.
>>>   *
>>> @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
>>>  	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
>>>  	OVS_ACTION_ATTR_DEC_TTL,      /* Nested OVS_DEC_TTL_ATTR_*. */
>>>  	OVS_ACTION_ATTR_DROP,         /* u32 error code. */
>>> +	OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
>>>
>>>  	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
>>>  				       * from userspace. */
>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>> index 964225580824..3b4dba0ded59 100644
>>> --- a/net/openvswitch/actions.c
>>> +++ b/net/openvswitch/actions.c
>>> @@ -24,6 +24,11 @@
>>>  #include <net/checksum.h>
>>>  #include <net/dsfield.h>
>>>  #include <net/mpls.h>
>>> +
>>> +#if IS_ENABLED(CONFIG_PSAMPLE)
>>> +#include <net/psample.h>
>>> +#endif
>>> +
>>>  #include <net/sctp/checksum.h>
>>>
>>>  #include "datapath.h"
>>> @@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct sw_flow_key *key)
>>>  	return 0;
>>>  }
>>>
>>> +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>>> +			       const struct sw_flow_key *key,
>>> +			       const struct nlattr *attr)
>>> +{
>>> +#if IS_ENABLED(CONFIG_PSAMPLE)
>>> +	struct psample_group psample_group = {};
>>> +	struct psample_metadata md = {};
>>> +	struct vport *input_vport;
>>> +	const struct nlattr *a;
>>> +	int rem;
>>> +
>>> +	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
>>> +	     a = nla_next(a, &rem)) {
>>
>> Since the action is strictly validated, can we use nla_for_each_attr()
>> or nla_for_each_nested() ?
>>
> 
> Probably, yes.
> 
>>> +		switch (nla_type(a)) {
>>> +		case OVS_EMIT_SAMPLE_ATTR_GROUP:
>>> +			psample_group.group_num = nla_get_u32(a);
>>> +			break;
>>> +
>>> +		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
>>> +			md.user_cookie = nla_data(a);
>>> +			md.user_cookie_len = nla_len(a);
>>> +			break;
>>> +		}
>>> +	}
>>> +
>>> +	psample_group.net = ovs_dp_get_net(dp);
>>> +
>>> +	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
>>> +	if (!input_vport)
>>> +		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
>>
>> We may need to check that we actually found the local port.
>>
> 
> Sure. What can cause the local port not to exist?

I would assume that since we're only protected by RCU here, there can be
a race with datapath destruction that will remove the local port.

Best regards, Ilya Maximets.


* Re: [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action
  2024-06-18  9:47       ` Ilya Maximets
@ 2024-06-18 10:08         ` Ilya Maximets
  0 siblings, 0 replies; 57+ messages in thread
From: Ilya Maximets @ 2024-06-18 10:08 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: i.maximets, netdev, aconole, echaudro, horms, dev, Donald Hunter,
	Jakub Kicinski, David S. Miller, Eric Dumazet, Paolo Abeni,
	Pravin B Shelar, linux-kernel

On 6/18/24 11:47, Ilya Maximets wrote:
> On 6/18/24 09:33, Adrián Moreno wrote:
>> On Mon, Jun 17, 2024 at 12:44:45PM GMT, Ilya Maximets wrote:
>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>> Add support for a new action: emit_sample.
>>>>
>>>> This action accepts a u32 group id and a variable-length cookie and uses
>>>> the psample multicast group to make the packet available for
>>>> observability.
>>>>
>>>> The maximum length of the user-defined cookie is set to 16, same as
>>>> tc_cookie, to discourage using cookies that will not be offloadable.
>>>>
>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>> ---
>>>>  Documentation/netlink/specs/ovs_flow.yaml | 17 ++++++++
>>>>  include/uapi/linux/openvswitch.h          | 25 ++++++++++++
>>>>  net/openvswitch/actions.c                 | 50 +++++++++++++++++++++++
>>>>  net/openvswitch/flow_netlink.c            | 33 ++++++++++++++-
>>>>  4 files changed, 124 insertions(+), 1 deletion(-)
>>>
>>> Some nits below, besides the ones already mentioned.
>>>
>>
>> Thanks, Ilya.
>>
>>>>
>>>> diff --git a/Documentation/netlink/specs/ovs_flow.yaml b/Documentation/netlink/specs/ovs_flow.yaml
>>>> index 4fdfc6b5cae9..a7ab5593a24f 100644
>>>> --- a/Documentation/netlink/specs/ovs_flow.yaml
>>>> +++ b/Documentation/netlink/specs/ovs_flow.yaml
>>>> @@ -727,6 +727,12 @@ attribute-sets:
>>>>          name: dec-ttl
>>>>          type: nest
>>>>          nested-attributes: dec-ttl-attrs
>>>> +      -
>>>> +        name: emit-sample
>>>> +        type: nest
>>>> +        nested-attributes: emit-sample-attrs
>>>> +        doc: |
>>>> +          Sends a packet sample to psample for external observation.
>>>>    -
>>>>      name: tunnel-key-attrs
>>>>      enum-name: ovs-tunnel-key-attr
>>>> @@ -938,6 +944,17 @@ attribute-sets:
>>>>        -
>>>>          name: gbp
>>>>          type: u32
>>>> +  -
>>>> +    name: emit-sample-attrs
>>>> +    enum-name: ovs-emit-sample-attr
>>>> +    name-prefix: ovs-emit-sample-attr-
>>>> +    attributes:
>>>> +      -
>>>> +        name: group
>>>> +        type: u32
>>>> +      -
>>>> +        name: cookie
>>>> +        type: binary
>>>>
>>>>  operations:
>>>>    name-prefix: ovs-flow-cmd-
>>>> diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
>>>> index efc82c318fa2..a0e9dde0584a 100644
>>>> --- a/include/uapi/linux/openvswitch.h
>>>> +++ b/include/uapi/linux/openvswitch.h
>>>> @@ -914,6 +914,30 @@ struct check_pkt_len_arg {
>>>>  };
>>>>  #endif
>>>>
>>>> +#define OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE 16
>>>> +/**
>>>> + * enum ovs_emit_sample_attr - Attributes for %OVS_ACTION_ATTR_EMIT_SAMPLE
>>>> + * action.
>>>> + *
>>>> + * @OVS_EMIT_SAMPLE_ATTR_GROUP: 32-bit number to identify the source of the
>>>> + * sample.
>>>> + * @OVS_EMIT_SAMPLE_ATTR_COOKIE: A variable-length binary cookie that contains
>>>> + * user-defined metadata. The maximum length is 16 bytes.
>>>
>>> s/16/OVS_EMIT_SAMPLE_COOKIE_MAX_SIZE/
>>>
>>>> + *
>>>> + * Sends the packet to the psample multicast group with the specified group and
>>>> + * cookie. It is possible to combine this action with the
>>>> + * %OVS_ACTION_ATTR_TRUNC action to limit the size of the packet being emitted.
>>>> + */
>>>> +enum ovs_emit_sample_attr {
>>>> +	OVS_EMIT_SAMPLE_ATTR_UNPSEC,
>>>> +	OVS_EMIT_SAMPLE_ATTR_GROUP,	/* u32 number. */
>>>> +	OVS_EMIT_SAMPLE_ATTR_COOKIE,	/* Optional, user specified cookie. */
>>>> +	__OVS_EMIT_SAMPLE_ATTR_MAX
>>>> +};
>>>> +
>>>> +#define OVS_EMIT_SAMPLE_ATTR_MAX (__OVS_EMIT_SAMPLE_ATTR_MAX - 1)
>>>> +
>>>> +
>>>>  /**
>>>>   * enum ovs_action_attr - Action types.
>>>>   *
>>>> @@ -1004,6 +1028,7 @@ enum ovs_action_attr {
>>>>  	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
>>>>  	OVS_ACTION_ATTR_DEC_TTL,      /* Nested OVS_DEC_TTL_ATTR_*. */
>>>>  	OVS_ACTION_ATTR_DROP,         /* u32 error code. */
>>>> +	OVS_ACTION_ATTR_EMIT_SAMPLE,  /* Nested OVS_EMIT_SAMPLE_ATTR_*. */
>>>>
>>>>  	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
>>>>  				       * from userspace. */
>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>> index 964225580824..3b4dba0ded59 100644
>>>> --- a/net/openvswitch/actions.c
>>>> +++ b/net/openvswitch/actions.c
>>>> @@ -24,6 +24,11 @@
>>>>  #include <net/checksum.h>
>>>>  #include <net/dsfield.h>
>>>>  #include <net/mpls.h>
>>>> +
>>>> +#if IS_ENABLED(CONFIG_PSAMPLE)
>>>> +#include <net/psample.h>
>>>> +#endif
>>>> +
>>>>  #include <net/sctp/checksum.h>
>>>>
>>>>  #include "datapath.h"
>>>> @@ -1299,6 +1304,46 @@ static int execute_dec_ttl(struct sk_buff *skb, struct sw_flow_key *key)
>>>>  	return 0;
>>>>  }
>>>>
>>>> +static int execute_emit_sample(struct datapath *dp, struct sk_buff *skb,
>>>> +			       const struct sw_flow_key *key,
>>>> +			       const struct nlattr *attr)
>>>> +{
>>>> +#if IS_ENABLED(CONFIG_PSAMPLE)
>>>> +	struct psample_group psample_group = {};
>>>> +	struct psample_metadata md = {};
>>>> +	struct vport *input_vport;
>>>> +	const struct nlattr *a;
>>>> +	int rem;
>>>> +
>>>> +	for (a = nla_data(attr), rem = nla_len(attr); rem > 0;
>>>> +	     a = nla_next(a, &rem)) {
>>>
>>> Since the action is strictly validated, can we use nla_for_each_attr()
>>> or nla_for_each_nested() ?
>>>
>>
>> Probably, yes.
>>
>>>> +		switch (nla_type(a)) {
>>>> +		case OVS_EMIT_SAMPLE_ATTR_GROUP:
>>>> +			psample_group.group_num = nla_get_u32(a);
>>>> +			break;
>>>> +
>>>> +		case OVS_EMIT_SAMPLE_ATTR_COOKIE:
>>>> +			md.user_cookie = nla_data(a);
>>>> +			md.user_cookie_len = nla_len(a);
>>>> +			break;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	psample_group.net = ovs_dp_get_net(dp);
>>>> +
>>>> +	input_vport = ovs_vport_rcu(dp, key->phy.in_port);
>>>> +	if (!input_vport)
>>>> +		input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);
>>>
>>> We may need to check that we actually found the local port.
>>>
>>
>> Sure. What can cause the local port not to exist?
> 
> I would assume that since we're only protected by RCU here, there can be
> a race with datapath destruction that will remove the local port.

But, actually, we don't even need to look anything up.  The original
input vport should be available in OVS_CB(skb)->input_vport.

Best regards, Ilya Maximets.
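
For readers following the attribute-walk question quoted above
(nla_for_each_attr() / nla_for_each_nested()), here is a rough userspace
sketch of what such a walk does. It is not kernel code: the model_* names
and the 4-byte header layout are stand-ins chosen for illustration, assuming
only that each attribute carries a length and a type and is padded to 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct nlattr (not the kernel type). */
struct model_attr_hdr {
	uint16_t len;	/* header + payload, excluding padding */
	uint16_t type;
};

#define MODEL_ALIGN(n)	(((size_t)(n) + 3u) & ~(size_t)3u)
#define MODEL_HDRLEN	sizeof(struct model_attr_hdr)

enum { MODEL_ATTR_UNSPEC, MODEL_ATTR_GROUP, MODEL_ATTR_COOKIE };

struct model_sample_args {
	uint32_t group;
	const uint8_t *cookie;	/* points into the caller's buffer */
	size_t cookie_len;
};

/* Append one attribute to buf and return the padded number of bytes written. */
static size_t model_put_attr(uint8_t *buf, uint16_t type,
			     const void *data, uint16_t len)
{
	struct model_attr_hdr h = { (uint16_t)(MODEL_HDRLEN + len), type };
	size_t padded = MODEL_ALIGN(h.len);

	memset(buf, 0, padded);
	memcpy(buf, &h, MODEL_HDRLEN);
	memcpy(buf + MODEL_HDRLEN, data, len);
	return padded;
}

/* Walk every attribute in buf; return 0 on success, -1 if one is malformed. */
static int model_parse_sample(const uint8_t *buf, size_t rem,
			      struct model_sample_args *out)
{
	memset(out, 0, sizeof(*out));

	while (rem >= MODEL_HDRLEN) {
		struct model_attr_hdr h;
		size_t payload, step;

		memcpy(&h, buf, MODEL_HDRLEN);
		if (h.len < MODEL_HDRLEN || h.len > rem)
			return -1;
		payload = h.len - MODEL_HDRLEN;

		switch (h.type) {
		case MODEL_ATTR_GROUP:
			if (payload != sizeof(out->group))
				return -1;
			memcpy(&out->group, buf + MODEL_HDRLEN, payload);
			break;
		case MODEL_ATTR_COOKIE:
			out->cookie = buf + MODEL_HDRLEN;
			out->cookie_len = payload;
			break;
		}

		/* Advance past the payload and its padding. */
		step = MODEL_ALIGN(h.len);
		if (step > rem)
			step = rem;
		buf += step;
		rem -= step;
	}
	return 0;
}
```

In the kernel the for-each macros hide the length bookkeeping done by hand
here, which is the readability gain the review comment is after.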


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-18  7:00       ` Adrián Moreno
@ 2024-06-18 10:22         ` Ilya Maximets
  2024-06-18 10:50           ` Adrián Moreno
  0 siblings, 1 reply; 57+ messages in thread
From: Ilya Maximets @ 2024-06-18 10:22 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: i.maximets, netdev, aconole, echaudro, horms, dev,
	Pravin B Shelar, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

On 6/18/24 09:00, Adrián Moreno wrote:
> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>> On 6/17/24 13:55, Ilya Maximets wrote:
>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>> observability-oriented.
>>>>
>>>> Apart from some corner case in which it's used as a replacement for clone()
>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>> local emit_sample.
>>>>
>>>> With this in mind, it doesn't make much sense to report
>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>
>>>> For instance, if the flow:
>>>>
>>>>   actions:sample(..,emit_sample(..)),2
>>>>
>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>> confusing for users since the packet did reach its destination.
>>>>
>>>> This patch makes internal action execution silently consume the skb
>>>> instead of notifying a drop for this case.
>>>>
>>>> Unfortunately, this patch does not remove all potential sources of
>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>
>>>>     actions:sample(..,emit_sample(..))
>>>>
>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>>>
>>>> Sadly, this case is difficult to solve without breaking the
>>>> optimization by which the skb is not cloned on last sample actions.
>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>> after the last sample() and rewrite the flow as:
>>>>
>>>>     actions:sample(..,emit_sample(..)),drop
>>>>
>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>> ---
>>>>  net/openvswitch/actions.c | 13 +++++++++++--
>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>> --- a/net/openvswitch/actions.c
>>>> +++ b/net/openvswitch/actions.c
>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>
>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>> +{
>>>> +	/* Do not emit packet drops inside sample(). */
>>>> +	if (OVS_CB(skb)->probability)
>>>> +		consume_skb(skb);
>>>> +	else
>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>> +}
>>>> +
>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>   * space. Return NULL if out of key spaces.
>>>>   */
>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>>  	if ((arg->probability != U32_MAX) &&
>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>>>  		if (last)
>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>> +			ovs_drop_skb_last_action(skb);
>>
>> Always consuming the skb at this point makes sense, since having sample()
>> as a last action is a reasonable thing to have.  But this looks more like
>> a fix for the original drop reason patch set.
>>
> 
> I don't think consuming the skb at this point makes sense. It was very
> intentionally changed to a drop since a very common use-case for
> sampling is drop-sampling, i.e: replacing an empty action list (that
> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
> that replacement should not have any effect on the number of
> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
> the same way (only observed in one case).
> 
> 
>>>>  		return 0;
>>>>  	}
>>>>
>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>>>  		}
>>>>  	}
>>>>
>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>> +	ovs_drop_skb_last_action(skb);
>>>
>>> I don't think I agree with this one.  If we have a sample() action with
>>> a lot of different actions inside and we reached the end while the last
>>> action didn't consume the skb, then we should report that.  E.g.
>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
>>>
> 
> What is the use case for such action list? Having an action branch
> executed randomly doesn't make sense to me if it's not some
> observability thing (which IMHO should not trigger drops).

It is exactly my point.  A list of actions that doesn't end in some sort
of a terminal action (output, drop, etc) does not make a lot of sense and
hence should be signaled as an unexpected drop, so users can re-check the
pipeline in case they missed the terminal action somehow.

> 
>>> The only actions that are actually consuming the skb are "output",
>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
>>> consuming the skb "naturally" by stealing it when it is the last action.
>>> "userspace" has an explicit check to consume the skb if it is the last
>>> action.  "emit_sample" should have the similar check.  It should likely
>>> be added at the point of action introduction instead of having a separate
>>> patch.
>>>
> 
> Unlike "output", "recirc", "userspace", etc. with emit_sample the
> packet does not continue its way through the datapath.

After "output" the packet leaves the datapath too, i.e. does not continue
its way through the OVS datapath.

> 
> It would be very confusing if OVS starts monitoring drops and adds a bunch
> of flows such as "actions:emit_sample()" and suddenly it stops reporting such
> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
> we are just observing them.

This might make sense from the higher logic in user space application, but
it doesn't from the datapath perspective.  And also, if the user adds the
'emit_sample' action for drop monitoring, they already know where to find
packet samples, they don't need to use tools like dropwatch anymore.
This packet is not dropped from the datapath perspective, it is sampled.

> 
> And if we change emit_sample to trigger a drop if it's the last action,
> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
> which is also terribly confusing.

If emit_sample is the last action, then skb should be consumed silently.
The same as for "output" and "userspace".

> 
> I think we should try to be clear and informative with what we
> _actually_ drop and not require the user that is just running
> "dropwatch" to understand the internals of the OVS module.

If someone is already using sampling to watch their packet drops, why would
they use dropwatch?

> 
> So if you don't want to accept the "observational" nature of sample(),
> the only other solution that does not bring even more confusion to OVS
> drops would be to have userspace add explicit drop actions. WDYT?
> 

These are not drops from the datapath perspective.  Users can add explicit
drop actions if they want to, but I'm really not sure why they would do that
if they are already capturing all these packets in psample, sFlow or IPFIX.

Best regards, Ilya Maximets.
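
As a reading aid, the rule argued for in this reply can be written as a small
userspace model. This is only a sketch of the proposal as stated in the
thread (actions that consume the skb when last end the list silently, and any
other ending is reported as OVS_DROP_LAST_ACTION); the enum and function
names are made up for illustration and are not kernel symbols.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical action tags, covering only the actions named in the thread. */
enum model_action {
	MODEL_ACT_OUTPUT,
	MODEL_ACT_USERSPACE,
	MODEL_ACT_RECIRC,
	MODEL_ACT_EMIT_SAMPLE,
	MODEL_ACT_PUSH_VLAN,
	MODEL_ACT_SET,
};

/* True for actions that, per the thread, consume the skb when they are last. */
static bool model_consumes_when_last(enum model_action a)
{
	switch (a) {
	case MODEL_ACT_OUTPUT:
	case MODEL_ACT_USERSPACE:
	case MODEL_ACT_RECIRC:
	case MODEL_ACT_EMIT_SAMPLE:
		return true;
	default:
		return false;
	}
}

/*
 * True if running off the end of this action list would be reported as
 * OVS_DROP_LAST_ACTION under the proposed rule. An empty list is treated
 * as a reported drop, matching the "empty action list" case in the thread.
 */
static bool model_reports_last_action_drop(const enum model_action *acts,
					   size_t n)
{
	if (n == 0)
		return true;
	return !model_consumes_when_last(acts[n - 1]);
}
```

With this model, the inner list of "sample(emit_sample(),push_vlan(),set(eth()))"
is reported as a drop, while "sample(push_vlan(),emit_sample())" is not.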

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-18 10:22         ` Ilya Maximets
@ 2024-06-18 10:50           ` Adrián Moreno
  2024-06-18 15:44             ` Ilya Maximets
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-18 10:50 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: netdev, aconole, echaudro, horms, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
> On 6/18/24 09:00, Adrián Moreno wrote:
> > On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
> >> On 6/17/24 13:55, Ilya Maximets wrote:
> >>> On 6/3/24 20:56, Adrian Moreno wrote:
> >>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> >>>> observability-oriented.
> >>>>
> >>>> Apart from some corner case in which it's used as a replacement for clone()
> >>>> for old kernels, it's really only used for sFlow, IPFIX and now,
> >>>> local emit_sample.
> >>>>
> >>>> With this in mind, it doesn't make much sense to report
> >>>> OVS_DROP_LAST_ACTION inside sample actions.
> >>>>
> >>>> For instance, if the flow:
> >>>>
> >>>>   actions:sample(..,emit_sample(..)),2
> >>>>
> >>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> >>>> confusing for users since the packet did reach its destination.
> >>>>
> >>>> This patch makes internal action execution silently consume the skb
> >>>> instead of notifying a drop for this case.
> >>>>
> >>>> Unfortunately, this patch does not remove all potential sources of
> >>>> confusion since, if the sample action itself is the last action, e.g:
> >>>>
> >>>>     actions:sample(..,emit_sample(..))
> >>>>
> >>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> >>>>
> >>>> Sadly, this case is difficult to solve without breaking the
> >>>> optimization by which the skb is not cloned on last sample actions.
> >>>> But, given explicit drop actions are now supported, OVS can just add one
> >>>> after the last sample() and rewrite the flow as:
> >>>>
> >>>>     actions:sample(..,emit_sample(..)),drop
> >>>>
> >>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >>>> ---
> >>>>  net/openvswitch/actions.c | 13 +++++++++++--
> >>>>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >>>> index 33f6d93ba5e4..54fc1abcff95 100644
> >>>> --- a/net/openvswitch/actions.c
> >>>> +++ b/net/openvswitch/actions.c
> >>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
> >>>>  static struct action_flow_keys __percpu *flow_keys;
> >>>>  static DEFINE_PER_CPU(int, exec_actions_level);
> >>>>
> >>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> >>>> +{
> >>>> +	/* Do not emit packet drops inside sample(). */
> >>>> +	if (OVS_CB(skb)->probability)
> >>>> +		consume_skb(skb);
> >>>> +	else
> >>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>> +}
> >>>> +
> >>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
> >>>>   * space. Return NULL if out of key spaces.
> >>>>   */
> >>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>>  	if ((arg->probability != U32_MAX) &&
> >>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>>>  		if (last)
> >>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>> +			ovs_drop_skb_last_action(skb);
> >>
> >> Always consuming the skb at this point makes sense, since having sample()
> >> as a last action is a reasonable thing to have.  But this looks more like
> >> a fix for the original drop reason patch set.
> >>
> >
> > I don't think consuming the skb at this point makes sense. It was very
> > intentionally changed to a drop since a very common use-case for
> > sampling is drop-sampling, i.e: replacing an empty action list (that
> > triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
> > that replacement should not have any effect on the number of
> > OVS_DROP_LAST_ACTION being reported as the packets are being treated in
> > the same way (only observed in one case).
> >
> >
> >>>>  		return 0;
> >>>>  	}
> >>>>
> >>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>>>  		}
> >>>>  	}
> >>>>
> >>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>> +	ovs_drop_skb_last_action(skb);
> >>>
> >>> I don't think I agree with this one.  If we have a sample() action with
> >>> a lot of different actions inside and we reached the end while the last
> >>> action didn't consume the skb, then we should report that.  E.g.
> >>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> >>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> >>>
> >
> > What is the use case for such action list? Having an action branch
> > executed randomly doesn't make sense to me if it's not some
> > observability thing (which IMHO should not trigger drops).
>
> It is exactly my point.  A list of actions that doesn't end in some sort
> of a terminal action (output, drop, etc) does not make a lot of sense and
> hence should be signaled as an unexpected drop, so users can re-check the
> pipeline in case they missed the terminal action somehow.
>
> >
> >>> The only actions that are actually consuming the skb are "output",
> >>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> >>> consuming the skb "naturally" by stealing it when it is the last action.
> >>> "userspace" has an explicit check to consume the skb if it is the last
> >>> action.  "emit_sample" should have the similar check.  It should likely
> >>> be added at the point of action introduction instead of having a separate
> >>> patch.
> >>>
> >
> > Unlike "output", "recirc", "userspace", etc. with emit_sample the
> > packet does not continue its way through the datapath.
>
> After "output" the packet leaves the datapath too, i.e. does not continue
> its way through the OVS datapath.
>

I meant a broader concept of "datapath". The packet continues. For the
userspace action this is true only for the CONTROLLER ofp action but
since the datapath does not know which action it's implementing, we
cannot do better.

> >
> > It would be very confusing if OVS starts monitoring drops and adds a bunch
> > of flows such as "actions:emit_sample()" and suddenly it stops reporting such
> > drops via standard kfree_skb_reason. Packets _are_ being dropped here,
> > we are just observing them.
>
> This might make sense from the higher logic in user space application, but
> it doesn't from the datapath perspective.  And also, if the user adds the
> 'emit_sample' action for drop monitoring, they already know where to find
> packet samples, they don't need to use tools like dropwatch anymore.
> This packet is not dropped from the datapath perspective, it is sampled.
>
> >
> > And if we change emit_sample to trigger a drop if it's the last action,
> > then "sample(50%, emit_sample()),2" will trigger a drop half of the times
> > which is also terribly confusing.
>
> If emit_sample is the last action, then skb should be consumed silently.
> The same as for "output" and "userspace".
>
> >
> > I think we should try to be clear and informative with what we
> > _actually_ drop and not require the user that is just running
> > "dropwatch" to understand the internals of the OVS module.
>
> If someone is already using sampling to watch their packet drops, why would
> they use dropwatch?
>
> >
> > So if you don't want to accept the "observational" nature of sample(),
> > the only other solution that does not bring even more confusion to OVS
> > drops would be to have userspace add explicit drop actions. WDYT?
> >
>
> These are not drops from the datapath perspective.  Users can add explicit
> drop actions if they want to, but I'm really not sure why they would do that
> if they are already capturing all these packets in psample, sFlow or IPFIX.

Because there is not a single "user". Tools and systems can be built on
top of tracepoints and samples and they might not be coordinated between
them. Some observability application can be always enabled and doing
constant network monitoring or statistics while other lower level tools
can be run at certain moments to troubleshoot issues.

In order to run dropwatch in a node you don't need to have rights to
access the OpenFlow controller and ask it to change the OpenFlow rules
or else dropwatch simply will not show actual packet drops.

To me it seems obvious that drop sampling (via emit_sample) "includes"
drop reporting via emit_sample. In both cases you get the packet
headers, but in one case you also get OFP controller metadata. Now even
if there is a system that uses both, does it make sense to push to them
the responsibility of dealing with them being mutually exclusive?

I think this makes debugging OVS datapath unnecessarily obscure when we
know the packet is actually being dropped intentionally by OVS.

What's the problem with having OVS write the following?
    "sample(50%, emit_sample()),drop(0)"

Thanks,
Adrián
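
To make the "sample(50%, ...)" examples concrete, the skip test from the
quoted sample() hunk can be modeled in userspace as below. The predicate
mirrors the condition in the diff; the percent-to-u32 scaling is an
assumption made for this sketch (the thread does not show how userspace
encodes the percentage), and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Mirrors the condition in the quoted hunk: a probability of UINT32_MAX
 * always samples, 0 never samples, and otherwise the packet is skipped
 * when the random draw exceeds the probability.
 */
static bool model_sample_is_skipped(uint32_t probability, uint32_t rnd)
{
	return probability != UINT32_MAX &&
	       (probability == 0 || rnd > probability);
}

/* Assumed linear scaling of a percentage onto the u32 range. */
static uint32_t model_percent_to_probability(unsigned int percent)
{
	if (percent >= 100)
		return UINT32_MAX;
	return (uint32_t)(((uint64_t)UINT32_MAX * percent) / 100);
}
```

Under this model a 50% sample() skips its nested actions for roughly half
of the random draws, which is the case where the last-action handling
debated in this thread decides between a silent consume and a reported drop.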


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test
  2024-06-18  9:08       ` Adrián Moreno
@ 2024-06-18 13:27         ` Aaron Conole
  0 siblings, 0 replies; 57+ messages in thread
From: Aaron Conole @ 2024-06-18 13:27 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: netdev, echaudro, horms, i.maximets, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Shuah Khan, linux-kselftest, linux-kernel

Adrián Moreno <amorenoz@redhat.com> writes:

> On Mon, Jun 17, 2024 at 07:18:05AM GMT, Adrián Moreno wrote:
>> On Fri, Jun 14, 2024 at 01:07:33PM GMT, Aaron Conole wrote:
>> > Adrian Moreno <amorenoz@redhat.com> writes:
>> >
>> > > Add a test to verify sampling packets via psample works.
>> > >
>> > > In order to do that, create a subcommand in ovs-dpctl.py to listen
>> > > on the psample multicast group and print samples.
>> > >
>> > > In order to also test simultaneous sFlow and psample actions and
>> > > packet truncation, add missing parsing support for "userspace" and
>> > > "trunc" actions.
>> >
>> > Maybe split that into a separate patch.  This has a bugfix and 3
>> > features being pushed in.  I know it's already getting long as a series,
>> > so maybe it's okay to fold the userspace attribute bugfix with the parse
>> > support (since it wasn't really usable before).
>> >
>>
>> OK. Sounds reasonable.
>>
>> > > Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>> > > ---
>> > >  .../selftests/net/openvswitch/openvswitch.sh  |  99 +++++++++++++++-
>> > >  .../selftests/net/openvswitch/ovs-dpctl.py    | 112 +++++++++++++++++-
>> > >  2 files changed, 204 insertions(+), 7 deletions(-)
>> > >
>> > > diff --git a/tools/testing/selftests/net/openvswitch/openvswitch.sh b/tools/testing/selftests/net/openvswitch/openvswitch.sh
>> > > index 5cae53543849..f6e0ae3f6424 100755
>> > > --- a/tools/testing/selftests/net/openvswitch/openvswitch.sh
>> > > +++ b/tools/testing/selftests/net/openvswitch/openvswitch.sh
>> > > @@ -20,7 +20,8 @@ tests="
>> > >  	nat_related_v4				ip4-nat-related: ICMP related matches work with SNAT
>> > >  	netlink_checks				ovsnl: validate netlink attrs and settings
>> > >  	upcall_interfaces			ovs: test the upcall interfaces
>> > > -	drop_reason				drop: test drop reasons are emitted"
>> > > +	drop_reason				drop: test drop reasons are emitted
>> > > +	emit_sample 				emit_sample: Sampling packets with psample"
>> > >
>> > >  info() {
>> > >      [ $VERBOSE = 0 ] || echo $*
>> > > @@ -170,6 +171,19 @@ ovs_drop_reason_count()
>> > >  	return `echo "$perf_output" | grep "$pattern" | wc -l`
>> > >  }
>> > >
>> > > +ovs_test_flow_fails () {
>> > > +	ERR_MSG="Flow actions may not be safe on all matching packets"
>> > > +
>> > > +	PRE_TEST=$(dmesg | grep -c "${ERR_MSG}")
>> > > +	ovs_add_flow $@ &> /dev/null && return 1
>> > > +	POST_TEST=$(dmesg | grep -c "${ERR_MSG}")
>> > > +
>> > > +	if [ "$PRE_TEST" == "$POST_TEST" ]; then
>> > > +		return 1
>> > > +	fi
>> > > +	return 0
>> > > +}
>> > > +
>> > >  usage() {
>> > >  	echo
>> > >  	echo "$0 [OPTIONS] [TEST]..."
>> > > @@ -184,6 +198,89 @@ usage() {
>> > >  	exit 1
>> > >  }
>> > >
>> > > +
>> > > +# emit_sample test
>> > > +# - use emit_sample to observe packets
>> > > +test_emit_sample() {
>> > > +	sbx_add "test_emit_sample" || return $?
>> > > +
>> > > +	# Add a datapath with per-vport dispatching.
>> > > +	ovs_add_dp "test_emit_sample" emit_sample -V 2:1 || return 1
>> > > +
>> > > +	info "create namespaces"
>> > > +	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
>> > > +		client c0 c1 172.31.110.10/24 -u || return 1
>> > > +	ovs_add_netns_and_veths "test_emit_sample" "emit_sample" \
>> > > +		server s0 s1 172.31.110.20/24 -u || return 1
>> > > +
>> > > +	# Check if emit_sample actions can be configured.
>> > > +	ovs_add_flow "test_emit_sample" emit_sample \
>> > > +	'in_port(1),eth(),eth_type(0x0806),arp()' 'emit_sample(group=1)'
>> > > +	if [ $? == 1 ]; then
>> > > +		info "no support for emit_sample - skipping"
>> > > +		ovs_exit_sig
>> > > +		return $ksft_skip
>> > > +	fi
>> > > +
>> > > +	ovs_del_flows "test_emit_sample" emit_sample
>> > > +
>> > > +	# Allow ARP
>> > > +	ovs_add_flow "test_emit_sample" emit_sample \
>> > > +		'in_port(1),eth(),eth_type(0x0806),arp()' '2' || return 1
>> > > +	ovs_add_flow "test_emit_sample" emit_sample \
>> > > +		'in_port(2),eth(),eth_type(0x0806),arp()' '1' || return 1
>> > > +
>> > > +	# Test action verification.
>> > > +	OLDIFS=$IFS
>> > > +	IFS='*'
>> > > +	min_key='in_port(1),eth(),eth_type(0x0800),ipv4()'
>> > > +	for testcase in \
>> > > +		"cookie to large"*"emit_sample(group=1,cookie=1615141312111009080706050403020100)" \
>> > > +		"no group with cookie"*"emit_sample(cookie=abcd)" \
>> > > +		"no group"*"sample()";
>> > > +	do
>> > > +		set -- $testcase;
>> > > +		ovs_test_flow_fails "test_emit_sample" emit_sample $min_key $2
>> > > +		if [ $? == 1 ]; then
>> > > +			info "failed - $1"
>> > > +			return 1
>> > > +		fi
>> > > +	done
>> > > +	IFS=$OLDIFS
>> > > +
>> > > +	# Sample first 14 bytes of all traffic.
>> > > +	ovs_add_flow "test_emit_sample" emit_sample \
>> > > +	"in_port(1),eth(),eth_type(0x0800),ipv4(src=172.31.110.10,proto=1),icmp()" "trunc(14),emit_sample(group=1,cookie=c0ffee),2"
>> > > +
>> > > +	# Sample all traffic. In this case, use a sample() action with both
>> > > +	# emit_sample and an upcall emulating simultaneous local sampling and
>> > > +	# sFlow / IPFIX.
>> > > +	nlpid=$(grep -E "listening on upcall packet handler" $ovs_dir/s0.out | cut -d ":" -f 2 | tr -d ' ')
>> > > +	ovs_add_flow "test_emit_sample" emit_sample \
>> > > +	"in_port(2),eth(),eth_type(0x0800),ipv4(src=172.31.110.20,proto=1),icmp()" "sample(sample=100%,actions(emit_sample(group=2,cookie=eeff0c),userspace(pid=${nlpid},userdata=eeff0c))),1"
>> > > +
>> > > +	# Record emit_sample data.
>> > > +	python3 $ovs_base/ovs-dpctl.py psample >$ovs_dir/psample.out 2>$ovs_dir/psample.err &
>> > > +	pid=$!
>> > > +	on_exit "ovs_sbx test_emit_sample kill -TERM $pid 2>/dev/null"
>> >
>> >   Maybe ovs_netns_spawn_daemon ?
>> >
>>
>> I'll take a look at it, thanks.
>>
>
> I've looked into ovs_netns_spawn_daemon and I think it'll not be useful
> for this command since it needs to run in the default namespace. I can
> add a new "ovs_spawn_daemon" so it's reusable. WDYT?

Okay

>> [...]


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-18 10:50           ` Adrián Moreno
@ 2024-06-18 15:44             ` Ilya Maximets
  2024-06-19  6:35               ` Adrián Moreno
  0 siblings, 1 reply; 57+ messages in thread
From: Ilya Maximets @ 2024-06-18 15:44 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: i.maximets, netdev, aconole, echaudro, horms, dev,
	Pravin B Shelar, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

On 6/18/24 12:50, Adrián Moreno wrote:
> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
>> On 6/18/24 09:00, Adrián Moreno wrote:
>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>>>> On 6/17/24 13:55, Ilya Maximets wrote:
>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>>>> observability-oriented.
>>>>>>
>>>>>> Apart from some corner case in which it's used as a replacement for clone()
>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>>>> local emit_sample.
>>>>>>
>>>>>> With this in mind, it doesn't make much sense to report
>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>>>
>>>>>> For instance, if the flow:
>>>>>>
>>>>>>   actions:sample(..,emit_sample(..)),2
>>>>>>
>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>>>> confusing for users since the packet did reach its destination.
>>>>>>
>>>>>> This patch makes internal action execution silently consume the skb
>>>>>> instead of notifying a drop for this case.
>>>>>>
>>>>>> Unfortunately, this patch does not remove all potential sources of
>>>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>>>
>>>>>>     actions:sample(..,emit_sample(..))
>>>>>>
>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>>>>>
>>>>>> Sadly, this case is difficult to solve without breaking the
>>>>>> optimization by which the skb is not cloned on last sample actions.
>>>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>>>> after the last sample() and rewrite the flow as:
>>>>>>
>>>>>>     actions:sample(..,emit_sample(..)),drop
>>>>>>
>>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>>>> ---
>>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>
>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>>>> --- a/net/openvswitch/actions.c
>>>>>> +++ b/net/openvswitch/actions.c
>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>>>
>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>>>> +{
>>>>>> +	/* Do not emit packet drops inside sample(). */
>>>>>> +	if (OVS_CB(skb)->probability)
>>>>>> +		consume_skb(skb);
>>>>>> +	else
>>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>> +}
>>>>>> +
>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>>>   * space. Return NULL if out of key spaces.
>>>>>>   */
>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>>>>  	if ((arg->probability != U32_MAX) &&
>>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>>>>>  		if (last)
>>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>> +			ovs_drop_skb_last_action(skb);
>>>>
>>>> Always consuming the skb at this point makes sense, since having sample()
>>>> as a last action is a reasonable thing to have.  But this looks more like
>>>> a fix for the original drop reason patch set.
>>>>
>>>
>>> I don't think consuming the skb at this point makes sense. It was very
>>> intentionally changed to a drop since a very common use-case for
>>> sampling is drop-sampling, i.e: replacing an empty action list (that
>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
>>> that replacement should not have any effect on the number of
>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
>>> the same way (only observed in one case).
>>>
>>>
>>>>>>  		return 0;
>>>>>>  	}
>>>>>>
>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>>>>>  		}
>>>>>>  	}
>>>>>>
>>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>> +	ovs_drop_skb_last_action(skb);
>>>>>
>>>>> I don't think I agree with this one.  If we have a sample() action with
>>>>> a lot of different actions inside and we reached the end while the last
>>>>> action didn't consume the skb, then we should report that.  E.g.
>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
>>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
>>>>>
>>>
>>> What is the use case for such action list? Having an action branch
>>> executed randomly doesn't make sense to me if it's not some
>>> observability thing (which IMHO should not trigger drops).
>>
>> It is exactly my point.  A list of actions that doesn't end in some sort
>> of a terminal action (output, drop, etc) does not make a lot of sense and
>> hence should be signaled as an unexpected drop, so users can re-check the
>> pipeline in case they missed the terminal action somehow.
>>
>>>
>>>>> The only actions that are actually consuming the skb are "output",
>>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
>>>>> consuming the skb "naturally" by stealing it when it is the last action.
>>>>> "userspace" has an explicit check to consume the skb if it is the last
>>>>> action.  "emit_sample" should have the similar check.  It should likely
>>>>> be added at the point of action introduction instead of having a separate
>>>>> patch.
>>>>>
>>>
>>> Unlike "output", "recirc", "userspace", etc. with emit_sample the
>>> packet does not continue its way through the datapath.
>>
>> After "output" the packet leaves the datapath too, i.e. does not continue
>> its way through the OVS datapath.
>>
> 
> I meant a broader concept of "datapath". The packet continues. For the
> userspace action this is true only for the CONTROLLER ofp action but
> since the datapath does not know which action it's implementing, we
> cannot do better.

It's not only controller() action.  Packets can be brought to userspace
for various reasons including just an explicit ask to execute some actions
in userspace.  In any case the packet sent to userspace kind of reached its
destination and it's not the "datapath drops the packet" situation.

> 
>>>
>>> It would be very confusing if OVS starts monitoring drops and adds a bunch
>>> of flows such as "actions:emit_sample()" and suddenly it stops reporting such
>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
>>> we are just observing them.
>>
>> This might make sense from the higher logic in user space application, but
>> it doesn't from the datapath perspective.  And also, if the user adds the
>> 'emit_sample' action for drop monitoring, they already know where to find
>> packet samples, they don't need to use tools like dropwatch anymore.
>> This packet is not dropped from the datapath perspective, it is sampled.
>>
>>>
>>> And if we change emit_sample to trigger a drop if it's the last action,
>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
>>> which is also terribly confusing.
>>
>> If emit_sample is the last action, then skb should be consumed silently.
>> The same as for "output" and "userspace".
>>
>>>
>>> I think we should try to be clear and informative with what we
>>> _actually_ drop and not require the user that is just running
>>> "dropwatch" to understand the internals of the OVS module.
>>
>> If someone is already using sampling to watch their packet drops, why would
>> they use dropwatch?
>>
>>>
>>> So if you don't want to accept the "observational" nature of sample(),
>>> the only other solution that does not bring even more confusion to OVS
>>> drops would be to have userspace add explicit drop actions. WDYT?
>>>
>>
>> These are not drops from the datapath perspective.  Users can add explicit
>> drop actions if they want to, but I'm really not sure why they would do that
>> if they are already capturing all these packets in psample, sFlow or IPFIX.
> 
> Because there is not a single "user". Tools and systems can be built on
> top of tracepoints and samples and they might not be coordinated between
> them. Some observability application can be always enabled and doing
> constant network monitoring or statistics while other lower level tools
> can be run at certain moments to troubleshoot issues.
> 
> In order to run dropwatch in a node you don't need to have rights to
> access the OpenFlow controller and ask it to change the OpenFlow rules
> or else dropwatch simply will not show actual packet drops.

The point is that these are not drops in this scenario.  The packet was
delivered to its destination and hence should not be reported as dropped.
In the observability use-case that you're describing even the OpenFlow layer
in OVS doesn't know if these are supposed to be treated as packet drops for
the user or if these are just samples with the sampling being the only
intended destination.  For OpenFlow and OVS userspace components these
two scenarios are indistinguishable.  Only the OpenFlow controller knows
that these rules were put in place because it was an ACL created by some
user or tool.  And since OVS in user space can't make such a distinction,
kernel can't make it either, and so shouldn't guess what the user two
levels of abstraction higher up meant.

> 
> To me it seems obvious that drop sampling (via emit_sample) "includes"
> drop reporting via emit_sample. In both cases you get the packet
> headers, but in one case you also get OFP controller metadata. Now even
> if there is a system that uses both, does it make sense to push to them
> the responsibility of dealing with them being mutually exclusive?
> 
> I think this makes debugging OVS datapath unnecessarily obscure when we
> know the packet is actually being dropped intentionally by OVS.

I don't think we know that we're in a drop sampling scenario.  We don't
have enough information even in OVS userspace to tell.

And having different behavior between "userspace" and "emit_sample" in
the kernel may cause even more confusion, because now two ways of sampling
packets will result in packets showing up in dropwatch in one case, but
not in the other.

> 
> What's the problem with having OVS write the following?
>     "sample(50%, emit_sample()),drop(0)"

It's a valid sequence of actions, but we shouldn't guess what the end
user meant by putting those actions into the kernel.  If we see such a
sequence in the kernel, then we should report an explicit drop.  If
there was only the "sample(50%, emit_sample())" then we should simply
consume the skb as it reached its destination in the psample.

As for whether OVS in user space should add an explicit drop action
while preparing to emit a sample, this doesn't sound reasonable for the
same reason - OVS in user space doesn't know what the intention was of
the user or tool that put the sampling action into the OpenFlow pipeline.


I actually became more confused about what we are arguing about.
To recap:

                                     This patch     My proposal

1. emit_sample() is the last            consume        consume  
    inside the sample()

2. the end of the action list           consume        drop
    inside the sample()

3. emit_sample() is the last            drop           consume
    outside the sample()

4. the end of the action list           drop           drop
    outside the sample()

5. sample() is the last action          consume        consume
    and probability failed


I don't think cases 1 and 3 should differ, i.e. the behavior should
be the same regardless of emit_sample() being inside or outside of
the sample().  As a side point, OVS in user space will omit the 100%
rate sample() action and will just list inner actions instead.  This
means that 100% probability sampling will generate drops and 99% will
not.  Doesn't sound right.

Case 2 should likely never happen, but I'd like to see a drop reported
if that ever happens, because it is not a meaningful list of actions.

Best regards, Ilya Maximets.
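
As a reading aid, the recap table above can be written down as a small
userspace C model.  This is only a sketch: the names used here
(disposition, scenario, patch_behaviour, proposed_behaviour,
recap_self_check) are hypothetical and none of this is code from
net/openvswitch.  It just encodes the two columns of the table so the
differing rows (2 and 3) are easy to spot.

```c
#include <assert.h>
#include <stdbool.h>

/* How the skb is released at the end of execution (illustrative names). */
enum disposition { CONSUME, DROP };

/* One row of the recap table. */
struct scenario {
	bool inside_sample;      /* executing the actions nested in sample() */
	bool ended_on_emit;      /* emit_sample() was the last action */
	bool probability_failed; /* case 5: sample() was last and did not fire */
};

/* "This patch" column: the outcome depends on being inside sample(). */
static enum disposition patch_behaviour(struct scenario s)
{
	if (s.probability_failed)
		return CONSUME;
	return s.inside_sample ? CONSUME : DROP;
}

/* "My proposal" column: the outcome depends on the last action only. */
static enum disposition proposed_behaviour(struct scenario s)
{
	if (s.probability_failed)
		return CONSUME;
	return s.ended_on_emit ? CONSUME : DROP;
}

/* Walks the five rows of the recap and checks both columns. */
static void recap_self_check(void)
{
	const struct scenario rows[5] = {
		{ true,  true,  false }, /* 1 */
		{ true,  false, false }, /* 2 */
		{ false, true,  false }, /* 3 */
		{ false, false, false }, /* 4 */
		{ false, false, true  }, /* 5 */
	};
	const enum disposition patch[5] = { CONSUME, CONSUME, DROP, DROP, CONSUME };
	const enum disposition proposal[5] = { CONSUME, DROP, CONSUME, DROP, CONSUME };

	for (int i = 0; i < 5; i++) {
		assert(patch_behaviour(rows[i]) == patch[i]);
		assert(proposed_behaviour(rows[i]) == proposal[i]);
	}
}
```

In this model the proposal column never looks at inside_sample, which is
the property argued for above: cases 1 and 3 give the same result.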

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-18 15:44             ` Ilya Maximets
@ 2024-06-19  6:35               ` Adrián Moreno
  2024-06-19 18:21                 ` Ilya Maximets
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-19  6:35 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: netdev, aconole, echaudro, horms, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
> On 6/18/24 12:50, Adrián Moreno wrote:
> > On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
> >> On 6/18/24 09:00, Adrián Moreno wrote:
> >>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
> >>>> On 6/17/24 13:55, Ilya Maximets wrote:
> >>>>> On 6/3/24 20:56, Adrian Moreno wrote:
> >>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> >>>>>> observability-oriented.
> >>>>>>
> >>>>>> Apart from some corner case in which it's used a replacement of clone()
> >>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
> >>>>>> local emit_sample.
> >>>>>>
> >>>>>> With this in mind, it doesn't make much sense to report
> >>>>>> OVS_DROP_LAST_ACTION inside sample actions.
> >>>>>>
> >>>>>> For instance, if the flow:
> >>>>>>
> >>>>>>   actions:sample(..,emit_sample(..)),2
> >>>>>>
> >>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> >>>>>> confusing for users since the packet did reach its destination.
> >>>>>>
> >>>>>> This patch makes internal action execution silently consume the skb
> >>>>>> instead of notifying a drop for this case.
> >>>>>>
> >>>>>> Unfortunately, this patch does not remove all potential sources of
> >>>>>> confusion since, if the sample action itself is the last action, e.g:
> >>>>>>
> >>>>>>     actions:sample(..,emit_sample(..))
> >>>>>>
> >>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> >>>>>>
> >>>>>> Sadly, this case is difficult to solve without breaking the
> >>>>>> optimization by which the skb is not cloned on last sample actions.
> >>>>>> But, given explicit drop actions are now supported, OVS can just add one
> >>>>>> after the last sample() and rewrite the flow as:
> >>>>>>
> >>>>>>     actions:sample(..,emit_sample(..)),drop
> >>>>>>
> >>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >>>>>> ---
> >>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
> >>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>>>>>
> >>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
> >>>>>> --- a/net/openvswitch/actions.c
> >>>>>> +++ b/net/openvswitch/actions.c
> >>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
> >>>>>>  static struct action_flow_keys __percpu *flow_keys;
> >>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
> >>>>>>
> >>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> >>>>>> +{
> >>>>>> +	/* Do not emit packet drops inside sample(). */
> >>>>>> +	if (OVS_CB(skb)->probability)
> >>>>>> +		consume_skb(skb);
> >>>>>> +	else
> >>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>> +}
> >>>>>> +
> >>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
> >>>>>>   * space. Return NULL if out of key spaces.
> >>>>>>   */
> >>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>>>>  	if ((arg->probability != U32_MAX) &&
> >>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>>>>>  		if (last)
> >>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>> +			ovs_drop_skb_last_action(skb);
> >>>>
> >>>> Always consuming the skb at this point makes sense, since having smaple()
> >>>> as a last action is a reasonable thing to have.  But this looks more like
> >>>> a fix for the original drop reason patch set.
> >>>>
> >>>
> >>> I don't think consuming the skb at this point makes sense. It was very
> >>> intentionally changed to a drop since a very common use-case for
> >>> sampling is drop-sampling, i.e: replacing an empty action list (that
> >>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
> >>> that replacement should not have any effect on the number of
> >>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
> >>> the same way (only observed in one case).
> >>>
> >>>
> >>>>>>  		return 0;
> >>>>>>  	}
> >>>>>>
> >>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>>>>>  		}
> >>>>>>  	}
> >>>>>>
> >>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>> +	ovs_drop_skb_last_action(skb);
> >>>>>
> >>>>> I don't think I agree with this one.  If we have a sample() action with
> >>>>> a lot of different actions inside and we reached the end while the last
> >>>>> action didn't consume the skb, then we should report that.  E.g.
> >>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> >>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> >>>>>
> >>>
> >>> What is the use case for such action list? Having an action branch
> >>> executed randomly doesn't make sense to me if it's not some
> >>> observability thing (which IMHO should not trigger drops).
> >>
> >> It is exactly my point.  A list of actions that doesn't end is some sort
> >> of a terminal action (output, drop, etc) does not make a lot of sense and
> >> hence should be signaled as an unexpected drop, so users can re-check the
> >> pipeline in case they missed the terminal action somehow.
> >>
> >>>
> >>>>> The only actions that are actually consuming the skb are "output",
> >>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> >>>>> consuming the skb "naturally" by stealing it when it is the last action.
> >>>>> "userspace" has an explicit check to consume the skb if it is the last
> >>>>> action.  "emit_sample" should have the similar check.  It should likely
> >>>>> be added at the point of action introduction instead of having a separate
> >>>>> patch.
> >>>>>
> >>>
> >>> Unlinke "output", "recirc", "userspace", etc. with emit_sample the
> >>> packet does not continue it's way through the datapath.
> >>
> >> After "output" the packet leaves the datapath too, i.e. does not continue
> >> it's way through OVS datapath.
> >>
> >
> > I meant a broader concept of "datapath". The packet continues. For the
> > userspace action this is true only for the CONTROLLER ofp action but
> > since the datapath does not know which action it's implementing, we
> > cannot do better.
>
> It's not only controller() action.  Packets can be brought to userspace
> for various reason including just an explicit ask to execute some actions
> in userspace.  In any case the packet sent to userspace kind of reached its
> destination and it's not the "datapath drops the packet" situation.
>
> >
> >>>
> >>> It would be very confusing if OVS starts monitoring drops and adds a bunch
> >>> of flows such as "actions:emit_sample()" and suddently it stops reporting such
> >>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
> >>> we are just observing them.
> >>
> >> This might make sense from the higher logic in user space application, but
> >> it doesn't from the datapath perspective.  And also, if the user adds the
> >> 'emit_sample' action for drop monitring, they already know where to find
> >> packet samples, they don't need to use tools like dropwatch anymore.
> >> This packet is not dropped from the datapath perspective, it is sampled.
> >>
> >>>
> >>> And if we change emit_sample to trigger a drop if it's the last action,
> >>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
> >>> which is also terribly confusing.
> >>
> >> If emit_sample is the last action, then skb should be consumed silently.
> >> The same as for "output" and "userspace".
> >>
> >>>
> >>> I think we should try to be clear and informative with what we
> >>> _actually_ drop and not require the user that is just running
> >>> "dropwatch" to understand the internals of the OVS module.
> >>
> >> If someone is already using sampling to watch their packet drops, why would
> >> they use dropwatch?
> >>
> >>>
> >>> So if you don't want to accept the "observational" nature of sample(),
> >>> the only other solution that does not bring even more confusion to OVS
> >>> drops would be to have userspace add explicit drop actions. WDYT?
> >>>
> >>
> >> These are not drops from the datapath perspective.  Users can add explicit
> >> drop actions if they want to, but I'm really not sure why they would do that
> >> if they are already capturing all these packets in psample, sFlow or IPFIX.
> >
> > Because there is not a single "user". Tools and systems can be built on
> > top of tracepoints and samples and they might not be coordinated between
> > them. Some observability application can be always enabled and doing
> > constant network monitoring or statistics while other lower level tools
> > can be run at certain moments to troubleshoot issues.
> >
> > In order to run dropwatch in a node you don't need to have rights to
> > access the OpenFlow controller and ask it to change the OpenFlow rules
> > or else dropwatch simply will not show actual packet drops.
>
> The point is that these are not drops in this scenario.  The packet was
> delivered to its destination and hence should not be reported as dropped.
> In the observability use-case that you're describing even OpenFlow layer
> in OVS doesn't know if these supposed to be treated as packet drops for
> the user or if these are just samples with the sampling being the only
> intended destination.  For OpenFlow and OVS userspace components these
> two scenarios are indistinguishable.  Only the OpenFlow controller knows
> that these rules were put in place because it was an ACL created by some
> user or tool.  And since OVS in user space can't make such a distinction,
> kernel can't make it either, and so shouldn't guess what the user two
> levels of abstraction higher up meant.
>
> >
> > To me it seems obvious that drop sampling (via emit_sample) "includes"
> > drop reporting via emit_sample. In both cases you get the packet
> > headers, but in one case you also get OFP controller metadata. Now even
> > if there is a system that uses both, does it make sense to push to them
> > the responsibility of dealing with them being mutually exclusive?
> >
> > I think this makes debugging OVS datapath unnecessarily obscure when we
> > know the packet is actually being dropped intentionally by OVS.
>
> I don't think we know that we're in a drop sampling scenario.  We don't
> have enough information even in OVS userspace to tell.
>
> And having different behavior between "userspace" and "emit_sample" in
> the kernel may cause even more confusion, because now two ways of sampling
> packets will result in packets showing up in dropwatch in one case, but
> not in the other.
>
> >
> > What's the problem with having OVS write the following?
> >     "sample(50%, emit_sample()),drop(0)"
>
> It's a valid sequence of actions, but we shouldn't guess what the end
> user meant by putting those actions into the kernel.  If we see such a
> sequence in the kernel, then we should report an explicit drop.  If
> there was only the "sample(50%, emit_sample())" then we should simply
> consume the skb as it reached its destination in the psample.
>
> For the question if OVS in user space should put explicit drop action
> while preparing to emit sample, this doesn't sound reasonable for the
> same reason - OVS in user space doesn't know what the intention was of
> the user or tool that put the sampling action into OpenFlow pipeline.
>

I don't see it that way. The spec says that packets whose action sets
(the result of classification) have no output action and no group action
must be dropped. Even if OFP sample action is an extension, I don't see
it invalidating those semantics.
So, IMHO, OVS does know that a flow that is just sampled is a drop.

> I actually became more confused about what are we arguing about.
> To recap:
>
>                                      This patch     My proposal
>
> 1. emit_sample() is the last            consume        consume
>     inside the sample()
>
> 2. the end of the action list           consume        drop
>     inside the sample()
>
> 3. emit_sample() is the last            drop           consume
>     outside the sample()
>
> 4. the end of the action list           drop           drop
>     outside the sample()
>
> 5. sample() is the last action          consume        consume
>     and probability failed
>
>
> I don't think cases 1 and 3 should differ, i.e. the behavior should
> be the same regardless of emit_sample() being inside or outside of
> the sample().  As a side point, OVS in user space will omit the 100%
> rate sample() action and will just list inner actions instead.  This
> means that 100% probability sampling will generate drops and 99% will
> not.  Doesn't sound right.
>

That's what I was referring to in the commit message, we still need OVS to
write:
    actions:sample(..,emit_sample(..)),drop

> Case 2 should likely never happen, but I'd like to see a drop reported
> if that ever happens, because it is not a meaningful list of actions.
>
> Best regards, Ilya Maximets.
>

I think we could drop this patch if we agree that OVS could write
explicit drops when it knows the packet is being dropped and sampled
(the action only has OFP sample actions).

The drop could be placed inside the odp sample action to avoid
breaking the clone optimization:
    actions:sample(50%, actions(emit_sample(),drop))

or outside if the sample itself is optimized out:
    actions:emit_sample(),drop

IIUC, if we don't do that, we are saying that sampling is incompatible
with decent drop reporting via kfree_skb infrastructure used by tools
like dropwatch or retis (among many others). And I think that is
unnecessarily and deliberately making OVS datapath more difficult to
troubleshoot.

Thanks,
Adrián
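
For illustration, the effect of appending an explicit drop can be compared
with a toy userspace classifier.  This is a sketch under assumptions, not
kernel code: it follows the counter-proposal quoted above (output,
userspace, recirc and emit_sample consume the skb when last, an explicit
drop is reported as a drop, anything else is a last-action drop), it only
handles flat action lists with no nested sample(), and all names
(action, outcome, classify_end, classify_self_check) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A few datapath actions, reduced to what matters for the last-action check. */
enum action { ACT_OUTPUT, ACT_USERSPACE, ACT_RECIRC, ACT_EMIT_SAMPLE,
	      ACT_PUSH_VLAN, ACT_DROP };

/* What a drop-monitoring tool would observe at the end of the list. */
enum outcome { SKB_CONSUMED, SKB_EXPLICIT_DROP, SKB_LAST_ACTION_DROP };

/* Classify the end of a flat action list by looking at its last action. */
static enum outcome classify_end(const enum action *acts, size_t n)
{
	if (n == 0)
		return SKB_LAST_ACTION_DROP; /* empty list: implicit drop */

	switch (acts[n - 1]) {
	case ACT_DROP:
		return SKB_EXPLICIT_DROP;
	case ACT_OUTPUT:
	case ACT_USERSPACE:
	case ACT_RECIRC:
	case ACT_EMIT_SAMPLE:
		return SKB_CONSUMED; /* packet reached a destination */
	default:
		return SKB_LAST_ACTION_DROP; /* no terminal action */
	}
}

/* Exercises the flows discussed in this thread. */
static void classify_self_check(void)
{
	const enum action sampled_only[] = { ACT_EMIT_SAMPLE };
	const enum action sampled_then_drop[] = { ACT_EMIT_SAMPLE, ACT_DROP };
	const enum action no_terminal[] = { ACT_EMIT_SAMPLE, ACT_PUSH_VLAN };

	assert(classify_end(sampled_only, 1) == SKB_CONSUMED);
	assert(classify_end(sampled_then_drop, 2) == SKB_EXPLICIT_DROP);
	assert(classify_end(no_terminal, 2) == SKB_LAST_ACTION_DROP);
	assert(classify_end(NULL, 0) == SKB_LAST_ACTION_DROP);
}
```

Under these assumptions "emit_sample(),drop" is visible to drop monitoring
while a bare "emit_sample()" is not, so the outcome depends entirely on
whether userspace chooses to add the drop.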


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-19  6:35               ` Adrián Moreno
@ 2024-06-19 18:21                 ` Ilya Maximets
  2024-06-19 20:40                   ` Adrián Moreno
  0 siblings, 1 reply; 57+ messages in thread
From: Ilya Maximets @ 2024-06-19 18:21 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: i.maximets, netdev, aconole, echaudro, horms, dev,
	Pravin B Shelar, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

On 6/19/24 08:35, Adrián Moreno wrote:
> On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
>> On 6/18/24 12:50, Adrián Moreno wrote:
>>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
>>>> On 6/18/24 09:00, Adrián Moreno wrote:
>>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>>>>>> On 6/17/24 13:55, Ilya Maximets wrote:
>>>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>>>>>> observability-oriented.
>>>>>>>>
>>>>>>>> Apart from some corner case in which it's used a replacement of clone()
>>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>>>>>> local emit_sample.
>>>>>>>>
>>>>>>>> With this in mind, it doesn't make much sense to report
>>>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>>>>>
>>>>>>>> For instance, if the flow:
>>>>>>>>
>>>>>>>>   actions:sample(..,emit_sample(..)),2
>>>>>>>>
>>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>>>>>> confusing for users since the packet did reach its destination.
>>>>>>>>
>>>>>>>> This patch makes internal action execution silently consume the skb
>>>>>>>> instead of notifying a drop for this case.
>>>>>>>>
>>>>>>>> Unfortunately, this patch does not remove all potential sources of
>>>>>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>>>>>
>>>>>>>>     actions:sample(..,emit_sample(..))
>>>>>>>>
>>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>>>>>>>
>>>>>>>> Sadly, this case is difficult to solve without breaking the
>>>>>>>> optimization by which the skb is not cloned on last sample actions.
>>>>>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>>>>>> after the last sample() and rewrite the flow as:
>>>>>>>>
>>>>>>>>     actions:sample(..,emit_sample(..)),drop
>>>>>>>>
>>>>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>>>>>> ---
>>>>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
>>>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>>>>>> --- a/net/openvswitch/actions.c
>>>>>>>> +++ b/net/openvswitch/actions.c
>>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>>>>>
>>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>>>>>> +{
>>>>>>>> +	/* Do not emit packet drops inside sample(). */
>>>>>>>> +	if (OVS_CB(skb)->probability)
>>>>>>>> +		consume_skb(skb);
>>>>>>>> +	else
>>>>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>>>>>   * space. Return NULL if out of key spaces.
>>>>>>>>   */
>>>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>>>>>>  	if ((arg->probability != U32_MAX) &&
>>>>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>>>>>>>  		if (last)
>>>>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>> +			ovs_drop_skb_last_action(skb);
>>>>>>
>>>>>> Always consuming the skb at this point makes sense, since having smaple()
>>>>>> as a last action is a reasonable thing to have.  But this looks more like
>>>>>> a fix for the original drop reason patch set.
>>>>>>
>>>>>
>>>>> I don't think consuming the skb at this point makes sense. It was very
>>>>> intentionally changed to a drop since a very common use-case for
>>>>> sampling is drop-sampling, i.e: replacing an empty action list (that
>>>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
>>>>> that replacement should not have any effect on the number of
>>>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
>>>>> the same way (only observed in one case).
>>>>>
>>>>>
>>>>>>>>  		return 0;
>>>>>>>>  	}
>>>>>>>>
>>>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>>>>>>>  		}
>>>>>>>>  	}
>>>>>>>>
>>>>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>> +	ovs_drop_skb_last_action(skb);
>>>>>>>
>>>>>>> I don't think I agree with this one.  If we have a sample() action with
>>>>>>> a lot of different actions inside and we reached the end while the last
>>>>>>> action didn't consume the skb, then we should report that.  E.g.
>>>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
>>>>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
>>>>>>>
>>>>>
>>>>> What is the use case for such action list? Having an action branch
>>>>> executed randomly doesn't make sense to me if it's not some
>>>>> observability thing (which IMHO should not trigger drops).
>>>>
>>>> It is exactly my point.  A list of actions that doesn't end is some sort
>>>> of a terminal action (output, drop, etc) does not make a lot of sense and
>>>> hence should be signaled as an unexpected drop, so users can re-check the
>>>> pipeline in case they missed the terminal action somehow.
>>>>
>>>>>
>>>>>>> The only actions that are actually consuming the skb are "output",
>>>>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
>>>>>>> consuming the skb "naturally" by stealing it when it is the last action.
>>>>>>> "userspace" has an explicit check to consume the skb if it is the last
>>>>>>> action.  "emit_sample" should have the similar check.  It should likely
>>>>>>> be added at the point of action introduction instead of having a separate
>>>>>>> patch.
>>>>>>>
>>>>>
>>>>> Unlinke "output", "recirc", "userspace", etc. with emit_sample the
>>>>> packet does not continue it's way through the datapath.
>>>>
>>>> After "output" the packet leaves the datapath too, i.e. does not continue
>>>> it's way through OVS datapath.
>>>>
>>>
>>> I meant a broader concept of "datapath". The packet continues. For the
>>> userspace action this is true only for the CONTROLLER ofp action but
>>> since the datapath does not know which action it's implementing, we
>>> cannot do better.
>>
>> It's not only controller() action.  Packets can be brought to userspace
>> for various reason including just an explicit ask to execute some actions
>> in userspace.  In any case the packet sent to userspace kind of reached its
>> destination and it's not the "datapath drops the packet" situation.
>>
>>>
>>>>>
>>>>> It would be very confusing if OVS starts monitoring drops and adds a bunch
>>>>> of flows such as "actions:emit_sample()" and suddently it stops reporting such
>>>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
>>>>> we are just observing them.
>>>>
>>>> This might make sense from the higher logic in user space application, but
>>>> it doesn't from the datapath perspective.  And also, if the user adds the
>>>> 'emit_sample' action for drop monitring, they already know where to find
>>>> packet samples, they don't need to use tools like dropwatch anymore.
>>>> This packet is not dropped from the datapath perspective, it is sampled.
>>>>
>>>>>
>>>>> And if we change emit_sample to trigger a drop if it's the last action,
>>>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
>>>>> which is also terribly confusing.
>>>>
>>>> If emit_sample is the last action, then skb should be consumed silently.
>>>> The same as for "output" and "userspace".
>>>>
>>>>>
>>>>> I think we should try to be clear and informative with what we
>>>>> _actually_ drop and not require the user that is just running
>>>>> "dropwatch" to understand the internals of the OVS module.
>>>>
>>>> If someone is already using sampling to watch their packet drops, why would
>>>> they use dropwatch?
>>>>
>>>>>
>>>>> So if you don't want to accept the "observational" nature of sample(),
>>>>> the only other solution that does not bring even more confusion to OVS
>>>>> drops would be to have userspace add explicit drop actions. WDYT?
>>>>>
>>>>
>>>> These are not drops from the datapath perspective.  Users can add explicit
>>>> drop actions if they want to, but I'm really not sure why they would do that
>>>> if they are already capturing all these packets in psample, sFlow or IPFIX.
>>>
>>> Because there is not a single "user". Tools and systems can be built on
>>> top of tracepoints and samples and they might not be coordinated between
>>> them. Some observability application can be always enabled and doing
>>> constant network monitoring or statistics while other lower level tools
>>> can be run at certain moments to troubleshoot issues.
>>>
>>> In order to run dropwatch in a node you don't need to have rights to
>>> access the OpenFlow controller and ask it to change the OpenFlow rules
>>> or else dropwatch simply will not show actual packet drops.
>>
>> The point is that these are not drops in this scenario.  The packet was
>> delivered to its destination and hence should not be reported as dropped.
>> In the observability use-case that you're describing even OpenFlow layer
>> in OVS doesn't know if these supposed to be treated as packet drops for
>> the user or if these are just samples with the sampling being the only
>> intended destination.  For OpenFlow and OVS userspace components these
>> two scenarios are indistinguishable.  Only the OpenFlow controller knows
>> that these rules were put in place because it was an ACL created by some
>> user or tool.  And since OVS in user space can't make such a distinction,
>> kernel can't make it either, and so shouldn't guess what the user two
>> levels of abstraction higher up meant.
>>
>>>
>>> To me it seems obvious that drop sampling (via emit_sample) "includes"
>>> drop reporting via emit_sample. In both cases you get the packet
>>> headers, but in one case you also get OFP controller metadata. Now even
>>> if there is a system that uses both, does it make sense to push to them
>>> the responsibility of dealing with them being mutually exclusive?
>>>
>>> I think this makes debugging OVS datapath unnecessarily obscure when we
>>> know the packet is actually being dropped intentionally by OVS.
>>
>> I don't think we know that we're in a drop sampling scenario.  We don't
>> have enough information even in OVS userspace to tell.
>>
>> And having different behavior between "userspace" and "emit_sample" in
>> the kernel may cause even more confusion, because now two ways of sampling
>> packets will result in packets showing up in dropwatch in one case, but
>> not in the other.
>>
>>>
>>> What's the problem with having OVS write the following?
>>>     "sample(50%, emit_sample()),drop(0)"
>>
>> It's a valid sequence of actions, but we shouldn't guess what the end
>> user meant by putting those actions into the kernel.  If we see such a
>> sequence in the kernel, then we should report an explicit drop.  If
>> there was only the "sample(50%, emit_sample())" then we should simply
>> consume the skb as it reached its destination in the psample.
>>
>> For the question if OVS in user space should put explicit drop action
>> while preparing to emit sample, this doesn't sound reasonable for the
>> same reason - OVS in user space doesn't know what the intention was of
>> the user or tool that put the sampling action into OpenFlow pipeline.
>>
> 
> I don't see it that way. The spec says that packets whose action sets
> (the result of classification) have no output action and no group action
> must be dropped. Even if OFP sample action is an extension, I don't see
> it invalidating that semantics.
> So, IMHO, OVS does know that a flow that is just sampled is a drop.

This applies to "action sets", but most users are actually using "action
lists" supplied via the "Apply-actions" OF instruction and the action sets
always remain empty.  So, from the OF perspective, strictly speaking, we
are dropping every single packet.  So, this is not a good analogy.

> 
>> I actually became more confused about what are we arguing about.
>> To recap:
>>
>>                                      This patch     My proposal
>>
>> 1. emit_sample() is the last            consume        consume
>>     inside the sample()
>>
>> 2. the end of the action list           consume        drop
>>     inside the sample()
>>
>> 3. emit_sample() is the last            drop           consume
>>     outside the sample()
>>
>> 4. the end of the action list           drop           drop
>>     outside the sample()
>>
>> 5. sample() is the last action          consume        consume
>>     and probability failed
>>
>>
>> I don't think cases 1 and 3 should differ, i.e. the behavior should
>> be the same regardless of emit_sample() being inside or outside of
>> the sample().  As a side point, OVS in user space will omit the 100%
>> rate sample() action and will just list inner actions instead.  This
>> means that 100% probability sampling will generate drops and 99% will
>> not.  Doesn't sound right.
>>
> 
> That's what I was referring to in the commit message, we still need OVS to
> write:
>     actions:sample(..,emit_sample(..)),drop
> 
>> Case 2 should likely never happen, but I'd like to see a drop reported
>> if that ever happens, because it is not a meaningful list of actions.
>>
>> Best regards, Ilya Maximets.
>>
> 
> I think we could drop this patch if we agree that OVS could write
> explicit drops when it knows the packet is being dropped and sampled
> (the action only has OFP sample actions).
> 
> The drop could be placed inside the odp sample action to avoid
> breaking the clone optimization:
>     actions:sample(50%, actions(emit_sample(),drop))
> 
> or outside if the sample itself is optimized out:
>     actions:emit_sample(),drop
> 
> IIUC, if we don't do that, we are saying that sampling is incompatible
> with decent drop reporting via kfree_skb infrastructure used by tools
> like dropwatch or retis (among many others). And I think that is
> unnecessarily and deliberately making OVS datapath more difficult to
> troubleshoot.

This makes some sense, so let's ensure that semantics is consistent
within the kernel and discuss how to make the tools happy from the
user space perspective.

But we shouldn't simply drop this patch, we still need to consume the
skb after emit_sample() when it is the last action.  The same as we
do for the userspace() action.  Though it should be done at the point
of the action introduction.  Having both actions consistent will allow
us to solve the observability problem for both in the same way by
adding explicit drop actions from user space.
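
To make the intended semantics concrete, here is a rough user-space
sketch in C.  It is not kernel code: every name in it (model_action,
model_fate, model_list_fate, ...) is hypothetical and only models the
rule discussed above, i.e. "output", "userspace", "recirc" and
"emit_sample" consume the skb when they are the last action, an explicit
drop is reported as an explicit drop, and anything else that reaches the
end of the list is reported as a last-action drop.  Treating an explicit
drop as terminal wherever it appears is an assumption of the model.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of datapath actions; not the kernel's enum. */
enum model_action {
	ACT_OUTPUT,
	ACT_USERSPACE,
	ACT_RECIRC,
	ACT_EMIT_SAMPLE,
	ACT_PUSH_VLAN,	/* stands in for any non-consuming action */
	ACT_DROP,	/* explicit drop action */
};

/* What happens to the skb once the action list has been executed. */
enum model_fate {
	FATE_CONSUMED,		/* reached a destination, no drop event */
	FATE_EXPLICIT_DROP,	/* explicit drop action was executed */
	FATE_DROP_LAST_ACTION,	/* list ended without a consumer */
};

/* Actions that take ownership of the skb when they are the last one. */
static int action_consumes_when_last(enum model_action a)
{
	return a == ACT_OUTPUT || a == ACT_USERSPACE ||
	       a == ACT_RECIRC || a == ACT_EMIT_SAMPLE;
}

/* Walk the list and decide how the skb would be released. */
static enum model_fate model_list_fate(const enum model_action *acts,
				       size_t n)
{
	size_t i;

	assert(acts != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (acts[i] == ACT_DROP)
			return FATE_EXPLICIT_DROP;
		if (i == n - 1 && action_consumes_when_last(acts[i]))
			return FATE_CONSUMED;
	}

	/* Empty list, or the last action did not consume the skb. */
	return FATE_DROP_LAST_ACTION;
}
```

With this model, a list containing only emit_sample() is consumed
silently, while "emit_sample(),drop" is reported as an explicit drop,
which is what would keep dropwatch-like tools informed.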

On a side note:
I wonder if probability-induced drop needs a separate reason... i.e.
it could have been consumed by emit_sample()/userspace() but wasn't.

Best regards, Ilya Maximets.

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-19 18:21                 ` Ilya Maximets
@ 2024-06-19 20:40                   ` Adrián Moreno
  2024-06-19 20:56                     ` Ilya Maximets
  0 siblings, 1 reply; 57+ messages in thread
From: Adrián Moreno @ 2024-06-19 20:40 UTC (permalink / raw)
  To: Ilya Maximets
  Cc: netdev, aconole, echaudro, horms, dev, Pravin B Shelar,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	linux-kernel

On Wed, Jun 19, 2024 at 08:21:02PM GMT, Ilya Maximets wrote:
> On 6/19/24 08:35, Adrián Moreno wrote:
> > On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
> >> On 6/18/24 12:50, Adrián Moreno wrote:
> >>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
> >>>> On 6/18/24 09:00, Adrián Moreno wrote:
> >>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
> >>>>>> On 6/17/24 13:55, Ilya Maximets wrote:
> >>>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
> >>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
> >>>>>>>> observability-oriented.
> >>>>>>>>
> >>>>>>>> Apart from some corner case in which it's used as a replacement of clone()
> >>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
> >>>>>>>> local emit_sample.
> >>>>>>>>
> >>>>>>>> With this in mind, it doesn't make much sense to report
> >>>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
> >>>>>>>>
> >>>>>>>> For instance, if the flow:
> >>>>>>>>
> >>>>>>>>   actions:sample(..,emit_sample(..)),2
> >>>>>>>>
> >>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
> >>>>>>>> confusing for users since the packet did reach its destination.
> >>>>>>>>
> >>>>>>>> This patch makes internal action execution silently consume the skb
> >>>>>>>> instead of notifying a drop for this case.
> >>>>>>>>
> >>>>>>>> Unfortunately, this patch does not remove all potential sources of
> >>>>>>>> confusion since, if the sample action itself is the last action, e.g:
> >>>>>>>>
> >>>>>>>>     actions:sample(..,emit_sample(..))
> >>>>>>>>
> >>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
> >>>>>>>>
> >>>>>>>> Sadly, this case is difficult to solve without breaking the
> >>>>>>>> optimization by which the skb is not cloned on last sample actions.
> >>>>>>>> But, given explicit drop actions are now supported, OVS can just add one
> >>>>>>>> after the last sample() and rewrite the flow as:
> >>>>>>>>
> >>>>>>>>     actions:sample(..,emit_sample(..)),drop
> >>>>>>>>
> >>>>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> >>>>>>>> ---
> >>>>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
> >>>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> >>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
> >>>>>>>> --- a/net/openvswitch/actions.c
> >>>>>>>> +++ b/net/openvswitch/actions.c
> >>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
> >>>>>>>>  static struct action_flow_keys __percpu *flow_keys;
> >>>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
> >>>>>>>>
> >>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
> >>>>>>>> +{
> >>>>>>>> +	/* Do not emit packet drops inside sample(). */
> >>>>>>>> +	if (OVS_CB(skb)->probability)
> >>>>>>>> +		consume_skb(skb);
> >>>>>>>> +	else
> >>>>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>>>> +}
> >>>>>>>> +
> >>>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
> >>>>>>>>   * space. Return NULL if out of key spaces.
> >>>>>>>>   */
> >>>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
> >>>>>>>>  	if ((arg->probability != U32_MAX) &&
> >>>>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
> >>>>>>>>  		if (last)
> >>>>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>>>> +			ovs_drop_skb_last_action(skb);
> >>>>>>
> >>>>>> Always consuming the skb at this point makes sense, since having sample()
> >>>>>> as a last action is a reasonable thing to have.  But this looks more like
> >>>>>> a fix for the original drop reason patch set.
> >>>>>>
> >>>>>
> >>>>> I don't think consuming the skb at this point makes sense. It was very
> >>>>> intentionally changed to a drop since a very common use-case for
> >>>>> sampling is drop-sampling, i.e: replacing an empty action list (that
> >>>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
> >>>>> that replacement should not have any effect on the number of
> >>>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
> >>>>> the same way (only observed in one case).
> >>>>>
> >>>>>
> >>>>>>>>  		return 0;
> >>>>>>>>  	}
> >>>>>>>>
> >>>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>>>>>>>  		}
> >>>>>>>>  	}
> >>>>>>>>
> >>>>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
> >>>>>>>> +	ovs_drop_skb_last_action(skb);
> >>>>>>>
> >>>>>>> I don't think I agree with this one.  If we have a sample() action with
> >>>>>>> a lot of different actions inside and we reached the end while the last
> >>>>>>> action didn't consume the skb, then we should report that.  E.g.
> >>>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
> >>>>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
> >>>>>>>
> >>>>>
> >>>>> What is the use case for such action list? Having an action branch
> >>>>> executed randomly doesn't make sense to me if it's not some
> >>>>> observability thing (which IMHO should not trigger drops).
> >>>>
> >>>> It is exactly my point.  A list of actions that doesn't end in some sort
> >>>> of a terminal action (output, drop, etc) does not make a lot of sense and
> >>>> hence should be signaled as an unexpected drop, so users can re-check the
> >>>> pipeline in case they missed the terminal action somehow.
> >>>>
> >>>>>
> >>>>>>> The only actions that are actually consuming the skb are "output",
> >>>>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
> >>>>>>> consuming the skb "naturally" by stealing it when it is the last action.
> >>>>>>> "userspace" has an explicit check to consume the skb if it is the last
> >>>>>>> action.  "emit_sample" should have the similar check.  It should likely
> >>>>>>> be added at the point of action introduction instead of having a separate
> >>>>>>> patch.
> >>>>>>>
> >>>>>
> >>>>> Unlike "output", "recirc", "userspace", etc. with emit_sample the
> >>>>> packet does not continue its way through the datapath.
> >>>>
> >>>> After "output" the packet leaves the datapath too, i.e. does not continue
> >>>> its way through OVS datapath.
> >>>>
> >>>
> >>> I meant a broader concept of "datapath". The packet continues. For the
> >>> userspace action this is true only for the CONTROLLER ofp action but
> >>> since the datapath does not know which action it's implementing, we
> >>> cannot do better.
> >>
> >> It's not only controller() action.  Packets can be brought to userspace
> >> for various reasons including just an explicit ask to execute some actions
> >> in userspace.  In any case the packet sent to userspace kind of reached its
> >> destination and it's not the "datapath drops the packet" situation.
> >>
> >>>
> >>>>>
> >>>>> It would be very confusing if OVS starts monitoring drops and adds a bunch
> >>>>> of flows such as "actions:emit_sample()" and suddenly it stops reporting such
> >>>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
> >>>>> we are just observing them.
> >>>>
> >>>> This might make sense from the higher logic in user space application, but
> >>>> it doesn't from the datapath perspective.  And also, if the user adds the
> >>>> 'emit_sample' action for drop monitoring, they already know where to find
> >>>> packet samples, they don't need to use tools like dropwatch anymore.
> >>>> This packet is not dropped from the datapath perspective, it is sampled.
> >>>>
> >>>>>
> >>>>> And if we change emit_sample to trigger a drop if it's the last action,
> >>>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
> >>>>> which is also terribly confusing.
> >>>>
> >>>> If emit_sample is the last action, then skb should be consumed silently.
> >>>> The same as for "output" and "userspace".
> >>>>
> >>>>>
> >>>>> I think we should try to be clear and informative with what we
> >>>>> _actually_ drop and not require the user that is just running
> >>>>> "dropwatch" to understand the internals of the OVS module.
> >>>>
> >>>> If someone is already using sampling to watch their packet drops, why would
> >>>> they use dropwatch?
> >>>>
> >>>>>
> >>>>> So if you don't want to accept the "observational" nature of sample(),
> >>>>> the only other solution that does not bring even more confusion to OVS
> >>>>> drops would be to have userspace add explicit drop actions. WDYT?
> >>>>>
> >>>>
> >>>> These are not drops from the datapath perspective.  Users can add explicit
> >>>> drop actions if they want to, but I'm really not sure why they would do that
> >>>> if they are already capturing all these packets in psample, sFlow or IPFIX.
> >>>
> >>> Because there is not a single "user". Tools and systems can be built on
> >>> top of tracepoints and samples and they might not be coordinated between
> >>> them. Some observability application can be always enabled and doing
> >>> constant network monitoring or statistics while other lower level tools
> >>> can be run at certain moments to troubleshoot issues.
> >>>
> >>> In order to run dropwatch in a node you don't need to have rights to
> >>> access the OpenFlow controller and ask it to change the OpenFlow rules
> >>> or else dropwatch simply will not show actual packet drops.
> >>
> >> The point is that these are not drops in this scenario.  The packet was
> >> delivered to its destination and hence should not be reported as dropped.
> >> In the observability use-case that you're describing even OpenFlow layer
> >> in OVS doesn't know if these are supposed to be treated as packet drops for
> >> the user or if these are just samples with the sampling being the only
> >> intended destination.  For OpenFlow and OVS userspace components these
> >> two scenarios are indistinguishable.  Only the OpenFlow controller knows
> >> that these rules were put in place because it was an ACL created by some
> >> user or tool.  And since OVS in user space can't make such a distinction,
> >> kernel can't make it either, and so shouldn't guess what the user two
> >> levels of abstraction higher up meant.
> >>
> >>>
> >>> To me it seems obvious that drop sampling (via emit_sample) "includes"
> >>> drop reporting via emit_sample. In both cases you get the packet
> >>> headers, but in one case you also get OFP controller metadata. Now even
> >>> if there is a system that uses both, does it make sense to push to them
> >>> the responsibility of dealing with them being mutually exclusive?
> >>>
> >>> I think this makes debugging OVS datapath unnecessarily obscure when we
> >>> know the packet is actually being dropped intentionally by OVS.
> >>
> >> I don't think we know that we're in a drop sampling scenario.  We don't
> >> have enough information even in OVS userspace to tell.
> >>
> >> And having different behavior between "userspace" and "emit_sample" in
> >> the kernel may cause even more confusion, because now two ways of sampling
> >> packets will result in packets showing up in dropwatch in one case, but
> >> not in the other.
> >>
> >>>
> >>> What's the problem with having OVS write the following?
> >>>     "sample(50%, emit_sample()),drop(0)"
> >>
> >> It's a valid sequence of actions, but we shouldn't guess what the end
> >> user meant by putting those actions into the kernel.  If we see such a
> >> sequence in the kernel, then we should report an explicit drop.  If
> >> there was only the "sample(50%, emit_sample())" then we should simply
> >> consume the skb as it reached its destination in the psample.
> >>
> >> For the question if OVS in user space should put explicit drop action
> >> while preparing to emit sample, this doesn't sound reasonable for the
> >> same reason - OVS in user space doesn't know what the intention was of
> >> the user or tool that put the sampling action into OpenFlow pipeline.
> >>
> >
> > I don't see it that way. The spec says that packets whose action sets
> > (the result of classification) have no output action and no group action
> > must be dropped. Even if OFP sample action is an extension, I don't see
> > it invalidating that semantics.
> > So, IMHO, OVS does know that a flow that is just sampled is a drop.
>
> This applies to "action sets", but most users are actually using "action
> lists" supplied via "Apply-actions" OF instruction and the action sets
> always remain empty.  So, from the OF perspective, strictly speaking, we
> are dropping every single packet.  So, this is not a good analogy.
>
> >
> >> I actually became more confused about what are we arguing about.
> >> To recap:
> >>
> >>                                      This patch     My proposal
> >>
> >> 1. emit_sample() is the last            consume        consume
> >>     inside the sample()
> >>
> >> 2. the end of the action list           consume        drop
> >>     inside the sample()
> >>
> >> 3. emit_sample() is the last            drop           consume
> >>     outside the sample()
> >>
> >> 4. the end of the action list           drop           drop
> >>     outside the sample()
> >>
> >> 5. sample() is the last action          consume        consume
> >>     and probability failed
> >>
> >>
> >> I don't think cases 1 and 3 should differ, i.e. the behavior should
> >> be the same regardless of emit_sample() being inside or outside of
> >> the sample().  As a side point, OVS in user space will omit the 100%
> >> rate sample() action and will just list inner actions instead.  This
> >> means that 100% probability sampling will generate drops and 99% will
> >> not.  Doesn't sound right.
> >>
> >
> > That's what I was referring to in the commit message, we still need OVS to
> > write:
> >     actions:sample(..,emit_sample(..)),drop
> >
> >> Case 2 should likely never happen, but I'd like to see a drop reported
> >> if that ever happens, because it is not a meaningful list of actions.
> >>
> >> Best regards, Ilya Maximets.
> >>
> >
> > I think we could drop this patch if we agree that OVS could write
> > explicit drops when it knows the packet is being dropped and sampled
> > (the action only has OFP sample actions).
> >
> > The drop could be placed inside the odp sample action to avoid
> > breaking the clone optimization:
> >     actions:sample(50%, actions(emit_sample(),drop))
> >
> > or outside if the sample itself is optimized out:
> >     actions:emit_sample(),drop
> >
> > IIUC, if we don't do that, we are saying that sampling is incompatible
> > with decent drop reporting via kfree_skb infrastructure used by tools
> > like dropwatch or retis (among many others). And I think that is
> > unnecessarily and deliberately making OVS datapath more difficult to
> > troubleshoot.
>
> This makes some sense, so let's ensure that semantics is consistent
> within the kernel and discuss how to make the tools happy from the
> user space perspective.
>
> But we shouldn't simply drop this patch, we still need to consume the
> skb after emit_sample() when it is the last action.  The same as we
> do for the userspace() action.  Though it should be done at the point
> of the action introduction.  Having both actions consistent will allow
> us to solve the observability problem for both in the same way by
> adding explicit drop actions from user space.

OK. I'll resend the series dropping this patch (and consuming the skb
appropriately).

>
> On a side note:
> I wonder if probability-induced drop needs a separate reason... i.e.
> it could have been consumed by emit_sample()/userspace() but wasn't.
>

You mean in sample action "get_random_u32() > arg->probability"?
It only makes sense to drop it if it is the last action, so it currently
uses OVS_DROP_LAST_ACTION.
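
For reference, the check in question can be sketched in plain C as
below.  This is only an illustration of the condition quoted from
sample() in the patch: the helper names (sample_is_skipped,
skipped_sample_fate) are hypothetical, and the random value is passed
in as a parameter instead of calling get_random_u32() so the behaviour
is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Mirrors the condition in sample():
 *   probability == UINT32_MAX  -> always sample (100%)
 *   probability == 0           -> never sample
 *   otherwise                  -> skip when rnd > probability
 * 'rnd' stands in for the value returned by get_random_u32().
 */
static bool sample_is_skipped(uint32_t probability, uint32_t rnd)
{
	return probability != UINT32_MAX &&
	       (probability == 0 || rnd > probability);
}

/* Hypothetical outcome of a skipped sample() action. */
enum skip_fate {
	SKIP_CONTINUE,		/* more actions follow, skb is kept */
	SKIP_DROP_LAST_ACTION,	/* sample() was last, skb is released */
};

/* Only a skipped sample() that is the last action releases the skb. */
static enum skip_fate skipped_sample_fate(bool last)
{
	return last ? SKIP_DROP_LAST_ACTION : SKIP_CONTINUE;
}
```

So a 50% sample (probability 0x80000000) is skipped for any random
value above 0x80000000, and the skb is only released at that point if
sample() was the last action in the list.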

> Best regards, Ilya Maximets.
>

Thanks for the great discussion.
Adrián


^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample
  2024-06-19 20:40                   ` Adrián Moreno
@ 2024-06-19 20:56                     ` Ilya Maximets
  0 siblings, 0 replies; 57+ messages in thread
From: Ilya Maximets @ 2024-06-19 20:56 UTC (permalink / raw)
  To: Adrián Moreno
  Cc: i.maximets, netdev, aconole, echaudro, horms, dev,
	Pravin B Shelar, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, linux-kernel

On 6/19/24 22:40, Adrián Moreno wrote:
> On Wed, Jun 19, 2024 at 08:21:02PM GMT, Ilya Maximets wrote:
>> On 6/19/24 08:35, Adrián Moreno wrote:
>>> On Tue, Jun 18, 2024 at 05:44:05PM GMT, Ilya Maximets wrote:
>>>> On 6/18/24 12:50, Adrián Moreno wrote:
>>>>> On Tue, Jun 18, 2024 at 12:22:23PM GMT, Ilya Maximets wrote:
>>>>>> On 6/18/24 09:00, Adrián Moreno wrote:
>>>>>>> On Mon, Jun 17, 2024 at 02:10:37PM GMT, Ilya Maximets wrote:
>>>>>>>> On 6/17/24 13:55, Ilya Maximets wrote:
>>>>>>>>> On 6/3/24 20:56, Adrian Moreno wrote:
>>>>>>>>>> The OVS_ACTION_ATTR_SAMPLE action is, in essence,
>>>>>>>>>> observability-oriented.
>>>>>>>>>>
>>>>>>>>>> Apart from some corner case in which it's used as a replacement of clone()
>>>>>>>>>> for old kernels, it's really only used for sFlow, IPFIX and now,
>>>>>>>>>> local emit_sample.
>>>>>>>>>>
>>>>>>>>>> With this in mind, it doesn't make much sense to report
>>>>>>>>>> OVS_DROP_LAST_ACTION inside sample actions.
>>>>>>>>>>
>>>>>>>>>> For instance, if the flow:
>>>>>>>>>>
>>>>>>>>>>   actions:sample(..,emit_sample(..)),2
>>>>>>>>>>
>>>>>>>>>> triggers a OVS_DROP_LAST_ACTION skb drop event, it would be extremely
>>>>>>>>>> confusing for users since the packet did reach its destination.
>>>>>>>>>>
>>>>>>>>>> This patch makes internal action execution silently consume the skb
>>>>>>>>>> instead of notifying a drop for this case.
>>>>>>>>>>
>>>>>>>>>> Unfortunately, this patch does not remove all potential sources of
>>>>>>>>>> confusion since, if the sample action itself is the last action, e.g:
>>>>>>>>>>
>>>>>>>>>>     actions:sample(..,emit_sample(..))
>>>>>>>>>>
>>>>>>>>>> we actually _should_ generate a OVS_DROP_LAST_ACTION event, but we aren't.
>>>>>>>>>>
>>>>>>>>>> Sadly, this case is difficult to solve without breaking the
>>>>>>>>>> optimization by which the skb is not cloned on last sample actions.
>>>>>>>>>> But, given explicit drop actions are now supported, OVS can just add one
>>>>>>>>>> after the last sample() and rewrite the flow as:
>>>>>>>>>>
>>>>>>>>>>     actions:sample(..,emit_sample(..)),drop
>>>>>>>>>>
>>>>>>>>>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>>>>>>>>>> ---
>>>>>>>>>>  net/openvswitch/actions.c | 13 +++++++++++--
>>>>>>>>>>  1 file changed, 11 insertions(+), 2 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
>>>>>>>>>> index 33f6d93ba5e4..54fc1abcff95 100644
>>>>>>>>>> --- a/net/openvswitch/actions.c
>>>>>>>>>> +++ b/net/openvswitch/actions.c
>>>>>>>>>> @@ -82,6 +82,15 @@ static struct action_fifo __percpu *action_fifos;
>>>>>>>>>>  static struct action_flow_keys __percpu *flow_keys;
>>>>>>>>>>  static DEFINE_PER_CPU(int, exec_actions_level);
>>>>>>>>>>
>>>>>>>>>> +static inline void ovs_drop_skb_last_action(struct sk_buff *skb)
>>>>>>>>>> +{
>>>>>>>>>> +	/* Do not emit packet drops inside sample(). */
>>>>>>>>>> +	if (OVS_CB(skb)->probability)
>>>>>>>>>> +		consume_skb(skb);
>>>>>>>>>> +	else
>>>>>>>>>> +		ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>>>> +}
>>>>>>>>>> +
>>>>>>>>>>  /* Make a clone of the 'key', using the pre-allocated percpu 'flow_keys'
>>>>>>>>>>   * space. Return NULL if out of key spaces.
>>>>>>>>>>   */
>>>>>>>>>> @@ -1061,7 +1070,7 @@ static int sample(struct datapath *dp, struct sk_buff *skb,
>>>>>>>>>>  	if ((arg->probability != U32_MAX) &&
>>>>>>>>>>  	    (!arg->probability || get_random_u32() > arg->probability)) {
>>>>>>>>>>  		if (last)
>>>>>>>>>> -			ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>>>> +			ovs_drop_skb_last_action(skb);
>>>>>>>>
>>>>>>>> Always consuming the skb at this point makes sense, since having sample()
>>>>>>>> as a last action is a reasonable thing to have.  But this looks more like
>>>>>>>> a fix for the original drop reason patch set.
>>>>>>>>
>>>>>>>
>>>>>>> I don't think consuming the skb at this point makes sense. It was very
>>>>>>> intentionally changed to a drop since a very common use-case for
>>>>>>> sampling is drop-sampling, i.e: replacing an empty action list (that
>>>>>>> triggers OVS_DROP_LAST_ACTION) with a sample(emit_sample()). Ideally,
>>>>>>> that replacement should not have any effect on the number of
>>>>>>> OVS_DROP_LAST_ACTION being reported as the packets are being treated in
>>>>>>> the same way (only observed in one case).
>>>>>>>
>>>>>>>
>>>>>>>>>>  		return 0;
>>>>>>>>>>  	}
>>>>>>>>>>
>>>>>>>>>> @@ -1579,7 +1588,7 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
>>>>>>>>>>  		}
>>>>>>>>>>  	}
>>>>>>>>>>
>>>>>>>>>> -	ovs_kfree_skb_reason(skb, OVS_DROP_LAST_ACTION);
>>>>>>>>>> +	ovs_drop_skb_last_action(skb);
>>>>>>>>>
>>>>>>>>> I don't think I agree with this one.  If we have a sample() action with
>>>>>>>>> a lot of different actions inside and we reached the end while the last
>>>>>>>>> action didn't consume the skb, then we should report that.  E.g.
>>>>>>>>> "sample(emit_sample(),push_vlan(),set(eth())),2"  should report that the
>>>>>>>>> cloned skb was dropped.  "sample(push_vlan(),emit_sample())" should not.
>>>>>>>>>
>>>>>>>
>>>>>>> What is the use case for such action list? Having an action branch
>>>>>>> executed randomly doesn't make sense to me if it's not some
>>>>>>> observability thing (which IMHO should not trigger drops).
>>>>>>
>>>>>> It is exactly my point.  A list of actions that doesn't end in some sort
>>>>>> of a terminal action (output, drop, etc) does not make a lot of sense and
>>>>>> hence should be signaled as an unexpected drop, so users can re-check the
>>>>>> pipeline in case they missed the terminal action somehow.
>>>>>>
>>>>>>>
>>>>>>>>> The only actions that are actually consuming the skb are "output",
>>>>>>>>> "userspace", "recirc" and now "emit_sample".  "output" and "recirc" are
>>>>>>>>> consuming the skb "naturally" by stealing it when it is the last action.
>>>>>>>>> "userspace" has an explicit check to consume the skb if it is the last
>>>>>>>>> action.  "emit_sample" should have the similar check.  It should likely
>>>>>>>>> be added at the point of action introduction instead of having a separate
>>>>>>>>> patch.
>>>>>>>>>
>>>>>>>
>>>>>>> Unlike "output", "recirc", "userspace", etc. with emit_sample the
>>>>>>> packet does not continue its way through the datapath.
>>>>>>
>>>>>> After "output" the packet leaves the datapath too, i.e. does not continue
>>>>>> its way through OVS datapath.
>>>>>>
>>>>>
>>>>> I meant a broader concept of "datapath". The packet continues. For the
>>>>> userspace action this is true only for the CONTROLLER ofp action but
>>>>> since the datapath does not know which action it's implementing, we
>>>>> cannot do better.
>>>>
>>>> It's not only controller() action.  Packets can be brought to userspace
>>>> for various reasons including just an explicit ask to execute some actions
>>>> in userspace.  In any case the packet sent to userspace kind of reached its
>>>> destination and it's not the "datapath drops the packet" situation.
>>>>
>>>>>
>>>>>>>
>>>>>>> It would be very confusing if OVS starts monitoring drops and adds a bunch
>>>>>>> of flows such as "actions:emit_sample()" and suddenly it stops reporting such
>>>>>>> drops via standard kfree_skb_reason. Packets _are_ being dropped here,
>>>>>>> we are just observing them.
>>>>>>
>>>>>> This might make sense from the higher logic in user space application, but
>>>>>> it doesn't from the datapath perspective.  And also, if the user adds the
>>>>>> 'emit_sample' action for drop monitoring, they already know where to find
>>>>>> packet samples, they don't need to use tools like dropwatch anymore.
>>>>>> This packet is not dropped from the datapath perspective, it is sampled.
>>>>>>
>>>>>>>
>>>>>>> And if we change emit_sample to trigger a drop if it's the last action,
>>>>>>> then "sample(50%, emit_sample()),2" will trigger a drop half of the times
>>>>>>> which is also terribly confusing.
>>>>>>
>>>>>> If emit_sample is the last action, then skb should be consumed silently.
>>>>>> The same as for "output" and "userspace".
>>>>>>
>>>>>>>
>>>>>>> I think we should try to be clear and informative with what we
>>>>>>> _actually_ drop and not require the user that is just running
>>>>>>> "dropwatch" to understand the internals of the OVS module.
>>>>>>
>>>>>> If someone is already using sampling to watch their packet drops, why would
>>>>>> they use dropwatch?
>>>>>>
>>>>>>>
>>>>>>> So if you don't want to accept the "observational" nature of sample(),
>>>>>>> the only other solution that does not bring even more confusion to OVS
>>>>>>> drops would be to have userspace add explicit drop actions. WDYT?
>>>>>>>
>>>>>>
>>>>>> These are not drops from the datapath perspective.  Users can add explicit
>>>>>> drop actions if they want to, but I'm really not sure why they would do that
>>>>>> if they are already capturing all these packets in psample, sFlow or IPFIX.
>>>>>
>>>>> Because there is not a single "user". Tools and systems can be built on
>>>>> top of tracepoints and samples and they might not be coordinated between
>>>>> them. Some observability application can be always enabled and doing
>>>>> constant network monitoring or statistics while other lower level tools
>>>>> can be run at certain moments to troubleshoot issues.
>>>>>
>>>>> In order to run dropwatch in a node you don't need to have rights to
>>>>> access the OpenFlow controller and ask it to change the OpenFlow rules
>>>>> or else dropwatch simply will not show actual packet drops.
>>>>
>>>> The point is that these are not drops in this scenario.  The packet was
>>>> delivered to its destination and hence should not be reported as dropped.
>>>> In the observability use-case that you're describing even OpenFlow layer
>>>> in OVS doesn't know if these are supposed to be treated as packet drops for
>>>> the user or if these are just samples with the sampling being the only
>>>> intended destination.  For OpenFlow and OVS userspace components these
>>>> two scenarios are indistinguishable.  Only the OpenFlow controller knows
>>>> that these rules were put in place because it was an ACL created by some
>>>> user or tool.  And since OVS in user space can't make such a distinction,
>>>> kernel can't make it either, and so shouldn't guess what the user two
>>>> levels of abstraction higher up meant.
>>>>
>>>>>
>>>>> To me it seems obvious that drop sampling (via emit_sample) "includes"
>>>>> drop reporting via emit_sample. In both cases you get the packet
>>>>> headers, but in one case you also get OFP controller metadata. Now even
>>>>> if there is a system that uses both, does it make sense to push to them
>>>>> the responsibility of dealing with them being mutually exclusive?
>>>>>
>>>>> I think this makes debugging OVS datapath unnecessarily obscure when we
>>>>> know the packet is actually being dropped intentionally by OVS.
>>>>
>>>> I don't think we know that we're in a drop sampling scenario.  We don't
>>>> have enough information even in OVS userspace to tell.
>>>>
>>>> And having different behavior between "userspace" and "emit_sample" in
>>>> the kernel may cause even more confusion, because now two ways of sampling
>>>> packets will result in packets showing up in dropwatch in one case, but
>>>> not in the other.
>>>>
>>>>>
>>>>> What's the problem with having OVS write the following?
>>>>>     "sample(50%, emit_sample()),drop(0)"
>>>>
>>>> It's a valid sequence of actions, but we shouldn't guess what the end
>>>> user meant by putting those actions into the kernel.  If we see such a
>>>> sequence in the kernel, then we should report an explicit drop.  If
>>>> there was only the "sample(50%, emit_sample())" then we should simply
>>>> consume the skb as it reached its destination in the psample.
>>>>
>>>> For the question if OVS in user space should put explicit drop action
>>>> while preparing to emit sample, this doesn't sound reasonable for the
>>>> same reason - OVS in user space doesn't know what the intention was of
>>>> the user or tool that put the sampling action into OpenFlow pipeline.
>>>>
>>>
>>> I don't see it that way. The spec says that packets whose action sets
>>> (the result of classification) have no output action and no group action
>>> must be dropped. Even if OFP sample action is an extension, I don't see
>>> it invalidating that semantics.
>>> So, IMHO, OVS does know that a flow that is just sampled is a drop.
>>
>> This applies to "action sets", but most users are actually using "action
>> lists" supplied via "Apply-actions" OF instruction and the action sets
>> always remain empty.  So, from the OF perspective, strictly speaking, we
>> are dropping every single packet.  So, this is not a good analogy.
>>
>>>
>>>> I actually became more confused about what we are arguing about.
>>>> To recap:
>>>>
>>>>                                      This patch     My proposal
>>>>
>>>> 1. emit_sample() is the last            consume        consume
>>>>     inside the sample()
>>>>
>>>> 2. the end of the action list           consume        drop
>>>>     inside the sample()
>>>>
>>>> 3. emit_sample() is the last            drop           consume
>>>>     outside the sample()
>>>>
>>>> 4. the end of the action list           drop           drop
>>>>     outside the sample()
>>>>
>>>> 5. sample() is the last action          consume        consume
>>>>     and probability failed
>>>>
>>>>
>>>> I don't think cases 1 and 3 should differ, i.e. the behavior should
>>>> be the same regardless of emit_sample() being inside or outside of
>>>> the sample().  As a side point, OVS in user space will omit the 100%
>>>> rate sample() action and will just list inner actions instead.  This
>>>> means that 100% probability sampling will generate drops and 99% will
>>>> not.  Doesn't sound right.
>>>>
>>>
>>> That's what I was referring to in the commit message, we still need OVS to
>>> write:
>>>     actions:sample(..,emit_sample(..)),drop
>>>
>>>> Case 2 should likely never happen, but I'd like to see a drop reported
>>>> if that ever happens, because it is not a meaningful list of actions.
>>>>
>>>> Best regards, Ilya Maximets.
>>>>
>>>
>>> I think we could drop this patch if we agree that OVS could write
>>> explicit drops when it knows the packet is being dropped and sampled
>>> (the action only has OFP sample actions).
>>>
>>> The drop could be placed inside the odp sample action to avoid
>>> breaking the clone optimization:
>>>     actions:sample(50%, actions(emit_sample(),drop))
>>>
>>> or outside if the sample itself is optimized out:
>>>     actions:emit_sample(),drop
>>>
>>> IIUC, if we don't do that, we are saying that sampling is incompatible
>>> with decent drop reporting via kfree_skb infrastructure used by tools
>>> like dropwatch or retis (among many others). And I think that is
>>> unnecessarily and deliberately making OVS datapath more difficult to
>>> troubleshoot.
>>
>> This makes some sense, so let's ensure that semantics is consistent
>> within the kernel and discuss how to make the tools happy from the
>> user space perspective.
>>
>> But we shouldn't simply drop this patch, we still need to consume the
>> skb after emit_sample() when it is the last action.  The same as we
>> do for the userpsace() action.  Though it should be done at the point
>> of the action introduction.  Having both actions consistent will allow
>> us to solve the observability problem for both in the same way by
>> adding explicit drop actions from user space.
> 
> OK. I'll resend the series dropping this patch (and consuming the skb
> appropriately).

Thanks!
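
For reference, the recap table quoted above can be written down as a small
user-space C model.  This is only a sketch: struct list_end, enum skb_fate
and the function names are made up for illustration and are not kernel code.
It just encodes the two columns so the difference (cases 2 and 3) is easy to
see.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* What happens to the skb when the action list runs out. */
enum skb_fate { SKB_CONSUME, SKB_DROP };

/* Hypothetical description of how an action list ended. */
struct list_end {
	bool inside_sample;      /* the list is nested inside sample() */
	bool last_is_emit;       /* the final action is emit_sample() */
	bool probability_failed; /* sample() was last and its check failed */
};

/* "This patch" column: the outcome depends on being inside sample(). */
enum skb_fate fate_patch(struct list_end e)
{
	if (e.probability_failed)
		return SKB_CONSUME;                          /* case 5 */
	return e.inside_sample ? SKB_CONSUME : SKB_DROP; /* 1,2 vs 3,4 */
}

/* "My proposal" column: the outcome depends on the last action only. */
enum skb_fate fate_proposal(struct list_end e)
{
	if (e.probability_failed)
		return SKB_CONSUME;                          /* case 5 */
	return e.last_is_emit ? SKB_CONSUME : SKB_DROP;  /* 1,3 vs 2,4 */
}

/* Walk the five cases of the table and check both columns. */
void check_recap_table(void)
{
	static const struct {
		struct list_end end;
		enum skb_fate patch, proposal;
	} cases[] = {
		{ { true,  true,  false }, SKB_CONSUME, SKB_CONSUME }, /* 1 */
		{ { true,  false, false }, SKB_CONSUME, SKB_DROP    }, /* 2 */
		{ { false, true,  false }, SKB_DROP,    SKB_CONSUME }, /* 3 */
		{ { false, false, false }, SKB_DROP,    SKB_DROP    }, /* 4 */
		{ { false, false, true  }, SKB_CONSUME, SKB_CONSUME }, /* 5 */
	};

	for (size_t i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		assert(fate_patch(cases[i].end) == cases[i].patch);
		assert(fate_proposal(cases[i].end) == cases[i].proposal);
	}
}
```

With the second model, a 100% sample() that user space optimizes away
(leaving a bare emit_sample()) and a 99% one give the same result.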

> 
>>
>> On a side note:
>> I wonder if probability-induced drop needs a separate reason... i.e.
>> it could have been consumed by emit_sample()/userspace() but wasn't.
>>
> 
> You mean in sample action "get_random_u32() > arg->probability"?
> It only makes sense to drop it if it is the last action, so it currently
> uses OVS_DROP_LAST_ACTION.

Sure, but, for example:
  actions:sample(50%,userspace())
In 50% of cases we will consume the skb, in 50% we will report a LAST_ACTION
drop.  Looks a little inconsistent.  That's why I was saying that always
consuming on probability check failure is a sane option.  But if we have
  actions:sample(50%,userspace(),drop)
Then reporting a drop makes more sense.  So, I was thinking that maybe the
LAST_ACTION is just not the right drop reason to report.  e.g. something
like OVS_DROP_SAMPLE_PROBABILITY may be more appropriate to report in both
cases.

Anyways, this is only kind of related to this set and may be a separate
change if we decide it is needed.
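
To make the options concrete, here is a rough user-space model of what
happens when the sample() probability check fails.  All names are
hypothetical, including the dedicated drop reason, which does not exist
today.  The comparison follows the "get_random_u32() > arg->probability"
check quoted above and ignores any special handling the kernel may have for
boundary probability values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What happens to the skb at the sample() action (illustrative names). */
enum sample_result {
	RES_RUN_NESTED,              /* check passed: run nested actions */
	RES_CONTINUE,                /* check failed, later actions own the skb */
	RES_CONSUME,                 /* check failed, freed without a drop report */
	RES_DROP_LAST_ACTION,        /* check failed, LAST_ACTION drop reported */
	RES_DROP_SAMPLE_PROBABILITY, /* check failed, hypothetical new reason */
};

/* The three behaviours discussed for a failed check on a last sample(). */
enum fail_policy {
	POLICY_CURRENT,          /* report a LAST_ACTION drop */
	POLICY_ALWAYS_CONSUME,   /* never report a drop */
	POLICY_DEDICATED_REASON, /* report a probability-specific reason */
};

/* rnd stands in for get_random_u32(); last means sample() ends the list. */
enum sample_result sample_check(uint32_t rnd, uint32_t probability,
				bool last, enum fail_policy policy)
{
	if (rnd <= probability)
		return RES_RUN_NESTED;
	if (!last)
		return RES_CONTINUE;

	switch (policy) {
	case POLICY_ALWAYS_CONSUME:
		return RES_CONSUME;
	case POLICY_DEDICATED_REASON:
		return RES_DROP_SAMPLE_PROBABILITY;
	case POLICY_CURRENT:
	default:
		return RES_DROP_LAST_ACTION;
	}
}

/* A few spot checks of the model. */
void sample_check_selftest(void)
{
	assert(sample_check(1, 100, true, POLICY_CURRENT) == RES_RUN_NESTED);
	assert(sample_check(200, 100, true, POLICY_CURRENT) ==
	       RES_DROP_LAST_ACTION);
	assert(sample_check(200, 100, true, POLICY_DEDICATED_REASON) ==
	       RES_DROP_SAMPLE_PROBABILITY);
}
```

The model has no notion of what is nested inside sample(), so with the
dedicated reason both sample(50%,userspace()) and
sample(50%,userspace(),drop) report the same thing on a failed check.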

> 
>> Best regards, Ilya Maximets.
>>
> 
> Thanks for the great discussion.
> Adrián
> 



end of thread, other threads:[~2024-06-19 20:56 UTC | newest]

Thread overview: 57+ messages
2024-06-03 18:56 [PATCH net-next v2 0/9] net: openvswitch: Add sample multicasting Adrian Moreno
2024-06-03 18:56 ` [PATCH net-next v2 1/9] net: psample: add user cookie Adrian Moreno
2024-06-14 16:13   ` Simon Horman
2024-06-03 18:56 ` [PATCH net-next v2 2/9] net: sched: act_sample: add action cookie to sample Adrian Moreno
2024-06-14 16:14   ` Simon Horman
2024-06-17 10:00   ` Ilya Maximets
2024-06-18  7:38     ` Adrián Moreno
2024-06-18  9:42       ` Ilya Maximets
2024-06-03 18:56 ` [PATCH net-next v2 3/9] net: psample: skip packet copy if no listeners Adrian Moreno
2024-06-14 16:15   ` Simon Horman
2024-06-03 18:56 ` [PATCH net-next v2 4/9] net: psample: allow using rate as probability Adrian Moreno
2024-06-14 16:11   ` Simon Horman
2024-06-17  6:32     ` Adrián Moreno
2024-06-17 10:30       ` Simon Horman
2024-06-03 18:56 ` [PATCH net-next v2 5/9] net: openvswitch: add emit_sample action Adrian Moreno
2024-06-05  0:29   ` kernel test robot
2024-06-05 19:31     ` Adrián Moreno
2024-06-05 20:06       ` Simon Horman
2024-06-05 19:51   ` Simon Horman
2024-06-06  8:42     ` Adrián Moreno
2024-06-10 15:46   ` [ovs-dev] " Aaron Conole
2024-06-11  8:39     ` Adrián Moreno
2024-06-11 13:54       ` Aaron Conole
2024-06-11 15:42         ` Adrián Moreno
2024-06-14 16:13   ` Simon Horman
2024-06-17 10:44   ` Ilya Maximets
2024-06-18  7:33     ` Adrián Moreno
2024-06-18  9:47       ` Ilya Maximets
2024-06-18 10:08         ` Ilya Maximets
2024-06-03 18:56 ` [PATCH net-next v2 6/9] net: openvswitch: store sampling probability in cb Adrian Moreno
2024-06-04  6:09   ` kernel test robot
2024-06-04  8:49   ` kernel test robot
2024-06-05 19:34     ` Adrián Moreno
2024-06-14 16:55   ` Aaron Conole
2024-06-17  7:08     ` Adrián Moreno
2024-06-17 11:26       ` Ilya Maximets
2024-06-18  7:36         ` Adrián Moreno
2024-06-03 18:56 ` [PATCH net-next v2 7/9] net: openvswitch: do not notify drops inside sample Adrian Moreno
2024-06-14 16:17   ` Simon Horman
2024-06-17 11:55   ` Ilya Maximets
2024-06-17 12:10     ` Ilya Maximets
2024-06-18  7:00       ` Adrián Moreno
2024-06-18 10:22         ` Ilya Maximets
2024-06-18 10:50           ` Adrián Moreno
2024-06-18 15:44             ` Ilya Maximets
2024-06-19  6:35               ` Adrián Moreno
2024-06-19 18:21                 ` Ilya Maximets
2024-06-19 20:40                   ` Adrián Moreno
2024-06-19 20:56                     ` Ilya Maximets
2024-06-03 18:56 ` [PATCH net-next v2 8/9] selftests: openvswitch: add emit_sample action Adrian Moreno
2024-06-03 18:56 ` [PATCH net-next v2 9/9] selftests: openvswitch: add emit_sample test Adrian Moreno
2024-06-05 19:43   ` Simon Horman
2024-06-10  9:20     ` Adrián Moreno
2024-06-14 17:07   ` Aaron Conole
2024-06-17  7:18     ` Adrián Moreno
2024-06-18  9:08       ` Adrián Moreno
2024-06-18 13:27         ` Aaron Conole
