Netdev List
* [PATCH net-next v4 0/3] Add support for SO_PRIORITY cmsg
@ 2024-11-18 14:51 Anna Emese Nyiri
  2024-11-18 14:51 ` [PATCH net-next v4 1/3] sock: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Anna Emese Nyiri @ 2024-11-18 14:51 UTC (permalink / raw)
  To: netdev; +Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemb, idosch

Introduce a new helper function, `sk_set_prio_allowed`,
to centralize the logic for validating priority settings.
Add support for the `SO_PRIORITY` control message,
enabling user-space applications to set socket priority
via control messages (cmsg).

Patch Overview:

Patch 1/3: Introduce `sk_set_prio_allowed` helper function.
Patch 2/3: Add support for setting `SO_PRIORITY` via control messages.
Patch 3/3: Add test for setting `SO_PRIORITY` via control messages.

v4:

- Carried Eric's and Willem's "Reviewed-by" tags from v3 over to
  patch 1/3, since it is resubmitted without changes.
- Updated description in patch 2/3.
- Added the missing ipc6.sockc.priority initialization in
  ping_v6_sendmsg() in patch 2/3.
- Updated cmsg_so_priority.sh to test SO_PRIORITY sockopt and cmsg
  setting with VLAN priority tagging in patch 3/3. (Ido Schimmel)
- Rebased on net-next.

v3:

https://lore.kernel.org/netdev/20241107132231.9271-1-annaemesenyiri@gmail.com/
- Updated cover letter text.
- Removed priority field from ipcm_cookie.
- Removed cork->tos value check from ip_setup_cork, so
  cork->priority will now take its value from ipc->sockc.priority.
- Replaced ipc->priority with ipc->sockc.priority
  in ip_cmsg_send().
- Modified the error handling for the SO_PRIORITY
  case in __sock_cmsg_send().
- Added missing initialization for ipc6.sockc.priority.
- Introduced cmsg_so_priority.sh test script.
- Modified cmsg_sender.c to set priority via control message (cmsg).
- Rebased on net-next.

v2:

https://lore.kernel.org/netdev/20241102125136.5030-1-annaemesenyiri@gmail.com/
- Introduced sk_set_prio_allowed helper to check capability
  for setting priority.
- Removed new fields and changed sockcm_cookie::priority
  from char to u32 to align with sk_buff::priority.
- Moved the cork->tos value check for priority setting
  from __ip_make_skb() to ip_setup_cork().
- Rebased on net-next.

v1:

https://lore.kernel.org/all/20241029144142.31382-1-annaemesenyiri@gmail.com/

Anna Emese Nyiri (3):
  Introduce sk_set_prio_allowed helper function
  support SO_PRIORITY cmsg
  test SO_PRIORITY ancillary data with cmsg_sender

 include/net/inet_sock.h                       |   2 +-
 include/net/ip.h                              |   2 +-
 include/net/sock.h                            |   4 +-
 net/can/raw.c                                 |   2 +-
 net/core/sock.c                               |  18 ++-
 net/ipv4/ip_output.c                          |   4 +-
 net/ipv4/ip_sockglue.c                        |   2 +-
 net/ipv4/raw.c                                |   2 +-
 net/ipv6/ip6_output.c                         |   3 +-
 net/ipv6/ping.c                               |   1 +
 net/ipv6/raw.c                                |   3 +-
 net/ipv6/udp.c                                |   1 +
 net/packet/af_packet.c                        |   2 +-
 tools/testing/selftests/net/cmsg_sender.c     |  11 +-
 .../testing/selftests/net/cmsg_so_priority.sh | 147 ++++++++++++++++++
 15 files changed, 189 insertions(+), 15 deletions(-)
 create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh

-- 
2.43.0



* [PATCH net-next v4 1/3] sock: Introduce sk_set_prio_allowed helper function
  2024-11-18 14:51 [PATCH net-next v4 0/3] Add support for SO_PRIORITY cmsg Anna Emese Nyiri
@ 2024-11-18 14:51 ` Anna Emese Nyiri
  2024-11-18 14:51 ` [PATCH net-next v4 2/3] sock: support SO_PRIORITY cmsg Anna Emese Nyiri
  2024-11-18 14:51 ` [PATCH net-next v4 3/3] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri
  2 siblings, 0 replies; 8+ messages in thread
From: Anna Emese Nyiri @ 2024-11-18 14:51 UTC (permalink / raw)
  To: netdev; +Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemb, idosch

Simplify priority setting permissions with the 'sk_set_prio_allowed'
function, centralizing the validation logic. This change is made in
anticipation of a second caller in a following patch.
No functional changes.
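
As a rough illustration of the rule the helper encodes, here is a user-space
model of the check. The function name and the `has_cap` parameter are
assumptions made for this sketch: `has_cap` stands in for the two
sockopt_ns_capable() calls, and the bounds 0 and 6 are the literal values the
TC_PRIO_BESTEFFORT and TC_PRIO_INTERACTIVE constants replace in the diff.

```c
#include <stdbool.h>

/* Bounds the old code spelled as literals (0 and 6). */
#define MODEL_PRIO_MIN_UNPRIV	0	/* TC_PRIO_BESTEFFORT */
#define MODEL_PRIO_MAX_UNPRIV	6	/* TC_PRIO_INTERACTIVE */

/*
 * User-space model of the permission check, not the kernel function.
 * has_cap models "caller holds CAP_NET_RAW or CAP_NET_ADMIN in the
 * socket's user namespace".
 */
static bool prio_allowed_model(int val, bool has_cap)
{
	return (val >= MODEL_PRIO_MIN_UNPRIV &&
		val <= MODEL_PRIO_MAX_UNPRIV) || has_cap;
}
```

With this model, priority 6 is accepted for any caller, while 7 (or a negative
value) is accepted only when `has_cap` is true.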

Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
---
 net/core/sock.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 74729d20cd00..9016f984d44e 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -454,6 +454,13 @@ static int sock_set_timeout(long *timeo_p, sockptr_t optval, int optlen,
 	return 0;
 }
 
+static bool sk_set_prio_allowed(const struct sock *sk, int val)
+{
+	return ((val >= TC_PRIO_BESTEFFORT && val <= TC_PRIO_INTERACTIVE) ||
+		sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
+		sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN));
+}
+
 static bool sock_needs_netstamp(const struct sock *sk)
 {
 	switch (sk->sk_family) {
@@ -1193,9 +1200,7 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
 	/* handle options which do not require locking the socket. */
 	switch (optname) {
 	case SO_PRIORITY:
-		if ((val >= 0 && val <= 6) ||
-		    sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
-		    sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
+		if (sk_set_prio_allowed(sk, val)) {
 			sock_set_priority(sk, val);
 			return 0;
 		}
-- 
2.43.0



* [PATCH net-next v4 2/3] sock: support SO_PRIORITY cmsg
  2024-11-18 14:51 [PATCH net-next v4 0/3] Add support for SO_PRIORITY cmsg Anna Emese Nyiri
  2024-11-18 14:51 ` [PATCH net-next v4 1/3] sock: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri
@ 2024-11-18 14:51 ` Anna Emese Nyiri
  2024-11-18 20:20   ` Willem de Bruijn
  2024-11-18 14:51 ` [PATCH net-next v4 3/3] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri
  2 siblings, 1 reply; 8+ messages in thread
From: Anna Emese Nyiri @ 2024-11-18 14:51 UTC (permalink / raw)
  To: netdev; +Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemb, idosch

The Linux socket API currently allows setting SO_PRIORITY at the
socket level, applying a uniform priority to all packets sent through
that socket. The exception is IP_TOS: when it is passed as ancillary
data, the priority is derived from the TOS value, as implemented in
commit <f02db315b8d88>
("ipv4: IP_TOS and IP_TTL can be specified as ancillary data").
That priority is a computed value, however, and before this patch
there is no mechanism to set a custom priority via control messages.

With this patch, if SO_PRIORITY is specified as ancillary data,
the packet is sent with the priority value carried in
sockc->priority, overriding the socket-level value set via the
traditional setsockopt() method. This is analogous to the existing
support for SO_MARK, as implemented in commit
<c6af0c227a22> ("ip: support SO_MARK cmsg").

If both cmsg SO_PRIORITY and IP_TOS are passed, then the one that
takes precedence is the last one in the cmsg list.
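
The "last one wins" rule on the IPv4 path can be pictured with the toy model
below. Every name in it is hypothetical, and `toy_tos2prio()` is only a
placeholder for rt_tos2priority(), not the kernel's real TOS-to-priority
table. The point is the ordering: each control message overwrites the
priority left by the previous one, starting from the socket-level default.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_cmsg_kind { TOY_SO_PRIORITY, TOY_IP_TOS };

struct toy_cmsg {
	enum toy_cmsg_kind kind;
	uint32_t val;
};

/* Placeholder for rt_tos2priority(); NOT the kernel's mapping. */
static uint32_t toy_tos2prio(uint32_t tos)
{
	return (tos >> 5) & 0x7;
}

/* Walk the cmsg list in order; later entries overwrite earlier ones. */
static uint32_t toy_resolve_priority(uint32_t sk_priority,
				     const struct toy_cmsg *cmsgs, size_t n)
{
	uint32_t prio = sk_priority;	/* default taken from the socket */
	size_t i;

	for (i = 0; i < n; i++) {
		if (cmsgs[i].kind == TOY_SO_PRIORITY)
			prio = cmsgs[i].val;
		else
			prio = toy_tos2prio(cmsgs[i].val);
	}

	return prio;
}
```

Passing SO_PRIORITY followed by IP_TOS yields the TOS-derived value, and the
reverse order yields the explicit SO_PRIORITY value.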

This patch has the side effect that raw_send_hdrinc now interprets cmsg
IP_TOS.

Suggested-by: Ferenc Fejes <fejes@inf.elte.hu>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
---
 include/net/inet_sock.h | 2 +-
 include/net/ip.h        | 2 +-
 include/net/sock.h      | 4 +++-
 net/can/raw.c           | 2 +-
 net/core/sock.c         | 7 +++++++
 net/ipv4/ip_output.c    | 4 ++--
 net/ipv4/ip_sockglue.c  | 2 +-
 net/ipv4/raw.c          | 2 +-
 net/ipv6/ip6_output.c   | 3 ++-
 net/ipv6/ping.c         | 1 +
 net/ipv6/raw.c          | 3 ++-
 net/ipv6/udp.c          | 1 +
 net/packet/af_packet.c  | 2 +-
 13 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 56d8bc5593d3..3ccbad881d74 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -172,7 +172,7 @@ struct inet_cork {
 	u8			tx_flags;
 	__u8			ttl;
 	__s16			tos;
-	char			priority;
+	u32			priority;
 	__u16			gso_size;
 	u32			ts_opt_id;
 	u64			transmit_time;
diff --git a/include/net/ip.h b/include/net/ip.h
index 0e548c1f2a0e..9f5e33e371fc 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -81,7 +81,6 @@ struct ipcm_cookie {
 	__u8			protocol;
 	__u8			ttl;
 	__s16			tos;
-	char			priority;
 	__u16			gso_size;
 };
 
@@ -96,6 +95,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
 	ipcm_init(ipcm);
 
 	ipcm->sockc.mark = READ_ONCE(inet->sk.sk_mark);
+	ipcm->sockc.priority = READ_ONCE(inet->sk.sk_priority);
 	ipcm->sockc.tsflags = READ_ONCE(inet->sk.sk_tsflags);
 	ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
 	ipcm->addr = inet->inet_saddr;
diff --git a/include/net/sock.h b/include/net/sock.h
index 7464e9f9f47c..316a34d6c48b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1814,13 +1814,15 @@ struct sockcm_cookie {
 	u32 mark;
 	u32 tsflags;
 	u32 ts_opt_id;
+	u32 priority;
 };
 
 static inline void sockcm_init(struct sockcm_cookie *sockc,
 			       const struct sock *sk)
 {
 	*sockc = (struct sockcm_cookie) {
-		.tsflags = READ_ONCE(sk->sk_tsflags)
+		.tsflags = READ_ONCE(sk->sk_tsflags),
+		.priority = READ_ONCE(sk->sk_priority),
 	};
 }
 
diff --git a/net/can/raw.c b/net/can/raw.c
index 255c0a8f39d6..46e8ed9d64da 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -962,7 +962,7 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	}
 
 	skb->dev = dev;
-	skb->priority = READ_ONCE(sk->sk_priority);
+	skb->priority = sockc.priority;
 	skb->mark = READ_ONCE(sk->sk_mark);
 	skb->tstamp = sockc.transmit_time;
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 9016f984d44e..a3d9941c1d32 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2947,6 +2947,13 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
 	case SCM_RIGHTS:
 	case SCM_CREDENTIALS:
 		break;
+	case SO_PRIORITY:
+		if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
+			return -EINVAL;
+		if (!sk_set_prio_allowed(sk, *(u32 *)CMSG_DATA(cmsg)))
+			return -EPERM;
+		sockc->priority = *(u32 *)CMSG_DATA(cmsg);
+		break;
 	default:
 		return -EINVAL;
 	}
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 0065b1996c94..cd3e788600cc 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1328,7 +1328,7 @@ static int ip_setup_cork(struct sock *sk, struct inet_cork *cork,
 	cork->ttl = ipc->ttl;
 	cork->tos = ipc->tos;
 	cork->mark = ipc->sockc.mark;
-	cork->priority = ipc->priority;
+	cork->priority = ipc->sockc.priority;
 	cork->transmit_time = ipc->sockc.transmit_time;
 	cork->tx_flags = 0;
 	sock_tx_timestamp(sk, &ipc->sockc, &cork->tx_flags);
@@ -1465,7 +1465,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk,
 		ip_options_build(skb, opt, cork->addr, rt);
 	}
 
-	skb->priority = (cork->tos != -1) ? cork->priority: READ_ONCE(sk->sk_priority);
+	skb->priority = cork->priority;
 	skb->mark = cork->mark;
 	if (sk_is_tcp(sk))
 		skb_set_delivery_time(skb, cork->transmit_time, SKB_CLOCK_MONOTONIC);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index cf377377b52d..f6a03b418dde 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -315,7 +315,7 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
 			if (val < 0 || val > 255)
 				return -EINVAL;
 			ipc->tos = val;
-			ipc->priority = rt_tos2priority(ipc->tos);
+			ipc->sockc.priority = rt_tos2priority(ipc->tos);
 			break;
 		case IP_PROTOCOL:
 			if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 0e9e01967ec9..4304a68d1db0 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -358,7 +358,7 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4,
 	skb_reserve(skb, hlen);
 
 	skb->protocol = htons(ETH_P_IP);
-	skb->priority = READ_ONCE(sk->sk_priority);
+	skb->priority = sockc->priority;
 	skb->mark = sockc->mark;
 	skb_set_delivery_type_by_clockid(skb, sockc->transmit_time, sk->sk_clockid);
 	skb_dst_set(skb, &rt->dst);
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f7b4608bb316..ec9673b7ab16 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1401,6 +1401,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 	cork->base.gso_size = ipc6->gso_size;
 	cork->base.tx_flags = 0;
 	cork->base.mark = ipc6->sockc.mark;
+	cork->base.priority = ipc6->sockc.priority;
 	sock_tx_timestamp(sk, &ipc6->sockc, &cork->base.tx_flags);
 	if (ipc6->sockc.tsflags & SOCKCM_FLAG_TS_OPT_ID) {
 		cork->base.flags |= IPCORK_TS_OPT_ID;
@@ -1939,7 +1940,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 	hdr->saddr = fl6->saddr;
 	hdr->daddr = *final_dst;
 
-	skb->priority = READ_ONCE(sk->sk_priority);
+	skb->priority = cork->base.priority;
 	skb->mark = cork->base.mark;
 	if (sk_is_tcp(sk))
 		skb_set_delivery_time(skb, cork->base.transmit_time, SKB_CLOCK_MONOTONIC);
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index 88b3fcacd4f9..46b8adf6e7f8 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -119,6 +119,7 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		return -EINVAL;
 
 	ipcm6_init_sk(&ipc6, sk);
+	ipc6.sockc.priority = READ_ONCE(sk->sk_priority);
 	ipc6.sockc.tsflags = READ_ONCE(sk->sk_tsflags);
 	ipc6.sockc.mark = READ_ONCE(sk->sk_mark);
 
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8476a3944a88..a45aba090aa4 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -619,7 +619,7 @@ static int rawv6_send_hdrinc(struct sock *sk, struct msghdr *msg, int length,
 	skb_reserve(skb, hlen);
 
 	skb->protocol = htons(ETH_P_IPV6);
-	skb->priority = READ_ONCE(sk->sk_priority);
+	skb->priority = sockc->priority;
 	skb->mark = sockc->mark;
 	skb_set_delivery_type_by_clockid(skb, sockc->transmit_time, sk->sk_clockid);
 
@@ -780,6 +780,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	ipcm6_init(&ipc6);
 	ipc6.sockc.tsflags = READ_ONCE(sk->sk_tsflags);
 	ipc6.sockc.mark = fl6.flowi6_mark;
+	ipc6.sockc.priority = READ_ONCE(sk->sk_priority);
 
 	if (sin6) {
 		if (addr_len < SIN6_LEN_RFC2133)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 0cef8ae5d1ea..dcce9fd33e98 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1353,6 +1353,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	ipc6.gso_size = READ_ONCE(up->gso_size);
 	ipc6.sockc.tsflags = READ_ONCE(sk->sk_tsflags);
 	ipc6.sockc.mark = READ_ONCE(sk->sk_mark);
+	ipc6.sockc.priority = READ_ONCE(sk->sk_priority);
 
 	/* destination address check */
 	if (sin6) {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 886c0dd47b66..f8d87d622699 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3126,7 +3126,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 
 	skb->protocol = proto;
 	skb->dev = dev;
-	skb->priority = READ_ONCE(sk->sk_priority);
+	skb->priority = sockc.priority;
 	skb->mark = sockc.mark;
 	skb_set_delivery_type_by_clockid(skb, sockc.transmit_time, sk->sk_clockid);
 
-- 
2.43.0



* [PATCH net-next v4 3/3] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender
  2024-11-18 14:51 [PATCH net-next v4 0/3] Add support for SO_PRIORITY cmsg Anna Emese Nyiri
  2024-11-18 14:51 ` [PATCH net-next v4 1/3] sock: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri
  2024-11-18 14:51 ` [PATCH net-next v4 2/3] sock: support SO_PRIORITY cmsg Anna Emese Nyiri
@ 2024-11-18 14:51 ` Anna Emese Nyiri
  2024-11-18 20:35   ` Willem de Bruijn
  2024-11-20 13:10   ` Ido Schimmel
  2 siblings, 2 replies; 8+ messages in thread
From: Anna Emese Nyiri @ 2024-11-18 14:51 UTC (permalink / raw)
  To: netdev; +Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemb, idosch

Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY
ancillary data.

Add the cmsg_so_priority.sh script to validate SO_PRIORITY behavior.
It creates a VLAN device with an egress QoS mapping and checks packet
priorities with flower filters, verifying that packets with different
priorities are correctly matched and counted by the filters for
multiple protocols and IP versions.
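
To make the expected match concrete, the sketch below models how a packet
priority ends up in the VLAN tag under the identity egress-qos-map the script
configures (0:0 1:1 ... 7:7). The function and macro names are made up for
this illustration. It assumes the usual TCI layout, with the 3-bit priority
in the top bits and the 12-bit VLAN ID in the low bits, and that a priority
with no map entry is sent with VLAN priority 0.

```c
#include <stdint.h>

#define TOY_VLAN_PRIO_SHIFT	13	/* PCP occupies the top 3 TCI bits */
#define TOY_VLAN_VID_MASK	0x0fff

/*
 * Toy model of the identity egress-qos-map used by the test script.
 * Priorities 0..7 map to the same PCP; anything else falls back to 0.
 */
static uint16_t toy_vlan_tci(uint32_t skb_priority, uint16_t vid)
{
	uint16_t pcp = skb_priority <= 7 ? (uint16_t)skb_priority : 0;

	return (uint16_t)((pcp << TOY_VLAN_PRIO_SHIFT) |
			  (vid & TOY_VLAN_VID_MASK));
}
```

For VLAN ID 10 and priority 5 this gives a TCI of 0xa00a, which is the value a
flower filter with "vlan_prio 5" would be expected to match in the test.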

Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
---
 tools/testing/selftests/net/cmsg_sender.c     |  11 +-
 .../testing/selftests/net/cmsg_so_priority.sh | 147 ++++++++++++++++++
 2 files changed, 157 insertions(+), 1 deletion(-)
 create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh

diff --git a/tools/testing/selftests/net/cmsg_sender.c b/tools/testing/selftests/net/cmsg_sender.c
index 876c2db02a63..5267eacc35df 100644
--- a/tools/testing/selftests/net/cmsg_sender.c
+++ b/tools/testing/selftests/net/cmsg_sender.c
@@ -52,6 +52,7 @@ struct options {
 		unsigned int tclass;
 		unsigned int hlimit;
 		unsigned int priority;
+		unsigned int priority_cmsg;
 	} sockopt;
 	struct {
 		unsigned int family;
@@ -59,6 +60,7 @@ struct options {
 		unsigned int proto;
 	} sock;
 	struct option_cmsg_u32 mark;
+	struct option_cmsg_u32 priority_cmsg;
 	struct {
 		bool ena;
 		unsigned int delay;
@@ -97,6 +99,7 @@ static void __attribute__((noreturn)) cs_usage(const char *bin)
 	       "\n"
 	       "\t\t-m val  Set SO_MARK with given value\n"
 	       "\t\t-M val  Set SO_MARK via setsockopt\n"
+		   "\t\t-Q val  Set SO_PRIORITY via cmsg\n"
 	       "\t\t-d val  Set SO_TXTIME with given delay (usec)\n"
 	       "\t\t-t      Enable time stamp reporting\n"
 	       "\t\t-f val  Set don't fragment via cmsg\n"
@@ -115,7 +118,7 @@ static void cs_parse_args(int argc, char *argv[])
 {
 	int o;
 
-	while ((o = getopt(argc, argv, "46sS:p:P:m:M:n:d:tf:F:c:C:l:L:H:")) != -1) {
+	while ((o = getopt(argc, argv, "46sS:p:P:m:M:n:d:tf:F:c:C:l:L:H:Q:")) != -1) {
 		switch (o) {
 		case 's':
 			opt.silent_send = true;
@@ -148,6 +151,10 @@ static void cs_parse_args(int argc, char *argv[])
 			opt.mark.ena = true;
 			opt.mark.val = atoi(optarg);
 			break;
+		case 'Q':
+			opt.priority_cmsg.ena = true;
+			opt.priority_cmsg.val = atoi(optarg);
+			break;
 		case 'M':
 			opt.sockopt.mark = atoi(optarg);
 			break;
@@ -252,6 +259,8 @@ cs_write_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz)
 
 	ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
 			  SOL_SOCKET, SO_MARK, &opt.mark);
+	ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
+			SOL_SOCKET, SO_PRIORITY, &opt.priority_cmsg);
 	ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
 			  SOL_IPV6, IPV6_DONTFRAG, &opt.v6.dontfrag);
 	ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
diff --git a/tools/testing/selftests/net/cmsg_so_priority.sh b/tools/testing/selftests/net/cmsg_so_priority.sh
new file mode 100755
index 000000000000..e5919c5ed1a4
--- /dev/null
+++ b/tools/testing/selftests/net/cmsg_so_priority.sh
@@ -0,0 +1,147 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+IP4=192.0.2.1/24
+TGT4=192.0.2.2/24
+TGT4_NO_MASK=192.0.2.2
+TGT4_RAW=192.0.2.3/24
+TGT4_RAW_NO_MASK=192.0.2.3
+IP6=2001:db8::1/64
+TGT6=2001:db8::2/64
+TGT6_NO_MASK=2001:db8::2
+TGT6_RAW=2001:db8::3/64
+TGT6_RAW_NO_MASK=2001:db8::3
+PORT=1234
+DELAY=4000
+
+
+create_filter() {
+
+    local ns=$1
+    local dev=$2
+    local handle=$3
+    local vlan_prio=$4
+    local ip_type=$5
+    local proto=$6
+    local dst_ip=$7
+
+    local cmd="tc -n $ns filter add dev $dev egress pref 1 handle $handle \
+    proto 802.1q flower vlan_prio $vlan_prio vlan_ethtype $ip_type"
+
+    if [[ "$proto" == "u" ]]; then
+        ip_proto="udp"
+    elif [[ "$ip_type" == "ipv4" && "$proto" == "i" ]]; then
+        ip_proto="icmp"
+    elif [[ "$ip_type" == "ipv6" && "$proto" == "i" ]]; then
+        ip_proto="icmpv6"
+    fi
+
+    if [[ "$proto" != "r" ]]; then
+        cmd="$cmd ip_proto $ip_proto"
+    fi
+
+    cmd="$cmd dst_ip $dst_ip action pass"
+
+    eval $cmd
+}
+
+TOTAL_TESTS=0
+FAILED_TESTS=0
+
+check_result() {
+    ((TOTAL_TESTS++))
+    if [ "$1" -ne 0 ]; then
+        ((FAILED_TESTS++))
+    fi
+}
+
+cleanup() {
+    ip link del dummy1 2>/dev/null
+    ip -n ns1 link del dummy1.10 2>/dev/null
+    ip netns del ns1 2>/dev/null
+}
+
+trap cleanup EXIT
+
+
+
+ip netns add ns1
+
+ip -n ns1 link set dev lo up
+ip -n ns1 link add name dummy1 up type dummy
+
+ip -n ns1 link add link dummy1 name dummy1.10 up type vlan id 10 \
+        egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+ip -n ns1 address add $IP4 dev dummy1.10
+ip -n ns1 address add $IP6 dev dummy1.10
+
+ip netns exec ns1 bash -c "
+sysctl -w net.ipv4.ping_group_range='0 2147483647'
+exit"
+
+
+ip -n ns1 neigh add $TGT4_NO_MASK lladdr 00:11:22:33:44:55 nud permanent dev \
+        dummy1.10
+ip -n ns1 neigh add $TGT6_NO_MASK lladdr 00:11:22:33:44:55 nud permanent dev dummy1.10
+ip -n ns1 neigh add $TGT4_RAW_NO_MASK lladdr 00:11:22:33:44:66 nud permanent dev dummy1.10
+ip -n ns1 neigh add $TGT6_RAW_NO_MASK lladdr 00:11:22:33:44:66 nud permanent dev dummy1.10
+
+tc -n ns1 qdisc add dev dummy1 clsact
+
+FILTER_COUNTER=10
+
+for i in 4 6; do
+    for proto in u i r; do
+        echo "Test IPV$i, prot: $proto"
+        for priority in {0..7}; do
+            if [[ $i == 4 && $proto == "r" ]]; then
+                TGT=$TGT4_RAW_NO_MASK
+            elif [[ $i == 6 && $proto == "r" ]]; then
+                TGT=$TGT6_RAW_NO_MASK
+            elif [ $i == 4 ]; then
+                TGT=$TGT4_NO_MASK
+            else
+                TGT=$TGT6_NO_MASK
+            fi
+
+            handle="${FILTER_COUNTER}${priority}"
+
+            create_filter ns1 dummy1 $handle $priority ipv$i $proto $TGT
+
+            pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \
+                | jq ".[] | select(.options.handle == ${handle}) | \
+                .options.actions[0].stats.packets")
+
+            if [[ $pkts == 0 ]]; then
+                check_result 0
+            else
+                echo "prio $priority: expected 0, got $pkts"
+                check_result 1
+            fi
+
+            ip netns exec ns1 ./cmsg_sender -$i -Q $priority -d "${DELAY}" -p $proto $TGT $PORT
+            ip netns exec ns1 ./cmsg_sender -$i -P $priority -d "${DELAY}" -p $proto $TGT $PORT
+
+
+            pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \
+                | jq ".[] | select(.options.handle == ${handle}) | \
+                .options.actions[0].stats.packets")
+            if [[ $pkts == 2 ]]; then
+                check_result 0
+            else
+                echo "prio $priority: expected 2, got $pkts"
+                check_result 1
+            fi
+        done
+        FILTER_COUNTER=$((FILTER_COUNTER + 10))
+    done
+done
+
+if [ $FAILED_TESTS -ne 0 ]; then
+    echo "FAIL - $FAILED_TESTS/$TOTAL_TESTS tests failed"
+    exit 1
+else
+    echo "OK - All $TOTAL_TESTS tests passed"
+    exit 0
+fi
-- 
2.43.0



* Re: [PATCH net-next v4 2/3] sock: support SO_PRIORITY cmsg
  2024-11-18 14:51 ` [PATCH net-next v4 2/3] sock: support SO_PRIORITY cmsg Anna Emese Nyiri
@ 2024-11-18 20:20   ` Willem de Bruijn
  0 siblings, 0 replies; 8+ messages in thread
From: Willem de Bruijn @ 2024-11-18 20:20 UTC (permalink / raw)
  To: Anna Emese Nyiri, netdev
  Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemb, idosch

Anna Emese Nyiri wrote:
> The Linux socket API currently allows setting SO_PRIORITY at the
> socket level, applying a uniform priority to all packets sent through
> that socket. The exception to this is IP_TOS, when the priority value
> is calculated during the handling of
> ancillary data, as implemented in commit <f02db315b8d88>
> ("ipv4: IP_TOS and IP_TTL can be specified as ancillary data").
> However, this is a computed
> value, and there is currently no mechanism to set a custom priority
> via control messages prior to this patch.
> 
> According to this patch, if SO_PRIORITY is specified as ancillary data,
> the packet is sent with the priority value set through
> sockc->priority, overriding the socket-level values
> set via the traditional setsockopt() method. This is analogous to
> the existing support for SO_MARK, as implemented in commit
> <c6af0c227a22> ("ip: support SO_MARK cmsg").
> 
> If both cmsg SO_PRIORITY and IP_TOS are passed, then the one that
> takes precedence is the last one in the cmsg list.
> 
> This patch has the side effect that raw_send_hdrinc now interprets cmsg
> IP_TOS.
> 
> Suggested-by: Ferenc Fejes <fejes@inf.elte.hu>
> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

Good catch on ipv6 ping.


* Re: [PATCH net-next v4 3/3] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender
  2024-11-18 14:51 ` [PATCH net-next v4 3/3] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri
@ 2024-11-18 20:35   ` Willem de Bruijn
  2024-11-19 17:20     ` Anna Nyiri
  2024-11-20 13:10   ` Ido Schimmel
  1 sibling, 1 reply; 8+ messages in thread
From: Willem de Bruijn @ 2024-11-18 20:35 UTC (permalink / raw)
  To: Anna Emese Nyiri, netdev
  Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemb, idosch

Anna Emese Nyiri wrote:
> Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY
> ancillary data.
> 
> cmsg_so_priority.sh script added to validate SO_PRIORITY behavior 
> by creating VLAN device with egress QoS mapping and testing packet
> priorities using flower filters. Verify that packets with different
> priorities are correctly matched and counted by filters for multiple
> protocols and IP versions.
> 
> Suggested-by: Ido Schimmel <idosch@idosch.org>
> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
> ---
>  tools/testing/selftests/net/cmsg_sender.c     |  11 +-
>  .../testing/selftests/net/cmsg_so_priority.sh | 147 ++++++++++++++++++
>  2 files changed, 157 insertions(+), 1 deletion(-)
>  create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh
> 

> +++ b/tools/testing/selftests/net/cmsg_so_priority.sh
> @@ -0,0 +1,147 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +IP4=192.0.2.1/24
> +TGT4=192.0.2.2/24
> +TGT4_NO_MASK=192.0.2.2

nit, avoid duplication:

TGT4_NO_MASK=192.0.2.2
TGT4="${TGT4_NO_MASK}/24"

etc.

Or even drop the versions with the suffix and add that
explicitly where used.

> +TGT4_RAW=192.0.2.3/24
> +TGT4_RAW_NO_MASK=192.0.2.3
> +IP6=2001:db8::1/64
> +TGT6=2001:db8::2/64
> +TGT6_NO_MASK=2001:db8::2
> +TGT6_RAW=2001:db8::3/64
> +TGT6_RAW_NO_MASK=2001:db8::3
> +PORT=1234
> +DELAY=4000
> +
> +
> +create_filter() {
> +
> +    local ns=$1
> +    local dev=$2
> +    local handle=$3
> +    local vlan_prio=$4
> +    local ip_type=$5
> +    local proto=$6
> +    local dst_ip=$7
> +
> +    local cmd="tc -n $ns filter add dev $dev egress pref 1 handle $handle \
> +    proto 802.1q flower vlan_prio $vlan_prio vlan_ethtype $ip_type"

nit: indentation on line break. Break inside string is ideally avoided too.

perhaps just avoid the string and below call

    tc -n "$ns" filter add dev "$dev" \
            egress pref 1 handle "$handle" proto 802.1q \
            flower vlan_prio "$vlan_prio" vlan_ethtype "$ip_type" \
            dst_ip "$dst_ip" $ip_proto_opt \
            action pass
> +
> +    if [[ "$proto" == "u" ]]; then
> +        ip_proto="udp"
> +    elif [[ "$ip_type" == "ipv4" && "$proto" == "i" ]]; then
> +        ip_proto="icmp"
> +    elif [[ "$ip_type" == "ipv6" && "$proto" == "i" ]]; then
> +        ip_proto="icmpv6"
> +    fi
> +
> +    if [[ "$proto" != "r" ]]; then
> +        cmd="$cmd ip_proto $ip_proto"
> +    fi
> +
> +    cmd="$cmd dst_ip $dst_ip action pass"
> +
> +    eval $cmd
> +}
> +
> +TOTAL_TESTS=0
> +FAILED_TESTS=0
> +
> +check_result() {
> +    ((TOTAL_TESTS++))
> +    if [ "$1" -ne 0 ]; then
> +        ((FAILED_TESTS++))
> +    fi
> +}
> +
> +cleanup() {
> +    ip link del dummy1 2>/dev/null

Both devices are in ns1.

No need to clean up the devices explicitly. Just deleting the netns
will remove them.

> +    ip -n ns1 link del dummy1.10 2>/dev/null
> +    ip netns del ns1 2>/dev/null

> +}
> +
> +trap cleanup EXIT
> +
> +
> +
> +ip netns add ns1
> +
> +ip -n ns1 link set dev lo up
> +ip -n ns1 link add name dummy1 up type dummy
> +
> +ip -n ns1 link add link dummy1 name dummy1.10 up type vlan id 10 \
> +        egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
> +
> +ip -n ns1 address add $IP4 dev dummy1.10
> +ip -n ns1 address add $IP6 dev dummy1.10
> +
> +ip netns exec ns1 bash -c "
> +sysctl -w net.ipv4.ping_group_range='0 2147483647'
> +exit"

Same point on indentation on line continuation.

No need for explicit exit.

> +
> +
> +ip -n ns1 neigh add $TGT4_NO_MASK lladdr 00:11:22:33:44:55 nud permanent dev \
> +        dummy1.10
> +ip -n ns1 neigh add $TGT6_NO_MASK lladdr 00:11:22:33:44:55 nud permanent dev dummy1.10
> +ip -n ns1 neigh add $TGT4_RAW_NO_MASK lladdr 00:11:22:33:44:66 nud permanent dev dummy1.10
> +ip -n ns1 neigh add $TGT6_RAW_NO_MASK lladdr 00:11:22:33:44:66 nud permanent dev dummy1.10
> +
> +tc -n ns1 qdisc add dev dummy1 clsact
> +
> +FILTER_COUNTER=10
> +
> +for i in 4 6; do
> +    for proto in u i r; do
> +        echo "Test IPV$i, prot: $proto"
> +        for priority in {0..7}; do
> +            if [[ $i == 4 && $proto == "r" ]]; then
> +                TGT=$TGT4_RAW_NO_MASK
> +            elif [[ $i == 6 && $proto == "r" ]]; then
> +                TGT=$TGT6_RAW_NO_MASK
> +            elif [ $i == 4 ]; then
> +                TGT=$TGT4_NO_MASK
> +            else
> +                TGT=$TGT6_NO_MASK
> +            fi
> +
> +            handle="${FILTER_COUNTER}${priority}"
> +
> +            create_filter ns1 dummy1 $handle $priority ipv$i $proto $TGT
> +
> +            pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \
> +                | jq ".[] | select(.options.handle == ${handle}) | \

Can jq be assumed installed on all machines?

> +                .options.actions[0].stats.packets")
> +
> +            if [[ $pkts == 0 ]]; then
> +                check_result 0
> +            else
> +                echo "prio $priority: expected 0, got $pkts"
> +                check_result 1
> +            fi
> +
> +            ip netns exec ns1 ./cmsg_sender -$i -Q $priority -d "${DELAY}" -p $proto $TGT $PORT
> +            ip netns exec ns1 ./cmsg_sender -$i -P $priority -d "${DELAY}" -p $proto $TGT $PORT
> +
> +
> +            pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \
> +                | jq ".[] | select(.options.handle == ${handle}) | \
> +                .options.actions[0].stats.packets")
> +            if [[ $pkts == 2 ]]; then
> +                check_result 0
> +            else
> +                echo "prio $priority: expected 2, got $pkts"
> +                check_result 1

I'd test -Q and -P separately. It takes a bit of extra code to repeat
the pkts read, but it is worth it.

> +            fi
> +        done
> +        FILTER_COUNTER=$((FILTER_COUNTER + 10))
> +    done
> +done
> +
> +if [ $FAILED_TESTS -ne 0 ]; then
> +    echo "FAIL - $FAILED_TESTS/$TOTAL_TESTS tests failed"
> +    exit 1
> +else
> +    echo "OK - All $TOTAL_TESTS tests passed"
> +    exit 0
> +fi
> -- 
> 2.43.0
> 




* Re: [PATCH net-next v4 3/3] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender
  2024-11-18 20:35   ` Willem de Bruijn
@ 2024-11-19 17:20     ` Anna Nyiri
  0 siblings, 0 replies; 8+ messages in thread
From: Anna Nyiri @ 2024-11-19 17:20 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: netdev, fejes, edumazet, kuba, pabeni, willemb, idosch

Thank you for the feedback. I will update the test.
Since the feature works well and the issues in the test are
cosmetic, could the series be merged in its current state?
After the merge, I would submit the corrections as a follow-up fix.

Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote (on
Mon, Nov 18, 2024, 21:35):
>
> Anna Emese Nyiri wrote:
> > Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY
> > ancillary data.
> >
> > cmsg_so_priority.sh script added to validate SO_PRIORITY behavior
> > by creating VLAN device with egress QoS mapping and testing packet
> > priorities using flower filters. Verify that packets with different
> > priorities are correctly matched and counted by filters for multiple
> > protocols and IP versions.
> >
> > Suggested-by: Ido Schimmel <idosch@idosch.org>
> > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
> > ---
> >  tools/testing/selftests/net/cmsg_sender.c     |  11 +-
> >  .../testing/selftests/net/cmsg_so_priority.sh | 147 ++++++++++++++++++
> >  2 files changed, 157 insertions(+), 1 deletion(-)
> >  create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh
> >
>
> > +++ b/tools/testing/selftests/net/cmsg_so_priority.sh
> > @@ -0,0 +1,147 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +IP4=192.0.2.1/24
> > +TGT4=192.0.2.2/24
> > +TGT4_NO_MASK=192.0.2.2
>
> nit, avoid duplication:
>
> TGT4_NO_MASK=192.0.2.2
> TGT4="${TGT4_NO_MASK}/24"
>
> etc.
>
> Or even drop the versions with the suffix and add that
> explicitly where used.
>
> > +TGT4_RAW=192.0.2.3/24
> > +TGT4_RAW_NO_MASK=192.0.2.3
> > +IP6=2001:db8::1/64
> > +TGT6=2001:db8::2/64
> > +TGT6_NO_MASK=2001:db8::2
> > +TGT6_RAW=2001:db8::3/64
> > +TGT6_RAW_NO_MASK=2001:db8::3
> > +PORT=1234
> > +DELAY=4000
> > +
> > +
> > +create_filter() {
> > +
> > +    local ns=$1
> > +    local dev=$2
> > +    local handle=$3
> > +    local vlan_prio=$4
> > +    local ip_type=$5
> > +    local proto=$6
> > +    local dst_ip=$7
> > +
> > +    local cmd="tc -n $ns filter add dev $dev egress pref 1 handle $handle \
> > +    proto 802.1q flower vlan_prio $vlan_prio vlan_ethtype $ip_type"
>
> nit: indentation on line break. Break inside string is ideally avoided too.
>
> Perhaps just avoid the string and call tc directly below:
>
>     tc -n "$ns" filter add dev "$dev" \
>             egress pref 1 handle "$handle" proto 802.1q \
>             flower vlan_prio "$vlan_prio" vlan_ethtype "$ip_type" \
>             dst_ip "$dst_ip" $ip_proto_opt \
>             action pass
> > +
> > +    if [[ "$proto" == "u" ]]; then
> > +        ip_proto="udp"
> > +    elif [[ "$ip_type" == "ipv4" && "$proto" == "i" ]]; then
> > +        ip_proto="icmp"
> > +    elif [[ "$ip_type" == "ipv6" && "$proto" == "i" ]]; then
> > +        ip_proto="icmpv6"
> > +    fi
> > +
> > +    if [[ "$proto" != "r" ]]; then
> > +        cmd="$cmd ip_proto $ip_proto"
> > +    fi
> > +
> > +    cmd="$cmd dst_ip $dst_ip action pass"
> > +
> > +    eval $cmd
> > +}
> > +
> > +TOTAL_TESTS=0
> > +FAILED_TESTS=0
> > +
> > +check_result() {
> > +    ((TOTAL_TESTS++))
> > +    if [ "$1" -ne 0 ]; then
> > +        ((FAILED_TESTS++))
> > +    fi
> > +}
> > +
> > +cleanup() {
> > +    ip link del dummy1 2>/dev/null
>
> Both devices are in ns1.
>
> No need to clean up the devices explicitly. Just deleting the netns
> will remove them.
>
> > +    ip -n ns1 link del dummy1.10 2>/dev/null
> > +    ip netns del ns1 2>/dev/null
>
> > +}
> > +
> > +trap cleanup EXIT
> > +
> > +
> > +
> > +ip netns add ns1
> > +
> > +ip -n ns1 link set dev lo up
> > +ip -n ns1 link add name dummy1 up type dummy
> > +
> > +ip -n ns1 link add link dummy1 name dummy1.10 up type vlan id 10 \
> > +        egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
> > +
> > +ip -n ns1 address add $IP4 dev dummy1.10
> > +ip -n ns1 address add $IP6 dev dummy1.10
> > +
> > +ip netns exec ns1 bash -c "
> > +sysctl -w net.ipv4.ping_group_range='0 2147483647'
> > +exit"
>
> Same point on indentation on line continuation.
>
> No need for explicit exit.
>
> > +
> > +
> > +ip -n ns1 neigh add $TGT4_NO_MASK lladdr 00:11:22:33:44:55 nud permanent dev \
> > +        dummy1.10
> > +ip -n ns1 neigh add $TGT6_NO_MASK lladdr 00:11:22:33:44:55 nud permanent dev dummy1.10
> > +ip -n ns1 neigh add $TGT4_RAW_NO_MASK lladdr 00:11:22:33:44:66 nud permanent dev dummy1.10
> > +ip -n ns1 neigh add $TGT6_RAW_NO_MASK lladdr 00:11:22:33:44:66 nud permanent dev dummy1.10
> > +
> > +tc -n ns1 qdisc add dev dummy1 clsact
> > +
> > +FILTER_COUNTER=10
> > +
> > +for i in 4 6; do
> > +    for proto in u i r; do
> > +        echo "Test IPV$i, prot: $proto"
> > +        for priority in {0..7}; do
> > +            if [[ $i == 4 && $proto == "r" ]]; then
> > +                TGT=$TGT4_RAW_NO_MASK
> > +            elif [[ $i == 6 && $proto == "r" ]]; then
> > +                TGT=$TGT6_RAW_NO_MASK
> > +            elif [ $i == 4 ]; then
> > +                TGT=$TGT4_NO_MASK
> > +            else
> > +                TGT=$TGT6_NO_MASK
> > +            fi
> > +
> > +            handle="${FILTER_COUNTER}${priority}"
> > +
> > +            create_filter ns1 dummy1 $handle $priority ipv$i $proto $TGT
> > +
> > +            pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \
> > +                | jq ".[] | select(.options.handle == ${handle}) | \
>
> Can jq be assumed installed on all machines?
>
> > +                .options.actions[0].stats.packets")
> > +
> > +            if [[ $pkts == 0 ]]; then
> > +                check_result 0
> > +            else
> > +                echo "prio $priority: expected 0, got $pkts"
> > +                check_result 1
> > +            fi
> > +
> > +            ip netns exec ns1 ./cmsg_sender -$i -Q $priority -d "${DELAY}" -p $proto $TGT $PORT
> > +            ip netns exec ns1 ./cmsg_sender -$i -P $priority -d "${DELAY}" -p $proto $TGT $PORT
> > +
> > +
> > +            pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \
> > +                | jq ".[] | select(.options.handle == ${handle}) | \
> > +                .options.actions[0].stats.packets")
> > +            if [[ $pkts == 2 ]]; then
> > +                check_result 0
> > +            else
> > +                echo "prio $priority: expected 2, got $pkts"
> > +                check_result 1
>
> I'd test -Q and -P separately. It takes a bit of extra code to repeat
> the pkts read, but it is worth it.
>
> > +            fi
> > +        done
> > +        FILTER_COUNTER=$((FILTER_COUNTER + 10))
> > +    done
> > +done
> > +
> > +if [ $FAILED_TESTS -ne 0 ]; then
> > +    echo "FAIL - $FAILED_TESTS/$TOTAL_TESTS tests failed"
> > +    exit 1
> > +else
> > +    echo "OK - All $TOTAL_TESTS tests passed"
> > +    exit 0
> > +fi
> > --
> > 2.43.0
> >
>
>


* Re: [PATCH net-next v4 3/3] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender
  2024-11-18 14:51 ` [PATCH net-next v4 3/3] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri
  2024-11-18 20:35   ` Willem de Bruijn
@ 2024-11-20 13:10   ` Ido Schimmel
  1 sibling, 0 replies; 8+ messages in thread
From: Ido Schimmel @ 2024-11-20 13:10 UTC (permalink / raw)
  To: Anna Emese Nyiri; +Cc: netdev, fejes, edumazet, kuba, pabeni, willemb

On Mon, Nov 18, 2024 at 03:51:47PM +0100, Anna Emese Nyiri wrote:
> Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY
> ancillary data.
> 
> cmsg_so_priority.sh script added to validate SO_PRIORITY behavior 
> by creating VLAN device with egress QoS mapping and testing packet
> priorities using flower filters. Verify that packets with different
> priorities are correctly matched and counted by filters for multiple
> protocols and IP versions.
> 
> Suggested-by: Ido Schimmel <idosch@idosch.org>
> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
> ---
>  tools/testing/selftests/net/cmsg_sender.c     |  11 +-
>  .../testing/selftests/net/cmsg_so_priority.sh | 147 ++++++++++++++++++
>  2 files changed, 157 insertions(+), 1 deletion(-)
>  create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh

Please add cmsg_so_priority.sh to tools/testing/selftests/net/Makefile
so that the test will be exercised as part of the netdev CI.

> 
> diff --git a/tools/testing/selftests/net/cmsg_sender.c b/tools/testing/selftests/net/cmsg_sender.c
> index 876c2db02a63..5267eacc35df 100644
> --- a/tools/testing/selftests/net/cmsg_sender.c
> +++ b/tools/testing/selftests/net/cmsg_sender.c
> @@ -52,6 +52,7 @@ struct options {
>  		unsigned int tclass;
>  		unsigned int hlimit;
>  		unsigned int priority;
> +		unsigned int priority_cmsg;

Why do you need this? It looks unused.

>  	} sockopt;
>  	struct {
>  		unsigned int family;
> @@ -59,6 +60,7 @@ struct options {
>  		unsigned int proto;
>  	} sock;
>  	struct option_cmsg_u32 mark;
> +	struct option_cmsg_u32 priority_cmsg;

To be consistent with other cmsg variables I would just name it
'priority' instead of 'priority_cmsg'

>  	struct {
>  		bool ena;
>  		unsigned int delay;
> @@ -97,6 +99,7 @@ static void __attribute__((noreturn)) cs_usage(const char *bin)
>  	       "\n"
>  	       "\t\t-m val  Set SO_MARK with given value\n"
>  	       "\t\t-M val  Set SO_MARK via setsockopt\n"

While at it, please add documentation for "-P" (SO_PRIORITY via
setsockopt). I noticed it is missing.

> +		   "\t\t-Q val  Set SO_PRIORITY via cmsg\n"

The alignment here is off.

>  	       "\t\t-d val  Set SO_TXTIME with given delay (usec)\n"
>  	       "\t\t-t      Enable time stamp reporting\n"
>  	       "\t\t-f val  Set don't fragment via cmsg\n"
> @@ -115,7 +118,7 @@ static void cs_parse_args(int argc, char *argv[])
>  {
>  	int o;
>  
> -	while ((o = getopt(argc, argv, "46sS:p:P:m:M:n:d:tf:F:c:C:l:L:H:")) != -1) {
> +	while ((o = getopt(argc, argv, "46sS:p:P:m:M:n:d:tf:F:c:C:l:L:H:Q:")) != -1) {
>  		switch (o) {
>  		case 's':
>  			opt.silent_send = true;
> @@ -148,6 +151,10 @@ static void cs_parse_args(int argc, char *argv[])
>  			opt.mark.ena = true;
>  			opt.mark.val = atoi(optarg);
>  			break;
> +		case 'Q':
> +			opt.priority_cmsg.ena = true;
> +			opt.priority_cmsg.val = atoi(optarg);
> +			break;
>  		case 'M':
>  			opt.sockopt.mark = atoi(optarg);
>  			break;
> @@ -252,6 +259,8 @@ cs_write_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz)
>  
>  	ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
>  			  SOL_SOCKET, SO_MARK, &opt.mark);
> +	ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
> +			SOL_SOCKET, SO_PRIORITY, &opt.priority_cmsg);
>  	ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
>  			  SOL_IPV6, IPV6_DONTFRAG, &opt.v6.dontfrag);
>  	ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
> diff --git a/tools/testing/selftests/net/cmsg_so_priority.sh b/tools/testing/selftests/net/cmsg_so_priority.sh
> new file mode 100755
> index 000000000000..e5919c5ed1a4
> --- /dev/null
> +++ b/tools/testing/selftests/net/cmsg_so_priority.sh
> @@ -0,0 +1,147 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +
> +IP4=192.0.2.1/24
> +TGT4=192.0.2.2/24
> +TGT4_NO_MASK=192.0.2.2
> +TGT4_RAW=192.0.2.3/24
> +TGT4_RAW_NO_MASK=192.0.2.3
> +IP6=2001:db8::1/64
> +TGT6=2001:db8::2/64
> +TGT6_NO_MASK=2001:db8::2
> +TGT6_RAW=2001:db8::3/64
> +TGT6_RAW_NO_MASK=2001:db8::3
> +PORT=1234
> +DELAY=4000
> +
> +
> +create_filter() {
> +

Unnecessary blank line

> +    local ns=$1
> +    local dev=$2

These two are always the same, so there is no need to parameterize them.

> +    local handle=$3
> +    local vlan_prio=$4
> +    local ip_type=$5
> +    local proto=$6
> +    local dst_ip=$7
> +
> +    local cmd="tc -n $ns filter add dev $dev egress pref 1 handle $handle \
> +    proto 802.1q flower vlan_prio $vlan_prio vlan_ethtype $ip_type"
> +
> +    if [[ "$proto" == "u" ]]; then
> +        ip_proto="udp"
> +    elif [[ "$ip_type" == "ipv4" && "$proto" == "i" ]]; then
> +        ip_proto="icmp"
> +    elif [[ "$ip_type" == "ipv6" && "$proto" == "i" ]]; then
> +        ip_proto="icmpv6"
> +    fi
> +
> +    if [[ "$proto" != "r" ]]; then
> +        cmd="$cmd ip_proto $ip_proto"
> +    fi
> +
> +    cmd="$cmd dst_ip $dst_ip action pass"
> +
> +    eval $cmd
> +}
> +
> +TOTAL_TESTS=0
> +FAILED_TESTS=0
> +
> +check_result() {
> +    ((TOTAL_TESTS++))
> +    if [ "$1" -ne 0 ]; then
> +        ((FAILED_TESTS++))
> +    fi
> +}
> +
> +cleanup() {
> +    ip link del dummy1 2>/dev/null
> +    ip -n ns1 link del dummy1.10 2>/dev/null
> +    ip netns del ns1 2>/dev/null
> +}
> +
> +trap cleanup EXIT
> +
> +
> +
> +ip netns add ns1

Please use cmsg_so_mark.sh as a reference and check how it's creating
the namespace. It's done via the setup_ns() helper in lib.sh:

setup_ns NS

It will generate a random name for the namespace, allowing multiple
tests to be run in parallel without conflicts.
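
To illustrate why a random name avoids the conflict, here is a simplified stand-in. The function name make_ns_name is hypothetical and this is not the lib.sh implementation; in the actual test, sourcing lib.sh and calling setup_ns is the better choice because it also handles cleanup.

```shell
# Simplified stand-in for what a helper like setup_ns provides; the function
# name make_ns_name is hypothetical and this is not the lib.sh implementation.
make_ns_name() {
    local prefix=$1

    # mktemp -u only prints a unique-looking name; it creates nothing.
    echo "${prefix}-$(mktemp -u XXXXXX)"
}

# Intended use in place of the fixed name "ns1":
#   NS=$(make_ns_name ns)
#   ip netns add "$NS"
#   ip -n "$NS" link set dev lo up
```

Two runs of the script then operate on differently named namespaces, so a fixed "ns1" left over from, or in use by, another test no longer makes "ip netns add" fail.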

> +
> +ip -n ns1 link set dev lo up
> +ip -n ns1 link add name dummy1 up type dummy
> +
> +ip -n ns1 link add link dummy1 name dummy1.10 up type vlan id 10 \
> +        egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
> +
> +ip -n ns1 address add $IP4 dev dummy1.10
> +ip -n ns1 address add $IP6 dev dummy1.10
> +
> +ip netns exec ns1 bash -c "
> +sysctl -w net.ipv4.ping_group_range='0 2147483647'
> +exit"

Can be:

ip netns exec ns1 sysctl -wq net.ipv4.ping_group_range='0 2147483647'

Note the '-q' option.

> +
> +
> +ip -n ns1 neigh add $TGT4_NO_MASK lladdr 00:11:22:33:44:55 nud permanent dev \
> +        dummy1.10
> +ip -n ns1 neigh add $TGT6_NO_MASK lladdr 00:11:22:33:44:55 nud permanent dev dummy1.10
> +ip -n ns1 neigh add $TGT4_RAW_NO_MASK lladdr 00:11:22:33:44:66 nud permanent dev dummy1.10
> +ip -n ns1 neigh add $TGT6_RAW_NO_MASK lladdr 00:11:22:33:44:66 nud permanent dev dummy1.10
> +
> +tc -n ns1 qdisc add dev dummy1 clsact
> +
> +FILTER_COUNTER=10
> +
> +for i in 4 6; do
> +    for proto in u i r; do
> +        echo "Test IPV$i, prot: $proto"
> +        for priority in {0..7}; do
> +            if [[ $i == 4 && $proto == "r" ]]; then
> +                TGT=$TGT4_RAW_NO_MASK
> +            elif [[ $i == 6 && $proto == "r" ]]; then
> +                TGT=$TGT6_RAW_NO_MASK
> +            elif [ $i == 4 ]; then
> +                TGT=$TGT4_NO_MASK
> +            else
> +                TGT=$TGT6_NO_MASK
> +            fi
> +
> +            handle="${FILTER_COUNTER}${priority}"
> +
> +            create_filter ns1 dummy1 $handle $priority ipv$i $proto $TGT
> +
> +            pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \
> +                | jq ".[] | select(.options.handle == ${handle}) | \
> +                .options.actions[0].stats.packets")
> +
> +            if [[ $pkts == 0 ]]; then
> +                check_result 0
> +            else
> +                echo "prio $priority: expected 0, got $pkts"
> +                check_result 1
> +            fi
> +
> +            ip netns exec ns1 ./cmsg_sender -$i -Q $priority -d "${DELAY}" -p $proto $TGT $PORT
> +            ip netns exec ns1 ./cmsg_sender -$i -P $priority -d "${DELAY}" -p $proto $TGT $PORT
> +
> +
> +            pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \
> +                | jq ".[] | select(.options.handle == ${handle}) | \
> +                .options.actions[0].stats.packets")
> +            if [[ $pkts == 2 ]]; then
> +                check_result 0
> +            else
> +                echo "prio $priority: expected 2, got $pkts"
> +                check_result 1
> +            fi
> +        done
> +        FILTER_COUNTER=$((FILTER_COUNTER + 10))
> +    done
> +done
> +
> +if [ $FAILED_TESTS -ne 0 ]; then
> +    echo "FAIL - $FAILED_TESTS/$TOTAL_TESTS tests failed"
> +    exit 1
> +else
> +    echo "OK - All $TOTAL_TESTS tests passed"
> +    exit 0
> +fi
> -- 
> 2.43.0
> 


Thread overview: 8+ messages
2024-11-18 14:51 [PATCH net-next v4 0/3] Add support for SO_PRIORITY cmsg Anna Emese Nyiri
2024-11-18 14:51 ` [PATCH net-next v4 1/3] sock: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri
2024-11-18 14:51 ` [PATCH net-next v4 2/3] sock: support SO_PRIORITY cmsg Anna Emese Nyiri
2024-11-18 20:20   ` Willem de Bruijn
2024-11-18 14:51 ` [PATCH net-next v4 3/3] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri
2024-11-18 20:35   ` Willem de Bruijn
2024-11-19 17:20     ` Anna Nyiri
2024-11-20 13:10   ` Ido Schimmel
