* [PATCH net-next v7 1/4] sock: Introduce sk_set_prio_allowed helper function
2024-12-13 8:44 [PATCH net-next v7 0/4] Add support for SO_PRIORITY cmsg Anna Emese Nyiri
@ 2024-12-13 8:44 ` Anna Emese Nyiri
2024-12-13 8:44 ` [PATCH net-next v7 2/4] sock: support SO_PRIORITY cmsg Anna Emese Nyiri
` (3 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Anna Emese Nyiri @ 2024-12-13 8:44 UTC (permalink / raw)
To: netdev
Cc: fejes, edumazet, kuba, pabeni, willemb, idosch, horms, dsahern,
linux-can, socketcan, mkl, linux-kselftest, shuah, tsbogend,
kaiyuanz, James.Bottomley, richard.henderson, arnd, almasrymina,
asml.silence, linux-mips, andreas, mattst88, kerneljasonxing,
sparclinux, linux-alpha, linux-arch, deller, vadim.fedorenko,
linux-parisc, Anna Emese Nyiri
Simplify priority setting permissions with the 'sk_set_prio_allowed'
function, centralizing the validation logic. This change is made in
anticipation of a second caller in a following patch.
No functional changes.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
---
net/core/sock.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/net/core/sock.c b/net/core/sock.c
index 74729d20cd00..9016f984d44e 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -454,6 +454,13 @@ static int sock_set_timeout(long *timeo_p, sockptr_t optval, int optlen,
return 0;
}
+static bool sk_set_prio_allowed(const struct sock *sk, int val)
+{
+ return ((val >= TC_PRIO_BESTEFFORT && val <= TC_PRIO_INTERACTIVE) ||
+ sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
+ sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN));
+}
+
static bool sock_needs_netstamp(const struct sock *sk)
{
switch (sk->sk_family) {
@@ -1193,9 +1200,7 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
/* handle options which do not require locking the socket. */
switch (optname) {
case SO_PRIORITY:
- if ((val >= 0 && val <= 6) ||
- sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) ||
- sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) {
+ if (sk_set_prio_allowed(sk, val)) {
sock_set_priority(sk, val);
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net-next v7 2/4] sock: support SO_PRIORITY cmsg
2024-12-13 8:44 [PATCH net-next v7 0/4] Add support for SO_PRIORITY cmsg Anna Emese Nyiri
2024-12-13 8:44 ` [PATCH net-next v7 1/4] sock: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri
@ 2024-12-13 8:44 ` Anna Emese Nyiri
2024-12-13 8:44 ` [PATCH net-next v7 3/4] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri
` (2 subsequent siblings)
4 siblings, 0 replies; 8+ messages in thread
From: Anna Emese Nyiri @ 2024-12-13 8:44 UTC (permalink / raw)
To: netdev
Cc: fejes, edumazet, kuba, pabeni, willemb, idosch, horms, dsahern,
linux-can, socketcan, mkl, linux-kselftest, shuah, tsbogend,
kaiyuanz, James.Bottomley, richard.henderson, arnd, almasrymina,
asml.silence, linux-mips, andreas, mattst88, kerneljasonxing,
sparclinux, linux-alpha, linux-arch, deller, vadim.fedorenko,
linux-parisc, Anna Emese Nyiri
The Linux socket API currently allows setting SO_PRIORITY at the
socket level, applying a uniform priority to all packets sent through
that socket. The exception to this is IP_TOS, when the priority value
is calculated during the handling of
ancillary data, as implemented in commit <f02db315b8d88>
("ipv4: IP_TOS and IP_TTL can be specified as ancillary data").
However, this is a computed
value, and there is currently no mechanism to set a custom priority
via control messages prior to this patch.
According to this patch, if SO_PRIORITY is specified as ancillary data,
the packet is sent with the priority value set through
sockc->priority, overriding the socket-level values
set via the traditional setsockopt() method. This is analogous to
the existing support for SO_MARK, as implemented in commit
<c6af0c227a22> ("ip: support SO_MARK cmsg").
If both cmsg SO_PRIORITY and IP_TOS are passed, then the one that
takes precedence is the last one in the cmsg list.
This patch has the side effect that raw_send_hdrinc now interprets cmsg
IP_TOS.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Suggested-by: Ferenc Fejes <fejes@inf.elte.hu>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
---
include/net/inet_sock.h | 2 +-
include/net/ip.h | 2 +-
include/net/sock.h | 4 +++-
net/can/raw.c | 2 +-
net/core/sock.c | 7 +++++++
net/ipv4/ip_output.c | 4 ++--
net/ipv4/ip_sockglue.c | 2 +-
net/ipv4/raw.c | 2 +-
net/ipv6/ip6_output.c | 3 ++-
net/ipv6/ping.c | 1 +
net/ipv6/raw.c | 3 ++-
net/ipv6/udp.c | 1 +
net/packet/af_packet.c | 2 +-
13 files changed, 24 insertions(+), 11 deletions(-)
diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h
index 56d8bc5593d3..3ccbad881d74 100644
--- a/include/net/inet_sock.h
+++ b/include/net/inet_sock.h
@@ -172,7 +172,7 @@ struct inet_cork {
u8 tx_flags;
__u8 ttl;
__s16 tos;
- char priority;
+ u32 priority;
__u16 gso_size;
u32 ts_opt_id;
u64 transmit_time;
diff --git a/include/net/ip.h b/include/net/ip.h
index 0e548c1f2a0e..9f5e33e371fc 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -81,7 +81,6 @@ struct ipcm_cookie {
__u8 protocol;
__u8 ttl;
__s16 tos;
- char priority;
__u16 gso_size;
};
@@ -96,6 +95,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm,
ipcm_init(ipcm);
ipcm->sockc.mark = READ_ONCE(inet->sk.sk_mark);
+ ipcm->sockc.priority = READ_ONCE(inet->sk.sk_priority);
ipcm->sockc.tsflags = READ_ONCE(inet->sk.sk_tsflags);
ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if);
ipcm->addr = inet->inet_saddr;
diff --git a/include/net/sock.h b/include/net/sock.h
index 7464e9f9f47c..316a34d6c48b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1814,13 +1814,15 @@ struct sockcm_cookie {
u32 mark;
u32 tsflags;
u32 ts_opt_id;
+ u32 priority;
};
static inline void sockcm_init(struct sockcm_cookie *sockc,
const struct sock *sk)
{
*sockc = (struct sockcm_cookie) {
- .tsflags = READ_ONCE(sk->sk_tsflags)
+ .tsflags = READ_ONCE(sk->sk_tsflags),
+ .priority = READ_ONCE(sk->sk_priority),
};
}
diff --git a/net/can/raw.c b/net/can/raw.c
index 255c0a8f39d6..46e8ed9d64da 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -962,7 +962,7 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
}
skb->dev = dev;
- skb->priority = READ_ONCE(sk->sk_priority);
+ skb->priority = sockc.priority;
skb->mark = READ_ONCE(sk->sk_mark);
skb->tstamp = sockc.transmit_time;
diff --git a/net/core/sock.c b/net/core/sock.c
index 9016f984d44e..a3d9941c1d32 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2947,6 +2947,13 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg,
case SCM_RIGHTS:
case SCM_CREDENTIALS:
break;
+ case SO_PRIORITY:
+ if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32)))
+ return -EINVAL;
+ if (!sk_set_prio_allowed(sk, *(u32 *)CMSG_DATA(cmsg)))
+ return -EPERM;
+ sockc->priority = *(u32 *)CMSG_DATA(cmsg);
+ break;
default:
return -EINVAL;
}
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index a59204a8d850..f45a083f2c13 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1333,7 +1333,7 @@ static int ip_setup_cork(struct sock *sk, struct inet_cork *cork,
cork->ttl = ipc->ttl;
cork->tos = ipc->tos;
cork->mark = ipc->sockc.mark;
- cork->priority = ipc->priority;
+ cork->priority = ipc->sockc.priority;
cork->transmit_time = ipc->sockc.transmit_time;
cork->tx_flags = 0;
sock_tx_timestamp(sk, &ipc->sockc, &cork->tx_flags);
@@ -1470,7 +1470,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk,
ip_options_build(skb, opt, cork->addr, rt);
}
- skb->priority = (cork->tos != -1) ? cork->priority: READ_ONCE(sk->sk_priority);
+ skb->priority = cork->priority;
skb->mark = cork->mark;
if (sk_is_tcp(sk))
skb_set_delivery_time(skb, cork->transmit_time, SKB_CLOCK_MONOTONIC);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index cf377377b52d..f6a03b418dde 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -315,7 +315,7 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc,
if (val < 0 || val > 255)
return -EINVAL;
ipc->tos = val;
- ipc->priority = rt_tos2priority(ipc->tos);
+ ipc->sockc.priority = rt_tos2priority(ipc->tos);
break;
case IP_PROTOCOL:
if (cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 0e9e01967ec9..4304a68d1db0 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -358,7 +358,7 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4,
skb_reserve(skb, hlen);
skb->protocol = htons(ETH_P_IP);
- skb->priority = READ_ONCE(sk->sk_priority);
+ skb->priority = sockc->priority;
skb->mark = sockc->mark;
skb_set_delivery_type_by_clockid(skb, sockc->transmit_time, sk->sk_clockid);
skb_dst_set(skb, &rt->dst);
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3d672dea9f56..993106876604 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1401,6 +1401,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
cork->base.gso_size = ipc6->gso_size;
cork->base.tx_flags = 0;
cork->base.mark = ipc6->sockc.mark;
+ cork->base.priority = ipc6->sockc.priority;
sock_tx_timestamp(sk, &ipc6->sockc, &cork->base.tx_flags);
if (ipc6->sockc.tsflags & SOCKCM_FLAG_TS_OPT_ID) {
cork->base.flags |= IPCORK_TS_OPT_ID;
@@ -1942,7 +1943,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
hdr->saddr = fl6->saddr;
hdr->daddr = *final_dst;
- skb->priority = READ_ONCE(sk->sk_priority);
+ skb->priority = cork->base.priority;
skb->mark = cork->base.mark;
if (sk_is_tcp(sk))
skb_set_delivery_time(skb, cork->base.transmit_time, SKB_CLOCK_MONOTONIC);
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index 88b3fcacd4f9..46b8adf6e7f8 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -119,6 +119,7 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
return -EINVAL;
ipcm6_init_sk(&ipc6, sk);
+ ipc6.sockc.priority = READ_ONCE(sk->sk_priority);
ipc6.sockc.tsflags = READ_ONCE(sk->sk_tsflags);
ipc6.sockc.mark = READ_ONCE(sk->sk_mark);
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8476a3944a88..a45aba090aa4 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -619,7 +619,7 @@ static int rawv6_send_hdrinc(struct sock *sk, struct msghdr *msg, int length,
skb_reserve(skb, hlen);
skb->protocol = htons(ETH_P_IPV6);
- skb->priority = READ_ONCE(sk->sk_priority);
+ skb->priority = sockc->priority;
skb->mark = sockc->mark;
skb_set_delivery_type_by_clockid(skb, sockc->transmit_time, sk->sk_clockid);
@@ -780,6 +780,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
ipcm6_init(&ipc6);
ipc6.sockc.tsflags = READ_ONCE(sk->sk_tsflags);
ipc6.sockc.mark = fl6.flowi6_mark;
+ ipc6.sockc.priority = READ_ONCE(sk->sk_priority);
if (sin6) {
if (addr_len < SIN6_LEN_RFC2133)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d766fd798ecf..7c14c449804c 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1448,6 +1448,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
ipc6.gso_size = READ_ONCE(up->gso_size);
ipc6.sockc.tsflags = READ_ONCE(sk->sk_tsflags);
ipc6.sockc.mark = READ_ONCE(sk->sk_mark);
+ ipc6.sockc.priority = READ_ONCE(sk->sk_priority);
/* destination address check */
if (sin6) {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 886c0dd47b66..f8d87d622699 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3126,7 +3126,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
skb->protocol = proto;
skb->dev = dev;
- skb->priority = READ_ONCE(sk->sk_priority);
+ skb->priority = sockc.priority;
skb->mark = sockc.mark;
skb_set_delivery_type_by_clockid(skb, sockc.transmit_time, sk->sk_clockid);
--
2.43.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net-next v7 3/4] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender
2024-12-13 8:44 [PATCH net-next v7 0/4] Add support for SO_PRIORITY cmsg Anna Emese Nyiri
2024-12-13 8:44 ` [PATCH net-next v7 1/4] sock: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri
2024-12-13 8:44 ` [PATCH net-next v7 2/4] sock: support SO_PRIORITY cmsg Anna Emese Nyiri
@ 2024-12-13 8:44 ` Anna Emese Nyiri
2024-12-13 8:44 ` [PATCH net-next v7 4/4] sock: Introduce SO_RCVPRIORITY socket option Anna Emese Nyiri
2024-12-17 2:30 ` [PATCH net-next v7 0/4] Add support for SO_PRIORITY cmsg patchwork-bot+netdevbpf
4 siblings, 0 replies; 8+ messages in thread
From: Anna Emese Nyiri @ 2024-12-13 8:44 UTC (permalink / raw)
To: netdev
Cc: fejes, edumazet, kuba, pabeni, willemb, idosch, horms, dsahern,
linux-can, socketcan, mkl, linux-kselftest, shuah, tsbogend,
kaiyuanz, James.Bottomley, richard.henderson, arnd, almasrymina,
asml.silence, linux-mips, andreas, mattst88, kerneljasonxing,
sparclinux, linux-alpha, linux-arch, deller, vadim.fedorenko,
linux-parisc, Anna Emese Nyiri, Ido Schimmel
Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY
ancillary data.
cmsg_so_priority.sh script added to validate SO_PRIORITY behavior
by creating VLAN device with egress QoS mapping and testing packet
priorities using flower filters. Verify that packets with different
priorities are correctly matched and counted by filters for multiple
protocols and IP versions.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
---
tools/testing/selftests/net/Makefile | 1 +
tools/testing/selftests/net/cmsg_sender.c | 11 +-
.../testing/selftests/net/cmsg_so_priority.sh | 151 ++++++++++++++++++
3 files changed, 162 insertions(+), 1 deletion(-)
create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index cb2fc601de66..f09bd96cc978 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -32,6 +32,7 @@ TEST_PROGS += ioam6.sh
TEST_PROGS += gro.sh
TEST_PROGS += gre_gso.sh
TEST_PROGS += cmsg_so_mark.sh
+TEST_PROGS += cmsg_so_priority.sh
TEST_PROGS += cmsg_time.sh cmsg_ipv6.sh
TEST_PROGS += netns-name.sh
TEST_PROGS += nl_netdev.py
diff --git a/tools/testing/selftests/net/cmsg_sender.c b/tools/testing/selftests/net/cmsg_sender.c
index 876c2db02a63..bc314382e4e1 100644
--- a/tools/testing/selftests/net/cmsg_sender.c
+++ b/tools/testing/selftests/net/cmsg_sender.c
@@ -59,6 +59,7 @@ struct options {
unsigned int proto;
} sock;
struct option_cmsg_u32 mark;
+ struct option_cmsg_u32 priority;
struct {
bool ena;
unsigned int delay;
@@ -97,6 +98,8 @@ static void __attribute__((noreturn)) cs_usage(const char *bin)
"\n"
"\t\t-m val Set SO_MARK with given value\n"
"\t\t-M val Set SO_MARK via setsockopt\n"
+ "\t\t-P val Set SO_PRIORITY via setsockopt\n"
+ "\t\t-Q val Set SO_PRIORITY via cmsg\n"
"\t\t-d val Set SO_TXTIME with given delay (usec)\n"
"\t\t-t Enable time stamp reporting\n"
"\t\t-f val Set don't fragment via cmsg\n"
@@ -115,7 +118,7 @@ static void cs_parse_args(int argc, char *argv[])
{
int o;
- while ((o = getopt(argc, argv, "46sS:p:P:m:M:n:d:tf:F:c:C:l:L:H:")) != -1) {
+ while ((o = getopt(argc, argv, "46sS:p:P:m:M:n:d:tf:F:c:C:l:L:H:Q:")) != -1) {
switch (o) {
case 's':
opt.silent_send = true;
@@ -148,6 +151,10 @@ static void cs_parse_args(int argc, char *argv[])
opt.mark.ena = true;
opt.mark.val = atoi(optarg);
break;
+ case 'Q':
+ opt.priority.ena = true;
+ opt.priority.val = atoi(optarg);
+ break;
case 'M':
opt.sockopt.mark = atoi(optarg);
break;
@@ -252,6 +259,8 @@ cs_write_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz)
ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
SOL_SOCKET, SO_MARK, &opt.mark);
+ ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
+ SOL_SOCKET, SO_PRIORITY, &opt.priority);
ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
SOL_IPV6, IPV6_DONTFRAG, &opt.v6.dontfrag);
ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len,
diff --git a/tools/testing/selftests/net/cmsg_so_priority.sh b/tools/testing/selftests/net/cmsg_so_priority.sh
new file mode 100755
index 000000000000..35dde2c5b67f
--- /dev/null
+++ b/tools/testing/selftests/net/cmsg_so_priority.sh
@@ -0,0 +1,151 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+source lib.sh
+
+readonly KSFT_SKIP=4
+
+IP4=192.0.2.1/24
+TGT4=192.0.2.2
+TGT4_RAW=192.0.2.3
+IP6=2001:db8::1/64
+TGT6=2001:db8::2
+TGT6_RAW=2001:db8::3
+PORT=1234
+TOTAL_TESTS=0
+FAILED_TESTS=0
+
+if ! command -v jq &> /dev/null; then
+ echo "SKIP cmsg_so_priroity.sh test: jq is not installed." >&2
+ exit "$KSFT_SKIP"
+fi
+
+check_result() {
+ ((TOTAL_TESTS++))
+ if [ "$1" -ne 0 ]; then
+ ((FAILED_TESTS++))
+ fi
+}
+
+cleanup()
+{
+ cleanup_ns $NS
+}
+
+trap cleanup EXIT
+
+setup_ns NS
+
+create_filter() {
+ local handle=$1
+ local vlan_prio=$2
+ local ip_type=$3
+ local proto=$4
+ local dst_ip=$5
+ local ip_proto
+
+ if [[ "$proto" == "u" ]]; then
+ ip_proto="udp"
+ elif [[ "$ip_type" == "ipv4" && "$proto" == "i" ]]; then
+ ip_proto="icmp"
+ elif [[ "$ip_type" == "ipv6" && "$proto" == "i" ]]; then
+ ip_proto="icmpv6"
+ fi
+
+ tc -n $NS filter add dev dummy1 \
+ egress pref 1 handle "$handle" proto 802.1q \
+ flower vlan_prio "$vlan_prio" vlan_ethtype "$ip_type" \
+ dst_ip "$dst_ip" ${ip_proto:+ip_proto $ip_proto} \
+ action pass
+}
+
+ip -n $NS link set dev lo up
+ip -n $NS link add name dummy1 up type dummy
+
+ip -n $NS link add link dummy1 name dummy1.10 up type vlan id 10 \
+ egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7
+
+ip -n $NS address add $IP4 dev dummy1.10
+ip -n $NS address add $IP6 dev dummy1.10 nodad
+
+ip netns exec $NS sysctl -wq net.ipv4.ping_group_range='0 2147483647'
+
+ip -n $NS neigh add $TGT4 lladdr 00:11:22:33:44:55 nud permanent \
+ dev dummy1.10
+ip -n $NS neigh add $TGT6 lladdr 00:11:22:33:44:55 nud permanent \
+ dev dummy1.10
+ip -n $NS neigh add $TGT4_RAW lladdr 00:11:22:33:44:66 nud permanent \
+ dev dummy1.10
+ip -n $NS neigh add $TGT6_RAW lladdr 00:11:22:33:44:66 nud permanent \
+ dev dummy1.10
+
+tc -n $NS qdisc add dev dummy1 clsact
+
+FILTER_COUNTER=10
+
+for i in 4 6; do
+ for proto in u i r; do
+ echo "Test IPV$i, prot: $proto"
+ for priority in {0..7}; do
+ if [[ $i == 4 && $proto == "r" ]]; then
+ TGT=$TGT4_RAW
+ elif [[ $i == 6 && $proto == "r" ]]; then
+ TGT=$TGT6_RAW
+ elif [ $i == 4 ]; then
+ TGT=$TGT4
+ else
+ TGT=$TGT6
+ fi
+
+ handle="${FILTER_COUNTER}${priority}"
+
+ create_filter $handle $priority ipv$i $proto $TGT
+
+ pkts=$(tc -n $NS -j -s filter show dev dummy1 egress \
+ | jq ".[] | select(.options.handle == ${handle}) | \
+ .options.actions[0].stats.packets")
+
+ if [[ $pkts == 0 ]]; then
+ check_result 0
+ else
+ echo "prio $priority: expected 0, got $pkts"
+ check_result 1
+ fi
+
+ ip netns exec $NS ./cmsg_sender -$i -Q $priority \
+ -p $proto $TGT $PORT
+
+ pkts=$(tc -n $NS -j -s filter show dev dummy1 egress \
+ | jq ".[] | select(.options.handle == ${handle}) | \
+ .options.actions[0].stats.packets")
+ if [[ $pkts == 1 ]]; then
+ check_result 0
+ else
+ echo "prio $priority -Q: expected 1, got $pkts"
+ check_result 1
+ fi
+
+ ip netns exec $NS ./cmsg_sender -$i -P $priority \
+ -p $proto $TGT $PORT
+
+ pkts=$(tc -n $NS -j -s filter show dev dummy1 egress \
+ | jq ".[] | select(.options.handle == ${handle}) | \
+ .options.actions[0].stats.packets")
+ if [[ $pkts == 2 ]]; then
+ check_result 0
+ else
+ echo "prio $priority -P: expected 2, got $pkts"
+ check_result 1
+ fi
+ done
+ FILTER_COUNTER=$((FILTER_COUNTER + 10))
+ done
+done
+
+if [ $FAILED_TESTS -ne 0 ]; then
+ echo "FAIL - $FAILED_TESTS/$TOTAL_TESTS tests failed"
+ exit 1
+else
+ echo "OK - All $TOTAL_TESTS tests passed"
+ exit 0
+fi
--
2.43.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH net-next v7 4/4] sock: Introduce SO_RCVPRIORITY socket option
2024-12-13 8:44 [PATCH net-next v7 0/4] Add support for SO_PRIORITY cmsg Anna Emese Nyiri
` (2 preceding siblings ...)
2024-12-13 8:44 ` [PATCH net-next v7 3/4] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri
@ 2024-12-13 8:44 ` Anna Emese Nyiri
2024-12-17 2:20 ` Jakub Kicinski
2024-12-17 2:30 ` [PATCH net-next v7 0/4] Add support for SO_PRIORITY cmsg patchwork-bot+netdevbpf
4 siblings, 1 reply; 8+ messages in thread
From: Anna Emese Nyiri @ 2024-12-13 8:44 UTC (permalink / raw)
To: netdev
Cc: fejes, edumazet, kuba, pabeni, willemb, idosch, horms, dsahern,
linux-can, socketcan, mkl, linux-kselftest, shuah, tsbogend,
kaiyuanz, James.Bottomley, richard.henderson, arnd, almasrymina,
asml.silence, linux-mips, andreas, mattst88, kerneljasonxing,
sparclinux, linux-alpha, linux-arch, deller, vadim.fedorenko,
linux-parisc, Anna Emese Nyiri
Add new socket option, SO_RCVPRIORITY, to include SO_PRIORITY in the
ancillary data returned by recvmsg().
This is analogous to the existing support for SO_RCVMARK,
as implemented in commit <6fd1d51cfa253>
("net: SO_RCVMARK socket option for SO_MARK with recvmsg()").
Reviewed-by: Willem de Bruijn <willemb@google.com>
Suggested-by: Ferenc Fejes <fejes@inf.elte.hu>
Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com>
---
arch/alpha/include/uapi/asm/socket.h | 2 ++
arch/mips/include/uapi/asm/socket.h | 2 ++
arch/parisc/include/uapi/asm/socket.h | 2 ++
arch/sparc/include/uapi/asm/socket.h | 2 ++
include/net/sock.h | 4 +++-
include/uapi/asm-generic/socket.h | 2 ++
net/core/sock.c | 8 ++++++++
net/socket.c | 11 +++++++++++
tools/include/uapi/asm-generic/socket.h | 2 ++
9 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h
index 302507bf9b5d..3df5f2dd4c0f 100644
--- a/arch/alpha/include/uapi/asm/socket.h
+++ b/arch/alpha/include/uapi/asm/socket.h
@@ -148,6 +148,8 @@
#define SCM_TS_OPT_ID 81
+#define SO_RCVPRIORITY 82
+
#if !defined(__KERNEL__)
#if __BITS_PER_LONG == 64
diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h
index d118d4731580..22fa8f19924a 100644
--- a/arch/mips/include/uapi/asm/socket.h
+++ b/arch/mips/include/uapi/asm/socket.h
@@ -159,6 +159,8 @@
#define SCM_TS_OPT_ID 81
+#define SO_RCVPRIORITY 82
+
#if !defined(__KERNEL__)
#if __BITS_PER_LONG == 64
diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h
index d268d69bfcd2..aa9cd4b951fe 100644
--- a/arch/parisc/include/uapi/asm/socket.h
+++ b/arch/parisc/include/uapi/asm/socket.h
@@ -140,6 +140,8 @@
#define SCM_TS_OPT_ID 0x404C
+#define SO_RCVPRIORITY 0x404D
+
#if !defined(__KERNEL__)
#if __BITS_PER_LONG == 64
diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h
index 113cd9f353e3..5b464a568664 100644
--- a/arch/sparc/include/uapi/asm/socket.h
+++ b/arch/sparc/include/uapi/asm/socket.h
@@ -141,6 +141,8 @@
#define SCM_TS_OPT_ID 0x005a
+#define SO_RCVPRIORITY 0x005b
+
#if !defined(__KERNEL__)
diff --git a/include/net/sock.h b/include/net/sock.h
index 316a34d6c48b..d4bdd3286e03 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -953,6 +953,7 @@ enum sock_flags {
SOCK_XDP, /* XDP is attached */
SOCK_TSTAMP_NEW, /* Indicates 64 bit timestamps always */
SOCK_RCVMARK, /* Receive SO_MARK ancillary data with packet */
+ SOCK_RCVPRIORITY, /* Receive SO_PRIORITY ancillary data with packet */
};
#define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE))
@@ -2660,7 +2661,8 @@ static inline void sock_recv_cmsgs(struct msghdr *msg, struct sock *sk,
{
#define FLAGS_RECV_CMSGS ((1UL << SOCK_RXQ_OVFL) | \
(1UL << SOCK_RCVTSTAMP) | \
- (1UL << SOCK_RCVMARK))
+ (1UL << SOCK_RCVMARK) |\
+ (1UL << SOCK_RCVPRIORITY))
#define TSFLAGS_ANY (SOF_TIMESTAMPING_SOFTWARE | \
SOF_TIMESTAMPING_RAW_HARDWARE)
diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h
index deacfd6dd197..aa5016ff3d91 100644
--- a/include/uapi/asm-generic/socket.h
+++ b/include/uapi/asm-generic/socket.h
@@ -143,6 +143,8 @@
#define SCM_TS_OPT_ID 81
+#define SO_RCVPRIORITY 82
+
#if !defined(__KERNEL__)
#if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
diff --git a/net/core/sock.c b/net/core/sock.c
index a3d9941c1d32..f9f4d976141e 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1518,6 +1518,10 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
case SO_RCVMARK:
sock_valbool_flag(sk, SOCK_RCVMARK, valbool);
break;
+
+ case SO_RCVPRIORITY:
+ sock_valbool_flag(sk, SOCK_RCVPRIORITY, valbool);
+ break;
case SO_RXQ_OVFL:
sock_valbool_flag(sk, SOCK_RXQ_OVFL, valbool);
@@ -1947,6 +1951,10 @@ int sk_getsockopt(struct sock *sk, int level, int optname,
v.val = sock_flag(sk, SOCK_RCVMARK);
break;
+ case SO_RCVPRIORITY:
+ v.val = sock_flag(sk, SOCK_RCVPRIORITY);
+ break;
+
case SO_RXQ_OVFL:
v.val = sock_flag(sk, SOCK_RXQ_OVFL);
break;
diff --git a/net/socket.c b/net/socket.c
index 9a117248f18f..79d08b734f7c 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1008,12 +1008,23 @@ static void sock_recv_mark(struct msghdr *msg, struct sock *sk,
}
}
+static void sock_recv_priority(struct msghdr *msg, struct sock *sk,
+ struct sk_buff *skb)
+{
+ if (sock_flag(sk, SOCK_RCVPRIORITY) && skb) {
+ __u32 priority = skb->priority;
+
+ put_cmsg(msg, SOL_SOCKET, SO_PRIORITY, sizeof(__u32), &priority);
+ }
+}
+
void __sock_recv_cmsgs(struct msghdr *msg, struct sock *sk,
struct sk_buff *skb)
{
sock_recv_timestamp(msg, sk, skb);
sock_recv_drops(msg, sk, skb);
sock_recv_mark(msg, sk, skb);
+ sock_recv_priority(msg, sk, skb);
}
EXPORT_SYMBOL_GPL(__sock_recv_cmsgs);
diff --git a/tools/include/uapi/asm-generic/socket.h b/tools/include/uapi/asm-generic/socket.h
index 281df9139d2b..ffff554a5230 100644
--- a/tools/include/uapi/asm-generic/socket.h
+++ b/tools/include/uapi/asm-generic/socket.h
@@ -126,6 +126,8 @@
#define SCM_TS_OPT_ID 78
+#define SO_RCVPRIORITY 79
+
#if !defined(__KERNEL__)
#if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__))
--
2.43.0
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH net-next v7 4/4] sock: Introduce SO_RCVPRIORITY socket option
2024-12-13 8:44 ` [PATCH net-next v7 4/4] sock: Introduce SO_RCVPRIORITY socket option Anna Emese Nyiri
@ 2024-12-17 2:20 ` Jakub Kicinski
2024-12-18 11:57 ` Anna Nyiri
0 siblings, 1 reply; 8+ messages in thread
From: Jakub Kicinski @ 2024-12-17 2:20 UTC (permalink / raw)
To: Anna Emese Nyiri
Cc: netdev, fejes, edumazet, pabeni, willemb, idosch, horms, dsahern,
linux-can, socketcan, mkl, linux-kselftest, shuah, tsbogend,
kaiyuanz, James.Bottomley, richard.henderson, arnd, almasrymina,
asml.silence, linux-mips, andreas, mattst88, kerneljasonxing,
sparclinux, linux-alpha, linux-arch, deller, vadim.fedorenko,
linux-parisc
On Fri, 13 Dec 2024 09:44:57 +0100 Anna Emese Nyiri wrote:
> Add new socket option, SO_RCVPRIORITY, to include SO_PRIORITY in the
> ancillary data returned by recvmsg().
> This is analogous to the existing support for SO_RCVMARK,
> as implemented in commit <6fd1d51cfa253>
> ("net: SO_RCVMARK socket option for SO_MARK with recvmsg()").
Could you follow up with a test? The functionality is pretty
straightforward but it'd nonetheless be good to exercise it,
even if it's a trivial C program which sends a UDP packet to
itself over loopback?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net-next v7 4/4] sock: Introduce SO_RCVPRIORITY socket option
2024-12-17 2:20 ` Jakub Kicinski
@ 2024-12-18 11:57 ` Anna Nyiri
0 siblings, 0 replies; 8+ messages in thread
From: Anna Nyiri @ 2024-12-18 11:57 UTC (permalink / raw)
To: Jakub Kicinski
Cc: netdev, fejes, edumazet, pabeni, willemb, idosch, horms, dsahern,
linux-can, socketcan, mkl, linux-kselftest, shuah, tsbogend,
kaiyuanz, James.Bottomley, richard.henderson, arnd, almasrymina,
asml.silence, linux-mips, andreas, mattst88, kerneljasonxing,
sparclinux, linux-alpha, linux-arch, deller, vadim.fedorenko,
linux-parisc
Jakub Kicinski <kuba@kernel.org> ezt írta (időpont: 2024. dec. 17., K, 3:20):
>
> On Fri, 13 Dec 2024 09:44:57 +0100 Anna Emese Nyiri wrote:
> > Add new socket option, SO_RCVPRIORITY, to include SO_PRIORITY in the
> > ancillary data returned by recvmsg().
> > This is analogous to the existing support for SO_RCVMARK,
> > as implemented in commit <6fd1d51cfa253>
> > ("net: SO_RCVMARK socket option for SO_MARK with recvmsg()").
>
> Could you follow up with a test? The functionality is pretty
> straightforward but it'd nonetheless be good to exercise it,
> even if it's a trivial C program which sends a UDP packet to
> itself over loopback?
Sure, I will send the test after the Christmas holidays.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH net-next v7 0/4] Add support for SO_PRIORITY cmsg
2024-12-13 8:44 [PATCH net-next v7 0/4] Add support for SO_PRIORITY cmsg Anna Emese Nyiri
` (3 preceding siblings ...)
2024-12-13 8:44 ` [PATCH net-next v7 4/4] sock: Introduce SO_RCVPRIORITY socket option Anna Emese Nyiri
@ 2024-12-17 2:30 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-12-17 2:30 UTC (permalink / raw)
To: Anna Emese Nyiri
Cc: netdev, fejes, edumazet, kuba, pabeni, willemb, idosch, horms,
dsahern, linux-can, socketcan, mkl, linux-kselftest, shuah,
tsbogend, kaiyuanz, James.Bottomley, richard.henderson, arnd,
almasrymina, asml.silence, linux-mips, andreas, mattst88,
kerneljasonxing, sparclinux, linux-alpha, linux-arch, deller,
vadim.fedorenko, linux-parisc
Hello:
This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 13 Dec 2024 09:44:53 +0100 you wrote:
> Introduce a new helper function, `sk_set_prio_allowed`,
> to centralize the logic for validating priority settings.
> Add support for the `SO_PRIORITY` control message,
> enabling user-space applications to set socket priority
> via control messages (cmsg).
>
> Patch Overview:
>
> [...]
Here is the summary with links:
- [net-next,v7,1/4] sock: Introduce sk_set_prio_allowed helper function
https://git.kernel.org/netdev/net-next/c/77ec16be758e
- [net-next,v7,2/4] sock: support SO_PRIORITY cmsg
https://git.kernel.org/netdev/net-next/c/a32f3e9d1ed1
- [net-next,v7,3/4] selftests: net: test SO_PRIORITY ancillary data with cmsg_sender
https://git.kernel.org/netdev/net-next/c/cda7d5abe089
- [net-next,v7,4/4] sock: Introduce SO_RCVPRIORITY socket option
(no matching commit)
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 8+ messages in thread