* [PATCH net-next v3 0/3] SO_PRIORITY cmsg patch summary
@ 2024-11-07 13:22 Anna Emese Nyiri
2024-11-07 13:22 ` [PATCH net-next v3 1/3] net: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Anna Emese Nyiri @ 2024-11-07 13:22 UTC (permalink / raw)
To: netdev; +Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni,
willemdebruijn.kernel
Introduce a new helper function, `sk_set_prio_allowed`,
to centralize the logic for validating priority settings.
Add support for the `SO_PRIORITY` control message,
enabling user-space applications to set socket priority
via control messages (cmsg).
Patch Overview:
Patch 1/3: Introduce `sk_set_prio_allowed` helper function.
Patch 2/3: Add support for setting `SO_PRIORITY` via control messages
Patch 3/3: Add test for SO_PRIORITY setting via control messages
v3:
- Updated cover letter text.
- Removed priority field from ipcm_cookie.
- Removed cork->tos value check from ip_setup_cork, so
cork->priority will now take its value from ipc->sockc.priority.
- Replaced ipc->priority with ipc->sockc.priority
in ip_cmsg_send().
- Modified the error handling for the SO_PRIORITY
case in __sock_cmsg_send().
- Added missing initialization for ipc6.sockc.priority.
- Introduced cmsg_so_priority.sh test script.
- Modified cmsg_sender.c to set priority via control message (cmsg).
- Rebased on net-next
v2:
https://lore.kernel.org/netdev/20241102125136.5030-1-annaemesenyiri@gmail.com/
- Introduced sk_set_prio_allowed helper to check capability
for setting priority.
- Removed new fields and changed sockcm_cookie::priority
from char to u32 to align with sk_buff::priority.
- Moved the cork->tos value check for priority setting
from __ip_make_skb() to ip_setup_cork().
- Rebased on net-next.
v1:
https://lore.kernel.org/all/20241029144142.31382-1-annaemesenyiri@gmail.com/
Anna Emese Nyiri (3):
Introduce sk_set_prio_allowed helper function
support SO_PRIORITY cmsg
test SO_PRIORITY ancillary data with cmsg_sender
include/net/inet_sock.h | 2 +-
include/net/ip.h | 2 +-
include/net/sock.h | 4 +-
net/can/raw.c | 2 +-
net/core/sock.c | 18 ++-
net/ipv4/ip_output.c | 4 +-
net/ipv4/ip_sockglue.c | 2 +-
net/ipv4/raw.c | 2 +-
net/ipv6/ip6_output.c | 3 +-
net/ipv6/raw.c | 3 +-
net/ipv6/udp.c | 1 +
net/packet/af_packet.c | 2 +-
tools/testing/selftests/net/cmsg_sender.c | 11 +-
.../testing/selftests/net/cmsg_so_priority.sh | 115 ++++++++++++++++++
14 files changed, 156 insertions(+), 15 deletions(-)
create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh
--
2.43.0
^ permalink raw reply [flat|nested] 13+ messages in thread* [PATCH net-next v3 1/3] net: Introduce sk_set_prio_allowed helper function 2024-11-07 13:22 [PATCH net-next v3 0/3] SO_PRIORITY cmsg patch summary Anna Emese Nyiri @ 2024-11-07 13:22 ` Anna Emese Nyiri 2024-11-07 13:48 ` Eric Dumazet 2024-11-07 15:55 ` Willem de Bruijn 2024-11-07 13:22 ` [PATCH net-next v3 2/3] net: support SO_PRIORITY cmsg Anna Emese Nyiri 2024-11-07 13:22 ` [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri 2 siblings, 2 replies; 13+ messages in thread From: Anna Emese Nyiri @ 2024-11-07 13:22 UTC (permalink / raw) To: netdev; +Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemdebruijn.kernel Simplify priority setting permissions with the `sk_set_prio_allowed` function, centralizing the validation logic. This change is made in anticipation of a second caller in a following patch. No functional changes. Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> --- net/core/sock.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/net/core/sock.c b/net/core/sock.c index 7f398bd07fb7..5ecf6f1a470c 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -454,6 +454,13 @@ static int sock_set_timeout(long *timeo_p, sockptr_t optval, int optlen, return 0; } +static bool sk_set_prio_allowed(const struct sock *sk, int val) +{ + return ((val >= TC_PRIO_BESTEFFORT && val <= TC_PRIO_INTERACTIVE) || + sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) || + sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)); +} + static bool sock_needs_netstamp(const struct sock *sk) { switch (sk->sk_family) { @@ -1187,9 +1194,7 @@ int sk_setsockopt(struct sock *sk, int level, int optname, /* handle options which do not require locking the socket. */ switch (optname) { case SO_PRIORITY: - if ((val >= 0 && val <= 6) || - sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_RAW) || - sockopt_ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) { + if (sk_set_prio_allowed(sk, val)) { sock_set_priority(sk, val); return 0; } -- 2.43.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH net-next v3 1/3] net: Introduce sk_set_prio_allowed helper function 2024-11-07 13:22 ` [PATCH net-next v3 1/3] net: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri @ 2024-11-07 13:48 ` Eric Dumazet 2024-11-07 15:55 ` Willem de Bruijn 1 sibling, 0 replies; 13+ messages in thread From: Eric Dumazet @ 2024-11-07 13:48 UTC (permalink / raw) To: Anna Emese Nyiri; +Cc: netdev, fejes, kuba, pabeni, willemdebruijn.kernel On Thu, Nov 7, 2024 at 2:23 PM Anna Emese Nyiri <annaemesenyiri@gmail.com> wrote: > > Simplify priority setting permissions with the `sk_set_prio_allowed` > function, centralizing the validation logic. This change is made in > anticipation of a second caller in a following patch. > No functional changes. > > Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> > --- Reviewed-by: Eric Dumazet <edumazet@google.com> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next v3 1/3] net: Introduce sk_set_prio_allowed helper function 2024-11-07 13:22 ` [PATCH net-next v3 1/3] net: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri 2024-11-07 13:48 ` Eric Dumazet @ 2024-11-07 15:55 ` Willem de Bruijn 1 sibling, 0 replies; 13+ messages in thread From: Willem de Bruijn @ 2024-11-07 15:55 UTC (permalink / raw) To: Anna Emese Nyiri, netdev Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemdebruijn.kernel Anna Emese Nyiri wrote: > Simplify priority setting permissions with the `sk_set_prio_allowed` > function, centralizing the validation logic. This change is made in > anticipation of a second caller in a following patch. > No functional changes. > > Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com> > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH net-next v3 2/3] net: support SO_PRIORITY cmsg 2024-11-07 13:22 [PATCH net-next v3 0/3] SO_PRIORITY cmsg patch summary Anna Emese Nyiri 2024-11-07 13:22 ` [PATCH net-next v3 1/3] net: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri @ 2024-11-07 13:22 ` Anna Emese Nyiri 2024-11-07 13:50 ` Eric Dumazet 2024-11-07 16:17 ` Willem de Bruijn 2024-11-07 13:22 ` [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri 2 siblings, 2 replies; 13+ messages in thread From: Anna Emese Nyiri @ 2024-11-07 13:22 UTC (permalink / raw) To: netdev; +Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemdebruijn.kernel The Linux socket API currently allows setting SO_PRIORITY at the socket level, applying a uniform priority to all packets sent through that socket. The exception to this is IP_TOS, when the priority value is calculated during the handling of ancillary data, as implemented in commit <f02db315b8d88> ("ipv4: IP_TOS and IP_TTL can be specified as ancillary data"). However, this is a computed value, and there is currently no mechanism to set a custom priority via control messages prior to this patch. According to this pacth, if SO_PRIORITY is specified as ancillary data, the packet is sent with the priority value set through sockc->priority, overriding the socket-level values set via the traditional setsockopt() method. This is analogous to the existing support for SO_MARK, as implemented in commit <c6af0c227a22> ("ip: support SO_MARK cmsg"). Suggested-by: Ferenc Fejes <fejes@inf.elte.hu> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> --- include/net/inet_sock.h | 2 +- include/net/ip.h | 2 +- include/net/sock.h | 4 +++- net/can/raw.c | 2 +- net/core/sock.c | 7 +++++++ net/ipv4/ip_output.c | 4 ++-- net/ipv4/ip_sockglue.c | 2 +- net/ipv4/raw.c | 2 +- net/ipv6/ip6_output.c | 3 ++- net/ipv6/raw.c | 3 ++- net/ipv6/udp.c | 1 + net/packet/af_packet.c | 2 +- 12 files changed, 23 insertions(+), 11 deletions(-) diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index 56d8bc5593d3..3ccbad881d74 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -172,7 +172,7 @@ struct inet_cork { u8 tx_flags; __u8 ttl; __s16 tos; - char priority; + u32 priority; __u16 gso_size; u32 ts_opt_id; u64 transmit_time; diff --git a/include/net/ip.h b/include/net/ip.h index 0e548c1f2a0e..9f5e33e371fc 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -81,7 +81,6 @@ struct ipcm_cookie { __u8 protocol; __u8 ttl; __s16 tos; - char priority; __u16 gso_size; }; @@ -96,6 +95,7 @@ static inline void ipcm_init_sk(struct ipcm_cookie *ipcm, ipcm_init(ipcm); ipcm->sockc.mark = READ_ONCE(inet->sk.sk_mark); + ipcm->sockc.priority = READ_ONCE(inet->sk.sk_priority); ipcm->sockc.tsflags = READ_ONCE(inet->sk.sk_tsflags); ipcm->oif = READ_ONCE(inet->sk.sk_bound_dev_if); ipcm->addr = inet->inet_saddr; diff --git a/include/net/sock.h b/include/net/sock.h index 7464e9f9f47c..316a34d6c48b 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1814,13 +1814,15 @@ struct sockcm_cookie { u32 mark; u32 tsflags; u32 ts_opt_id; + u32 priority; }; static inline void sockcm_init(struct sockcm_cookie *sockc, const struct sock *sk) { *sockc = (struct sockcm_cookie) { - .tsflags = READ_ONCE(sk->sk_tsflags) + .tsflags = READ_ONCE(sk->sk_tsflags), + .priority = READ_ONCE(sk->sk_priority), }; } diff --git a/net/can/raw.c b/net/can/raw.c index 255c0a8f39d6..46e8ed9d64da 100644 --- a/net/can/raw.c +++ b/net/can/raw.c @@ -962,7 +962,7 @@ static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size) } skb->dev = dev; - skb->priority = READ_ONCE(sk->sk_priority); + skb->priority = sockc.priority; skb->mark = READ_ONCE(sk->sk_mark); skb->tstamp = sockc.transmit_time; diff --git a/net/core/sock.c b/net/core/sock.c index 5ecf6f1a470c..68e2af168da7 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2941,6 +2941,13 @@ int __sock_cmsg_send(struct sock *sk, struct cmsghdr *cmsg, case SCM_RIGHTS: case SCM_CREDENTIALS: break; + case SO_PRIORITY: + if (cmsg->cmsg_len != CMSG_LEN(sizeof(u32))) + return -EINVAL; + if (!sk_set_prio_allowed(sk, *(u32 *)CMSG_DATA(cmsg))) + return -EPERM; + sockc->priority = *(u32 *)CMSG_DATA(cmsg); + break; default: return -EINVAL; } diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 0065b1996c94..cd3e788600cc 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1328,7 +1328,7 @@ static int ip_setup_cork(struct sock *sk, struct inet_cork *cork, cork->ttl = ipc->ttl; cork->tos = ipc->tos; cork->mark = ipc->sockc.mark; - cork->priority = ipc->priority; + cork->priority = ipc->sockc.priority; cork->transmit_time = ipc->sockc.transmit_time; cork->tx_flags = 0; sock_tx_timestamp(sk, &ipc->sockc, &cork->tx_flags); @@ -1465,7 +1465,7 @@ struct sk_buff *__ip_make_skb(struct sock *sk, ip_options_build(skb, opt, cork->addr, rt); } - skb->priority = (cork->tos != -1) ? cork->priority: READ_ONCE(sk->sk_priority); + skb->priority = cork->priority; skb->mark = cork->mark; if (sk_is_tcp(sk)) skb_set_delivery_time(skb, cork->transmit_time, SKB_CLOCK_MONOTONIC); diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c index cf377377b52d..f6a03b418dde 100644 --- a/net/ipv4/ip_sockglue.c +++ b/net/ipv4/ip_sockglue.c @@ -315,7 +315,7 @@ int ip_cmsg_send(struct sock *sk, struct msghdr *msg, struct ipcm_cookie *ipc, if (val < 0 || val > 255) return -EINVAL; ipc->tos = val; - ipc->priority = rt_tos2priority(ipc->tos); + ipc->sockc.priority = rt_tos2priority(ipc->tos); break; case IP_PROTOCOL: if (cmsg->cmsg_len != CMSG_LEN(sizeof(int))) diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index 0e9e01967ec9..4304a68d1db0 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -358,7 +358,7 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4, skb_reserve(skb, hlen); skb->protocol = htons(ETH_P_IP); - skb->priority = READ_ONCE(sk->sk_priority); + skb->priority = sockc->priority; skb->mark = sockc->mark; skb_set_delivery_type_by_clockid(skb, sockc->transmit_time, sk->sk_clockid); skb_dst_set(skb, &rt->dst); diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index f7b4608bb316..ec9673b7ab16 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -1401,6 +1401,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork, cork->base.gso_size = ipc6->gso_size; cork->base.tx_flags = 0; cork->base.mark = ipc6->sockc.mark; + cork->base.priority = ipc6->sockc.priority; sock_tx_timestamp(sk, &ipc6->sockc, &cork->base.tx_flags); if (ipc6->sockc.tsflags & SOCKCM_FLAG_TS_OPT_ID) { cork->base.flags |= IPCORK_TS_OPT_ID; @@ -1939,7 +1940,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk, hdr->saddr = fl6->saddr; hdr->daddr = *final_dst; - skb->priority = READ_ONCE(sk->sk_priority); + skb->priority = cork->base.priority; skb->mark = cork->base.mark; if (sk_is_tcp(sk)) skb_set_delivery_time(skb, cork->base.transmit_time, SKB_CLOCK_MONOTONIC); diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index 8476a3944a88..a45aba090aa4 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -619,7 +619,7 @@ static int rawv6_send_hdrinc(struct sock *sk, struct msghdr *msg, int length, skb_reserve(skb, hlen); skb->protocol = htons(ETH_P_IPV6); - skb->priority = READ_ONCE(sk->sk_priority); + skb->priority = sockc->priority; skb->mark = sockc->mark; skb_set_delivery_type_by_clockid(skb, sockc->transmit_time, sk->sk_clockid); @@ -780,6 +780,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) ipcm6_init(&ipc6); ipc6.sockc.tsflags = READ_ONCE(sk->sk_tsflags); ipc6.sockc.mark = fl6.flowi6_mark; + ipc6.sockc.priority = READ_ONCE(sk->sk_priority); if (sin6) { if (addr_len < SIN6_LEN_RFC2133) diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 0cef8ae5d1ea..dcce9fd33e98 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -1353,6 +1353,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len) ipc6.gso_size = READ_ONCE(up->gso_size); ipc6.sockc.tsflags = READ_ONCE(sk->sk_tsflags); ipc6.sockc.mark = READ_ONCE(sk->sk_mark); + ipc6.sockc.priority = READ_ONCE(sk->sk_priority); /* destination address check */ if (sin6) { diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 886c0dd47b66..f8d87d622699 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -3126,7 +3126,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len) skb->protocol = proto; skb->dev = dev; - skb->priority = READ_ONCE(sk->sk_priority); + skb->priority = sockc.priority; skb->mark = sockc.mark; skb_set_delivery_type_by_clockid(skb, sockc.transmit_time, sk->sk_clockid); -- 2.43.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH net-next v3 2/3] net: support SO_PRIORITY cmsg 2024-11-07 13:22 ` [PATCH net-next v3 2/3] net: support SO_PRIORITY cmsg Anna Emese Nyiri @ 2024-11-07 13:50 ` Eric Dumazet 2024-11-07 16:17 ` Willem de Bruijn 1 sibling, 0 replies; 13+ messages in thread From: Eric Dumazet @ 2024-11-07 13:50 UTC (permalink / raw) To: Anna Emese Nyiri; +Cc: netdev, fejes, kuba, pabeni, willemdebruijn.kernel On Thu, Nov 7, 2024 at 2:23 PM Anna Emese Nyiri <annaemesenyiri@gmail.com> wrote: > > The Linux socket API currently allows setting SO_PRIORITY at the > socket level, applying a uniform priority to all packets sent through > that socket. The exception to this is IP_TOS, when the priority value > is calculated during the handling of > ancillary data, as implemented in commit <f02db315b8d88> > ("ipv4: IP_TOS and IP_TTL can be specified as ancillary data"). > However, this is a computed > value, and there is currently no mechanism to set a custom priority > via control messages prior to this patch. > > According to this pacth, if SO_PRIORITY is specified as ancillary data, > the packet is sent with the priority value set through > sockc->priority, overriding the socket-level values > set via the traditional setsockopt() method. This is analogous to > the existing support for SO_MARK, as implemented in commit > <c6af0c227a22> ("ip: support SO_MARK cmsg"). > > Suggested-by: Ferenc Fejes <fejes@inf.elte.hu> > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next v3 2/3] net: support SO_PRIORITY cmsg 2024-11-07 13:22 ` [PATCH net-next v3 2/3] net: support SO_PRIORITY cmsg Anna Emese Nyiri 2024-11-07 13:50 ` Eric Dumazet @ 2024-11-07 16:17 ` Willem de Bruijn 1 sibling, 0 replies; 13+ messages in thread From: Willem de Bruijn @ 2024-11-07 16:17 UTC (permalink / raw) To: Anna Emese Nyiri, netdev Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemdebruijn.kernel Anna Emese Nyiri wrote: > The Linux socket API currently allows setting SO_PRIORITY at the > socket level, applying a uniform priority to all packets sent through > that socket. The exception to this is IP_TOS, when the priority value > is calculated during the handling of > ancillary data, as implemented in commit <f02db315b8d88> > ("ipv4: IP_TOS and IP_TTL can be specified as ancillary data"). > However, this is a computed > value, and there is currently no mechanism to set a custom priority > via control messages prior to this patch. > > According to this pacth, if SO_PRIORITY is specified as ancillary data, typo: patch > the packet is sent with the priority value set through > sockc->priority, overriding the socket-level values > set via the traditional setsockopt() method. This is analogous to > the existing support for SO_MARK, as implemented in commit > <c6af0c227a22> ("ip: support SO_MARK cmsg"). If both cmsg SO_PRIORITY and IP_TOS are passed, then the one that takes precedence is the last one in the cmsg list. > Suggested-by: Ferenc Fejes <fejes@inf.elte.hu> > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> > diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c > index cf377377b52d..f6a03b418dde 100644 > diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c > index 0e9e01967ec9..4304a68d1db0 100644 > --- a/net/ipv4/raw.c > +++ b/net/ipv4/raw.c > @@ -358,7 +358,7 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4, > skb_reserve(skb, hlen); > > skb->protocol = htons(ETH_P_IP); > - skb->priority = READ_ONCE(sk->sk_priority); > + skb->priority = sockc->priority; This has the side effect that raw_send_hdrinc will now interpret cmsg IP_TOS, where it previously did not (as only sockcm_cookie was passed, not all of ipcm_cookie). This is an improvement, but good to make explicit. ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender 2024-11-07 13:22 [PATCH net-next v3 0/3] SO_PRIORITY cmsg patch summary Anna Emese Nyiri 2024-11-07 13:22 ` [PATCH net-next v3 1/3] net: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri 2024-11-07 13:22 ` [PATCH net-next v3 2/3] net: support SO_PRIORITY cmsg Anna Emese Nyiri @ 2024-11-07 13:22 ` Anna Emese Nyiri 2024-11-07 17:22 ` Willem de Bruijn 2 siblings, 1 reply; 13+ messages in thread From: Anna Emese Nyiri @ 2024-11-07 13:22 UTC (permalink / raw) To: netdev Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemdebruijn.kernel, Ido Schimmel Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY ancillary data. Add the `cmsg_so_priority.sh` script, which sets up two network namespaces (red and green) and uses the `cmsg_sender.c` program to send messages between them. The script sets packet priorities both via `setsockopt` and control messages (cmsg) and verifies whether packets are routed to the same queue based on priority settings. Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> --- tools/testing/selftests/net/cmsg_sender.c | 11 +- .../testing/selftests/net/cmsg_so_priority.sh | 115 ++++++++++++++++++ 2 files changed, 125 insertions(+), 1 deletion(-) create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh diff --git a/tools/testing/selftests/net/cmsg_sender.c b/tools/testing/selftests/net/cmsg_sender.c index 876c2db02a63..6fbe93dd63d2 100644 --- a/tools/testing/selftests/net/cmsg_sender.c +++ b/tools/testing/selftests/net/cmsg_sender.c @@ -52,6 +52,7 @@ struct options { unsigned int tclass; unsigned int hlimit; unsigned int priority; + unsigned int priority_cmsg; } sockopt; struct { unsigned int family; @@ -59,6 +60,7 @@ struct options { unsigned int proto; } sock; struct option_cmsg_u32 mark; + struct option_cmsg_u32 priority_cmsg; struct { bool ena; unsigned int delay; @@ -97,6 +99,7 @@ static void __attribute__((noreturn)) cs_usage(const char *bin) "\n" "\t\t-m val Set SO_MARK with given value\n" "\t\t-M val Set SO_MARK via setsockopt\n" + "\t\t-Q val Set SO_PRIORITY via cmsg\n" "\t\t-d val Set SO_TXTIME with given delay (usec)\n" "\t\t-t Enable time stamp reporting\n" "\t\t-f val Set don't fragment via cmsg\n" @@ -115,7 +118,7 @@ static void cs_parse_args(int argc, char *argv[]) { int o; - while ((o = getopt(argc, argv, "46sS:p:P:m:M:n:d:tf:F:c:C:l:L:H:")) != -1) { + while ((o = getopt(argc, argv, "46sS:p:P:m:M:n:d:tf:F:c:C:l:L:H:Q:")) != -1) { switch (o) { case 's': opt.silent_send = true; @@ -148,6 +151,10 @@ static void cs_parse_args(int argc, char *argv[]) opt.mark.ena = true; opt.mark.val = atoi(optarg); break; + case 'Q': + opt.priority_cmsg.ena = true; + opt.priority_cmsg.val = atoi(optarg); + break; case 'M': opt.sockopt.mark = atoi(optarg); break; @@ -252,6 +259,8 @@ cs_write_cmsg(int fd, struct msghdr *msg, char *cbuf, size_t cbuf_sz) ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len, SOL_SOCKET, SO_MARK, &opt.mark); + ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len, + SOL_SOCKET, SO_PRIORITY, &opt.priority_cmsg); ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len, SOL_IPV6, IPV6_DONTFRAG, &opt.v6.dontfrag); ca_write_cmsg_u32(cbuf, cbuf_sz, &cmsg_len, diff --git a/tools/testing/selftests/net/cmsg_so_priority.sh b/tools/testing/selftests/net/cmsg_so_priority.sh new file mode 100755 index 000000000000..706d29b251e9 --- /dev/null +++ b/tools/testing/selftests/net/cmsg_so_priority.sh @@ -0,0 +1,115 @@ +#!/bin/bash + +source lib.sh + +IP4=192.168.0.2/16 +TGT4=192.168.0.3/16 +TGT4_NO_MASK=192.168.0.3 +IP6=2001:db8::2/64 +TGT6=2001:db8::3/64 +TGT6_NO_MASK=2001:db8::3 +IP4BR=192.168.0.1/16 +IP6BR=2001:db8::1/64 +PORT=8080 +DELAY=400000 +QUEUE_NUM=4 + + +cleanup() { + ip netns del red 2>/dev/null + ip netns del green 2>/dev/null + ip link del br0 2>/dev/null + ip link del vethcab0 2>/dev/null + ip link del vethcab1 2>/dev/null +} + +trap cleanup EXIT + +priority_values=($(seq 0 $((QUEUE_NUM - 1)))) + +queue_config="" +for ((i=0; i<$QUEUE_NUM; i++)); do + queue_config+=" 1@$i" +done + +map_config=$(seq 0 $((QUEUE_NUM - 1)) | tr '\n' ' ') + +ip netns add red +ip netns add green +ip link add br0 type bridge +ip link set br0 up +ip addr add $IP4BR dev br0 +ip addr add $IP6BR dev br0 + +ip link add vethcab0 type veth peer name red0 +ip link set vethcab0 master br0 +ip link set red0 netns red +ip netns exec red bash -c " +ip link set lo up +ip link set red0 up +ip addr add $IP4 dev red0 +ip addr add $IP6 dev red0 +sysctl -w net.ipv4.ping_group_range='0 2147483647' +exit" +ip link set vethcab0 up + +ip link add vethcab1 type veth peer name green0 +ip link set vethcab1 master br0 +ip link set green0 netns green +ip netns exec green bash -c " +ip link set lo up +ip link set green0 up +ip addr add $TGT4 dev green0 +ip addr add $TGT6 dev green0 +exit" +ip link set vethcab1 up + +ip netns exec red bash -c " +sudo ethtool -L red0 tx $QUEUE_NUM rx $QUEUE_NUM +sudo tc qdisc add dev red0 root mqprio num_tc $QUEUE_NUM queues $queue_config map $map_config hw 0 +exit" + +get_queue_bytes() { + ip netns exec red sudo tc -s qdisc show dev red0 | grep 'Sent' | awk '{print $2}' +} + +TOTAL_TESTS=0 +FAILED_TESTS=0 + +check_result() { + ((TOTAL_TESTS++)) + if [ "$1" -ne 0 ]; then + ((FAILED_TESTS++)) + fi +} + + +for i in 4 6; do + [ $i == 4 ] && TGT=$TGT4_NO_MASK || TGT=$TGT6_NO_MASK + + for p in u i r; do + echo "Test IPV$i, prot: $p" + for value in "${priority_values[@]}"; do + ip netns exec red ./cmsg_sender -$i -Q $value -d "${DELAY}" -p $p $TGT $PORT + setsockopt_priority_bytes_num=($(get_queue_bytes)) + + ip netns exec red ./cmsg_sender -$i -P $value -d "${DELAY}" -p $p $TGT $PORT + cmsg_priority_bytes_num=($(get_queue_bytes)) + + if [[ "${cmsg_priority_bytes_num[$actual_queue]}" != \ + "${setsockopt_priority_bytes_num[$actual_queue]}" ]]; then + check_result 0 + else + check_result 1 + fi + done + done +done + +if [ $FAILED_TESTS -ne 0 ]; then + echo "FAIL - $FAILED_TESTS/$TOTAL_TESTS tests failed" + exit 1 +else + echo "OK - All $TOTAL_TESTS tests passed" + exit 0 +fi -- 2.43.0 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender 2024-11-07 13:22 ` [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri @ 2024-11-07 17:22 ` Willem de Bruijn 2024-11-08 14:46 ` Willem de Bruijn 0 siblings, 1 reply; 13+ messages in thread From: Willem de Bruijn @ 2024-11-07 17:22 UTC (permalink / raw) To: Anna Emese Nyiri, netdev Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemdebruijn.kernel, Ido Schimmel Anna Emese Nyiri wrote: > Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY > ancillary data. > > Add the `cmsg_so_priority.sh` script, which sets up two network > namespaces (red and green) and uses the `cmsg_sender.c` program to > send messages between them. The script sets packet priorities both > via `setsockopt` and control messages (cmsg) and verifies whether > packets are routed to the same queue based on priority settings. qdisc. queue is a more generic term, generally referring to hw queues. > Suggested-by: Ido Schimmel <idosch@idosch.org> > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> > --- > tools/testing/selftests/net/cmsg_sender.c | 11 +- > .../testing/selftests/net/cmsg_so_priority.sh | 115 ++++++++++++++++++ > 2 files changed, 125 insertions(+), 1 deletion(-) > create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh > diff --git a/tools/testing/selftests/net/cmsg_so_priority.sh b/tools/testing/selftests/net/cmsg_so_priority.sh > new file mode 100755 > index 000000000000..706d29b251e9 > --- /dev/null > +++ b/tools/testing/selftests/net/cmsg_so_priority.sh > @@ -0,0 +1,115 @@ > +#!/bin/bash SPDX header > + > +source lib.sh > + > +IP4=192.168.0.2/16 > +TGT4=192.168.0.3/16 > +TGT4_NO_MASK=192.168.0.3 > +IP6=2001:db8::2/64 > +TGT6=2001:db8::3/64 > +TGT6_NO_MASK=2001:db8::3 > +IP4BR=192.168.0.1/16 > +IP6BR=2001:db8::1/64 > +PORT=8080 > +DELAY=400000 > +QUEUE_NUM=4 > + > + > +cleanup() { > + ip netns del red 2>/dev/null > + ip netns del green 2>/dev/null > + ip link del br0 2>/dev/null > + ip link del vethcab0 2>/dev/null > + ip link del vethcab1 2>/dev/null > +} > + > +trap cleanup EXIT > + > +priority_values=($(seq 0 $((QUEUE_NUM - 1)))) > + > +queue_config="" > +for ((i=0; i<$QUEUE_NUM; i++)); do > + queue_config+=" 1@$i" > +done > + > +map_config=$(seq 0 $((QUEUE_NUM - 1)) | tr '\n' ' ') > + > +ip netns add red > +ip netns add green > +ip link add br0 type bridge > +ip link set br0 up Is this bridge needed? Can this just use a veth pair as is. More importantly, it would be great if we can deduplicate this kind of setup boilerplate across similar tests as much as possible. > +ip addr add $IP4BR dev br0 > +ip addr add $IP6BR dev br0 > + > +ip link add vethcab0 type veth peer name red0 > +ip link set vethcab0 master br0 > +ip link set red0 netns red > +ip netns exec red bash -c " > +ip link set lo up > +ip link set red0 up > +ip addr add $IP4 dev red0 > +ip addr add $IP6 dev red0 > +sysctl -w net.ipv4.ping_group_range='0 2147483647' > +exit" > +ip link set vethcab0 up > + > +ip link add vethcab1 type veth peer name green0 > +ip link set vethcab1 master br0 > +ip link set green0 netns green > +ip netns exec green bash -c " > +ip link set lo up > +ip link set green0 up > +ip addr add $TGT4 dev green0 > +ip addr add $TGT6 dev green0 > +exit" > +ip link set vethcab1 up > + > +ip netns exec red bash -c " > +sudo ethtool -L red0 tx $QUEUE_NUM rx $QUEUE_NUM > +sudo tc qdisc add dev red0 root mqprio num_tc $QUEUE_NUM queues $queue_config map $map_config hw 0 > +exit" > + > +get_queue_bytes() { > + ip netns exec red sudo tc -s qdisc show dev red0 | grep 'Sent' | awk '{print $2}' > +} > + > +TOTAL_TESTS=0 > +FAILED_TESTS=0 > + > +check_result() { > + ((TOTAL_TESTS++)) > + if [ "$1" -ne 0 ]; then > + ((FAILED_TESTS++)) > + fi > +} > + > + > +for i in 4 6; do > + [ $i == 4 ] && TGT=$TGT4_NO_MASK || TGT=$TGT6_NO_MASK > + > + for p in u i r; do > + echo "Test IPV$i, prot: $p" > + for value in "${priority_values[@]}"; do > + ip netns exec red ./cmsg_sender -$i -Q $value -d "${DELAY}" -p $p $TGT $PORT > + setsockopt_priority_bytes_num=($(get_queue_bytes)) > + > + ip netns exec red ./cmsg_sender -$i -P $value -d "${DELAY}" -p $p $TGT $PORT > + cmsg_priority_bytes_num=($(get_queue_bytes)) > + > + if [[ "${cmsg_priority_bytes_num[$actual_queue]}" != \ > + "${setsockopt_priority_bytes_num[$actual_queue]}" ]]; then > + check_result 0 > + else > + check_result 1 > + fi > + done > + done > +done > + > +if [ $FAILED_TESTS -ne 0 ]; then > + echo "FAIL - $FAILED_TESTS/$TOTAL_TESTS tests failed" > + exit 1 > +else > + echo "OK - All $TOTAL_TESTS tests passed" > + exit 0 > +fi > -- > 2.43.0 > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender 2024-11-07 17:22 ` Willem de Bruijn @ 2024-11-08 14:46 ` Willem de Bruijn 2024-11-09 18:54 ` Ferenc Fejes 0 siblings, 1 reply; 13+ messages in thread From: Willem de Bruijn @ 2024-11-08 14:46 UTC (permalink / raw) To: Willem de Bruijn, Anna Emese Nyiri, netdev Cc: fejes, annaemesenyiri, edumazet, kuba, pabeni, willemdebruijn.kernel, Ido Schimmel Willem de Bruijn wrote: > Anna Emese Nyiri wrote: > > Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY > > ancillary data. > > > > Add the `cmsg_so_priority.sh` script, which sets up two network > > namespaces (red and green) and uses the `cmsg_sender.c` program to > > send messages between them. The script sets packet priorities both > > via `setsockopt` and control messages (cmsg) and verifies whether > > packets are routed to the same queue based on priority settings. > > qdisc. queue is a more generic term, generally referring to hw queues. > > > Suggested-by: Ido Schimmel <idosch@idosch.org> > > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> > > --- > > tools/testing/selftests/net/cmsg_sender.c | 11 +- > > .../testing/selftests/net/cmsg_so_priority.sh | 115 ++++++++++++++++++ > > 2 files changed, 125 insertions(+), 1 deletion(-) > > create mode 100755 tools/testing/selftests/net/cmsg_so_priority.sh > > > diff --git a/tools/testing/selftests/net/cmsg_so_priority.sh b/tools/testing/selftests/net/cmsg_so_priority.sh > > new file mode 100755 > > index 000000000000..706d29b251e9 > > --- /dev/null > > +++ b/tools/testing/selftests/net/cmsg_so_priority.sh > > @@ -0,0 +1,115 @@ > > +#!/bin/bash > > SPDX header > > > + > > +source lib.sh > > + > > +IP4=192.168.0.2/16 > > +TGT4=192.168.0.3/16 > > +TGT4_NO_MASK=192.168.0.3 > > +IP6=2001:db8::2/64 > > +TGT6=2001:db8::3/64 > > +TGT6_NO_MASK=2001:db8::3 > > +IP4BR=192.168.0.1/16 > > +IP6BR=2001:db8::1/64 > > +PORT=8080 > > +DELAY=400000 > > +QUEUE_NUM=4 > > + > > + > > +cleanup() { > > + ip netns del red 2>/dev/null > > + ip netns del green 2>/dev/null > > + ip link del br0 2>/dev/null > > + ip link del vethcab0 2>/dev/null > > + ip link del vethcab1 2>/dev/null > > +} > > + > > +trap cleanup EXIT > > + > > +priority_values=($(seq 0 $((QUEUE_NUM - 1)))) > > + > > +queue_config="" > > +for ((i=0; i<$QUEUE_NUM; i++)); do > > + queue_config+=" 1@$i" > > +done > > + > > +map_config=$(seq 0 $((QUEUE_NUM - 1)) | tr '\n' ' ') > > + > > +ip netns add red > > +ip netns add green > > +ip link add br0 type bridge > > +ip link set br0 up > > Is this bridge needed? Can this just use a veth pair as is. > > More importantly, it would be great if we can deduplicate this kind of > setup boilerplate across similar tests as much as possible. As a matter of fact, similar to cmsg_so_mark, this test can probably use a dummy netdevice, no need for a second namespace and dev. cmsg_so_mark.sh is probably small enough that it is fine to copy that and create a duplicate. As trying to extend it to cover both tests will probably double it in size and will just be harder to follow. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender 2024-11-08 14:46 ` Willem de Bruijn @ 2024-11-09 18:54 ` Ferenc Fejes 2024-11-09 19:56 ` Willem de Bruijn 0 siblings, 1 reply; 13+ messages in thread From: Ferenc Fejes @ 2024-11-09 18:54 UTC (permalink / raw) To: Willem de Bruijn, Anna Emese Nyiri, netdev Cc: edumazet, kuba, pabeni, Ido Schimmel On Fri, 2024-11-08 at 09:46 -0500, Willem de Bruijn wrote: > Willem de Bruijn wrote: > > Anna Emese Nyiri wrote: > > > Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY > > > ancillary data. > > > > > > Add the `cmsg_so_priority.sh` script, which sets up two network > > > namespaces (red and green) and uses the `cmsg_sender.c` program > > > to > > > send messages between them. The script sets packet priorities > > > both > > > via `setsockopt` and control messages (cmsg) and verifies > > > whether > > > packets are routed to the same queue based on priority settings. > > > > qdisc. queue is a more generic term, generally referring to hw > > queues. > > > > > Suggested-by: Ido Schimmel <idosch@idosch.org> > > > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> > > > --- > > > tools/testing/selftests/net/cmsg_sender.c | 11 +- > > > .../testing/selftests/net/cmsg_so_priority.sh | 115 > > > ++++++++++++++++++ > > > 2 files changed, 125 insertions(+), 1 deletion(-) > > > create mode 100755 > > > tools/testing/selftests/net/cmsg_so_priority.sh > > > > > diff --git a/tools/testing/selftests/net/cmsg_so_priority.sh > > > b/tools/testing/selftests/net/cmsg_so_priority.sh > > > new file mode 100755 > > > index 000000000000..706d29b251e9 > > > --- /dev/null > > > +++ b/tools/testing/selftests/net/cmsg_so_priority.sh > > > @@ -0,0 +1,115 @@ > > > +#!/bin/bash > > > > SPDX header > > > > > + > > > +source lib.sh > > > + > > > +IP4=192.168.0.2/16 > > > +TGT4=192.168.0.3/16 > > > +TGT4_NO_MASK=192.168.0.3 > > > +IP6=2001:db8::2/64 > > > +TGT6=2001:db8::3/64 > > > +TGT6_NO_MASK=2001:db8::3 > > > +IP4BR=192.168.0.1/16 > > > +IP6BR=2001:db8::1/64 > > > +PORT=8080 > > > +DELAY=400000 > > > +QUEUE_NUM=4 > > > + > > > + > > > +cleanup() { > > > + ip netns del red 2>/dev/null > > > + ip netns del green 2>/dev/null > > > + ip link del br0 2>/dev/null > > > + ip link del vethcab0 2>/dev/null > > > + ip link del vethcab1 2>/dev/null > > > +} > > > + > > > +trap cleanup EXIT > > > + > > > +priority_values=($(seq 0 $((QUEUE_NUM - 1)))) > > > + > > > +queue_config="" > > > +for ((i=0; i<$QUEUE_NUM; i++)); do > > > + queue_config+=" 1@$i" > > > +done > > > + > > > +map_config=$(seq 0 $((QUEUE_NUM - 1)) | tr '\n' ' ') > > > + > > > +ip netns add red > > > +ip netns add green > > > +ip link add br0 type bridge > > > +ip link set br0 up > > > > Is this bridge needed? Can this just use a veth pair as is. > > > > More importantly, it would be great if we can deduplicate this kind > > of > > setup boilerplate across similar tests as much as possible. > > As a matter of fact, similar to cmsg_so_mark, this test can probably > use a dummy netdevice, no need for a second namespace and dev. > > cmsg_so_mark.sh is probably small enough that it is fine to copy that > and create a duplicate. As trying to extend it to cover both tests > will probably double it in size and will just be harder to follow. I'm afraid we don't have "ip rule" match argument for skb->priority like we have for skb->mark (ip rule fwmark). AFAIU cmsg_so_mark.sh uses rule matches to confirm if skb->mark is correct or not. Using nftables meta priority would work with a dummy device: add table so_prio add chain so_prio so_prio_chain { type filter hook output priority 0; } add rule so_prio so_prio_chain meta priority $SOPRIOVALUE counter Is there anything simpler? I am afraid we cannot use nftables in selftests, or can we? Thanks! ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender 2024-11-09 18:54 ` Ferenc Fejes @ 2024-11-09 19:56 ` Willem de Bruijn 2024-11-10 13:27 ` Ido Schimmel 0 siblings, 1 reply; 13+ messages in thread From: Willem de Bruijn @ 2024-11-09 19:56 UTC (permalink / raw) To: Ferenc Fejes, Willem de Bruijn, Anna Emese Nyiri, netdev Cc: edumazet, kuba, pabeni, Ido Schimmel Ferenc Fejes wrote: > On Fri, 2024-11-08 at 09:46 -0500, Willem de Bruijn wrote: > > Willem de Bruijn wrote: > > > Anna Emese Nyiri wrote: > > > > Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY > > > > ancillary data. > > > > > > > > Add the `cmsg_so_priority.sh` script, which sets up two network > > > > namespaces (red and green) and uses the `cmsg_sender.c` program > > > > to > > > > send messages between them. The script sets packet priorities > > > > both > > > > via `setsockopt` and control messages (cmsg) and verifies > > > > whether > > > > packets are routed to the same queue based on priority settings. > > > > > > qdisc. queue is a more generic term, generally referring to hw > > > queues. > > > > > > > Suggested-by: Ido Schimmel <idosch@idosch.org> > > > > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> > > > > --- > > > > tools/testing/selftests/net/cmsg_sender.c | 11 +- > > > > .../testing/selftests/net/cmsg_so_priority.sh | 115 > > > > ++++++++++++++++++ > > > > 2 files changed, 125 insertions(+), 1 deletion(-) > > > > create mode 100755 > > > > tools/testing/selftests/net/cmsg_so_priority.sh > > > > > > > diff --git a/tools/testing/selftests/net/cmsg_so_priority.sh > > > > b/tools/testing/selftests/net/cmsg_so_priority.sh > > > > new file mode 100755 > > > > index 000000000000..706d29b251e9 > > > > --- /dev/null > > > > +++ b/tools/testing/selftests/net/cmsg_so_priority.sh > > > > @@ -0,0 +1,115 @@ > > > > +#!/bin/bash > > > > > > SPDX header > > > > > > > + > > > > +source lib.sh > > > > + > > > > +IP4=192.168.0.2/16 > > > > +TGT4=192.168.0.3/16 > > > > +TGT4_NO_MASK=192.168.0.3 > > > > +IP6=2001:db8::2/64 > > > > +TGT6=2001:db8::3/64 > > > > +TGT6_NO_MASK=2001:db8::3 > > > > +IP4BR=192.168.0.1/16 > > > > +IP6BR=2001:db8::1/64 > > > > +PORT=8080 > > > > +DELAY=400000 > > > > +QUEUE_NUM=4 > > > > + > > > > + > > > > +cleanup() { > > > > + ip netns del red 2>/dev/null > > > > + ip netns del green 2>/dev/null > > > > + ip link del br0 2>/dev/null > > > > + ip link del vethcab0 2>/dev/null > > > > + ip link del vethcab1 2>/dev/null > > > > +} > > > > + > > > > +trap cleanup EXIT > > > > + > > > > +priority_values=($(seq 0 $((QUEUE_NUM - 1)))) > > > > + > > > > +queue_config="" > > > > +for ((i=0; i<$QUEUE_NUM; i++)); do > > > > + queue_config+=" 1@$i" > > > > +done > > > > + > > > > +map_config=$(seq 0 $((QUEUE_NUM - 1)) | tr '\n' ' ') > > > > + > > > > +ip netns add red > > > > +ip netns add green > > > > +ip link add br0 type bridge > > > > +ip link set br0 up > > > > > > Is this bridge needed? Can this just use a veth pair as is. > > > > > > More importantly, it would be great if we can deduplicate this kind > > > of > > > setup boilerplate across similar tests as much as possible. > > > > As a matter of fact, similar to cmsg_so_mark, this test can probably > > use a dummy netdevice, no need for a second namespace and dev. > > > > cmsg_so_mark.sh is probably small enough that it is fine to copy that > > and create a duplicate. As trying to extend it to cover both tests > > will probably double it in size and will just be harder to follow. > > I'm afraid we don't have "ip rule" match argument for skb->priority > like we have for skb->mark (ip rule fwmark). AFAIU cmsg_so_mark.sh uses > rule matches to confirm if skb->mark is correct or not. > > Using nftables meta priority would work with a dummy device: > add table so_prio > add chain so_prio so_prio_chain { type filter hook output priority 0; } > add rule so_prio so_prio_chain meta priority $SOPRIOVALUE counter > > Is there anything simpler? I am afraid we cannot use nftables in > selftests, or can we? Thanks! I'd use traffic shaper (tc). There are a variety of qdiscs/schedulers and classifiers that act on skb->priority. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender 2024-11-09 19:56 ` Willem de Bruijn @ 2024-11-10 13:27 ` Ido Schimmel 0 siblings, 0 replies; 13+ messages in thread From: Ido Schimmel @ 2024-11-10 13:27 UTC (permalink / raw) To: Willem de Bruijn Cc: Ferenc Fejes, Anna Emese Nyiri, netdev, edumazet, kuba, pabeni On Sat, Nov 09, 2024 at 02:56:29PM -0500, Willem de Bruijn wrote: > Ferenc Fejes wrote: > > On Fri, 2024-11-08 at 09:46 -0500, Willem de Bruijn wrote: > > > Willem de Bruijn wrote: > > > > Anna Emese Nyiri wrote: > > > > > Extend cmsg_sender.c with a new option '-Q' to send SO_PRIORITY > > > > > ancillary data. > > > > > > > > > > Add the `cmsg_so_priority.sh` script, which sets up two network > > > > > namespaces (red and green) and uses the `cmsg_sender.c` program > > > > > to > > > > > send messages between them. The script sets packet priorities > > > > > both > > > > > via `setsockopt` and control messages (cmsg) and verifies > > > > > whether > > > > > packets are routed to the same queue based on priority settings. > > > > > > > > qdisc. queue is a more generic term, generally referring to hw > > > > queues. > > > > > > > > > Suggested-by: Ido Schimmel <idosch@idosch.org> > > > > > Signed-off-by: Anna Emese Nyiri <annaemesenyiri@gmail.com> > > > > > --- > > > > > tools/testing/selftests/net/cmsg_sender.c | 11 +- > > > > > .../testing/selftests/net/cmsg_so_priority.sh | 115 > > > > > ++++++++++++++++++ > > > > > 2 files changed, 125 insertions(+), 1 deletion(-) > > > > > create mode 100755 > > > > > tools/testing/selftests/net/cmsg_so_priority.sh > > > > > > > > > diff --git a/tools/testing/selftests/net/cmsg_so_priority.sh > > > > > b/tools/testing/selftests/net/cmsg_so_priority.sh > > > > > new file mode 100755 > > > > > index 000000000000..706d29b251e9 > > > > > --- /dev/null > > > > > +++ b/tools/testing/selftests/net/cmsg_so_priority.sh > > > > > @@ -0,0 +1,115 @@ > > > > > +#!/bin/bash > > > > > > > > SPDX header > > > > > > > > > + > > > > > +source lib.sh > > > > > + > > > > > +IP4=192.168.0.2/16 > > > > > +TGT4=192.168.0.3/16 > > > > > +TGT4_NO_MASK=192.168.0.3 > > > > > +IP6=2001:db8::2/64 > > > > > +TGT6=2001:db8::3/64 > > > > > +TGT6_NO_MASK=2001:db8::3 > > > > > +IP4BR=192.168.0.1/16 > > > > > +IP6BR=2001:db8::1/64 > > > > > +PORT=8080 > > > > > +DELAY=400000 > > > > > +QUEUE_NUM=4 > > > > > + > > > > > + > > > > > +cleanup() { > > > > > + ip netns del red 2>/dev/null > > > > > + ip netns del green 2>/dev/null > > > > > + ip link del br0 2>/dev/null > > > > > + ip link del vethcab0 2>/dev/null > > > > > + ip link del vethcab1 2>/dev/null > > > > > +} > > > > > + > > > > > +trap cleanup EXIT > > > > > + > > > > > +priority_values=($(seq 0 $((QUEUE_NUM - 1)))) > > > > > + > > > > > +queue_config="" > > > > > +for ((i=0; i<$QUEUE_NUM; i++)); do > > > > > + queue_config+=" 1@$i" > > > > > +done > > > > > + > > > > > +map_config=$(seq 0 $((QUEUE_NUM - 1)) | tr '\n' ' ') > > > > > + > > > > > +ip netns add red > > > > > +ip netns add green > > > > > +ip link add br0 type bridge > > > > > +ip link set br0 up > > > > > > > > Is this bridge needed? Can this just use a veth pair as is. > > > > > > > > More importantly, it would be great if we can deduplicate this kind > > > > of > > > > setup boilerplate across similar tests as much as possible. > > > > > > As a matter of fact, similar to cmsg_so_mark, this test can probably > > > use a dummy netdevice, no need for a second namespace and dev. > > > > > > cmsg_so_mark.sh is probably small enough that it is fine to copy that > > > and create a duplicate. As trying to extend it to cover both tests > > > will probably double it in size and will just be harder to follow. > > > > I'm afraid we don't have "ip rule" match argument for skb->priority > > like we have for skb->mark (ip rule fwmark). AFAIU cmsg_so_mark.sh uses > > rule matches to confirm if skb->mark is correct or not. > > > > Using nftables meta priority would work with a dummy device: > > add table so_prio > > add chain so_prio so_prio_chain { type filter hook output priority 0; } > > add rule so_prio so_prio_chain meta priority $SOPRIOVALUE counter > > > > Is there anything simpler? I am afraid we cannot use nftables in > > selftests, or can we? Thanks! > > I'd use traffic shaper (tc). There are a variety of qdiscs/schedulers > and classifiers that act on skb->priority. When I initially suggested the test I was thinking about creating a VLAN device with egress QoS map and then matching on VLAN priority with flower. Something like [1]. It is just a quick hack. Proper test should test with all combinations of IPv4 / IPv6 / UDP / ICMP / RAW / cmsg / sockopt (like cmsg_so_mark.sh). [1] #!/bin/bash ip netns del ns1 &> /dev/null ip netns add ns1 ip -n ns1 link set dev lo up ip -n ns1 link add name dummy1 up type dummy ip -n ns1 link add link dummy1 name dummy1.10 up type vlan id 10 \ egress-qos-map 0:0 1:1 2:2 3:3 4:4 5:5 6:6 7:7 ip -n ns1 address add 192.0.2.1/24 dev dummy1.10 ip -n ns1 neigh add 192.0.2.2 lladdr 00:11:22:33:44:55 nud permanent dev \ dummy1.10 tc -n ns1 qdisc add dev dummy1 clsact for i in {0..7}; do tc -n ns1 filter add dev dummy1 egress pref 1 handle 10${i} \ proto 802.1q flower vlan_prio $i vlan_ethtype ipv4 \ ip_proto udp dst_ip 192.0.2.2 action drop done for i in {0..7}; do pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \ | jq ".[] | select(.options.handle == 10$i) | \ .options.actions[0].stats.packets") [[ $pkts == 0 ]] || echo "prio $i: expected 0, got $pkts" ip netns exec ns1 ./cmsg_sender -4 -p udp -P $i 192.0.2.2 1234 pkts=$(tc -n ns1 -j -s filter show dev dummy1 egress \ | jq ".[] | select(.options.handle == 10$i) | \ .options.actions[0].stats.packets") [[ $pkts == 1 ]] || echo "prio $i: expected 1, got $pkts" done ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2024-11-10 13:27 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-11-07 13:22 [PATCH net-next v3 0/3] SO_PRIORITY cmsg patch summary Anna Emese Nyiri 2024-11-07 13:22 ` [PATCH net-next v3 1/3] net: Introduce sk_set_prio_allowed helper function Anna Emese Nyiri 2024-11-07 13:48 ` Eric Dumazet 2024-11-07 15:55 ` Willem de Bruijn 2024-11-07 13:22 ` [PATCH net-next v3 2/3] net: support SO_PRIORITY cmsg Anna Emese Nyiri 2024-11-07 13:50 ` Eric Dumazet 2024-11-07 16:17 ` Willem de Bruijn 2024-11-07 13:22 ` [PATCH net-next v3 3/3] selftest: net: test SO_PRIORITY ancillary data with cmsg_sender Anna Emese Nyiri 2024-11-07 17:22 ` Willem de Bruijn 2024-11-08 14:46 ` Willem de Bruijn 2024-11-09 18:54 ` Ferenc Fejes 2024-11-09 19:56 ` Willem de Bruijn 2024-11-10 13:27 ` Ido Schimmel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).