BPF List
 help / color / mirror / Atom feed
* [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
@ 2023-04-13  2:53 Yafang Shao
  2023-04-13 11:47 ` Daniel Borkmann
  0 siblings, 1 reply; 14+ messages in thread
From: Yafang Shao @ 2023-04-13  2:53 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend
  Cc: netdev, bpf, Yafang Shao, Jesper Dangaard Brouer, Tonghao Zhang

In our container environment, we are using EDT-bpf to limit the egress
bandwidth. EDT-bpf can be used to limit egress only, but can't be used
to limit ingress. Some of our users also want to limit the ingress
bandwidth. But after applying EDT-bpf, which is based on clsact qdisc,
it is impossible to limit the ingress bandwidth currently, due to some
reasons,
1). We can't add ingress qdisc
The ingress qdisc can't coexist with clsact qdisc as clsact has both
ingress and egress handler. So our traditional method to limit ingress
bandwidth can't work any more.
2). We can't redirect ingress packet to ifb with bpf
By trying to analyze if it is possible to redirect the ingress packet to
ifb with a bpf program, we find that the ifb device is not supported by
bpf redirect yet.

This patch tries to resolve it by supporting redirecting to ifb with bpf
program.

Ingress bandwidth limit is useful in some scenarios, for example, for the
TCP-based service, there may be lots of clients connecting it, so it is
not wise to limit the clients' egress. After limiting the server-side's
ingress, it will lower the send rate of the client by lowering the TCP
cwnd if the ingress bandwidth limit is reached. If we don't limit it,
the clients will continue sending requests at a high rate.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
 net/core/dev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index c785319..da6b196 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3956,6 +3956,7 @@ int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)
 		return NULL;
 	case TC_ACT_REDIRECT:
 		/* No need to push/pop skb's mac_header here on egress! */
+		skb_set_redirected(skb, skb->tc_at_ingress);
 		skb_do_redirect(skb);
 		*ret = NET_XMIT_SUCCESS;
 		return NULL;
@@ -5138,6 +5139,7 @@ static __latent_entropy void net_tx_action(struct softirq_action *h)
 		 * redirecting to another netdev
 		 */
 		__skb_push(skb, skb->mac_len);
+		skb_set_redirected(skb, skb->tc_at_ingress);
 		if (skb_do_redirect(skb) == -EAGAIN) {
 			__skb_pull(skb, skb->mac_len);
 			*another = true;
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread
* [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
@ 2023-04-17 13:49 Daniel Borkmann
  2023-04-17 20:30 ` patchwork-bot+netdevbpf
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2023-04-17 13:49 UTC (permalink / raw)
  To: bpf
  Cc: john.fastabend, Daniel Borkmann, Toke Høiland-Jørgensen,
	Yafang Shao, Tonghao Zhang

There are some use-cases where it is desirable to use bpf_redirect()
in combination with ifb device, which currently is not supported, for
example, around filtering inbound traffic with BPF to then push it to
ifb which holds the qdisc for shaping in contrast to doing that on the
egress device.

Toke mentions the following case related to OpenWrt:

   Because there's not always a single egress on the other side. These are
   mainly home routers, which tend to have one or more WiFi devices bridged
   to one or more ethernet ports on the LAN side, and a single upstream WAN
   port. And the objective is to control the total amount of traffic going
   over the WAN link (in both directions), to deal with bufferbloat in the
   ISP network (which is sadly still all too prevalent).

   In this setup, the traffic can be split arbitrarily between the links
   on the LAN side, and the only "single bottleneck" is the WAN link. So we
   install both egress and ingress shapers on this, configured to something
   like 95-98% of the true link bandwidth, thus moving the queues into the
   qdisc layer in the router. It's usually necessary to set the ingress
   bandwidth shaper a bit lower than the egress due to being "downstream"
   of the bottleneck link, but it does work surprisingly well.

   We usually use something like a matchall filter to put all ingress
   traffic on the ifb, so doing the redirect from BPF has not been an
   immediate requirement thus far. However, it does seem a bit odd that
   this is not possible, and we do have a BPF-based filter that layers on
   top of this kind of setup, which currently uses u32 as the ingress
   filter and so it could presumably be improved to use BPF instead if
   that was available.

Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reported-by: Yafang Shao <laoar.shao@gmail.com>
Reported-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README
Link: https://lore.kernel.org/bpf/875y9yzbuy.fsf@toke.dk
---
 include/linux/skbuff.h | 9 +++++++++
 net/core/filter.c      | 1 +
 2 files changed, 10 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 494a23a976b0..9ff2e3d57329 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -5041,6 +5041,15 @@ static inline void skb_reset_redirect(struct sk_buff *skb)
 	skb->redirected = 0;
 }
 
+static inline void skb_set_redirected_noclear(struct sk_buff *skb,
+					      bool from_ingress)
+{
+	skb->redirected = 1;
+#ifdef CONFIG_NET_REDIRECT
+	skb->from_ingress = from_ingress;
+#endif
+}
+
 static inline bool skb_csum_is_sctp(struct sk_buff *skb)
 {
 	return skb->csum_not_inet;
diff --git a/net/core/filter.c b/net/core/filter.c
index df0df59814ae..44fb997434ad 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2122,6 +2122,7 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
 	}
 
 	skb->dev = dev;
+	skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb));
 	skb_clear_tstamp(skb);
 
 	dev_xmit_recursion_inc();
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-04-17 20:30 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-13  2:53 [PATCH net-next] bpf, net: Support redirecting to ifb with bpf Yafang Shao
2023-04-13 11:47 ` Daniel Borkmann
2023-04-13 14:20   ` Yafang Shao
2023-04-13 14:43   ` Toke Høiland-Jørgensen
2023-04-14  9:34     ` Daniel Borkmann
2023-04-14 16:07       ` Toke Høiland-Jørgensen
2023-04-14 22:57         ` Daniel Borkmann
2023-04-15  0:43           ` [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb kernel test robot
2023-04-15  1:04           ` kernel test robot
2023-04-15 13:38           ` [PATCH net-next] bpf, net: Support redirecting to ifb with bpf Yafang Shao
2023-04-15 14:00           ` Toke Høiland-Jørgensen
2023-04-17 13:50             ` Daniel Borkmann
  -- strict thread matches above, loose matches on Subject: below --
2023-04-17 13:49 [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb Daniel Borkmann
2023-04-17 20:30 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox