[PATCH net-next] bpf, net: Support redirecting to ifb with bpf

BPF List
 help / color / mirror / Atom feed

* [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
@ 2023-04-13  2:53 Yafang Shao
  2023-04-13 11:47 ` Daniel Borkmann
  0 siblings, 1 reply; 14+ messages in thread
From: Yafang Shao @ 2023-04-13  2:53 UTC (permalink / raw)
  To: davem, edumazet, kuba, pabeni, ast, daniel, hawk, john.fastabend
  Cc: netdev, bpf, Yafang Shao, Jesper Dangaard Brouer, Tonghao Zhang

In our container environment, we are using EDT-bpf to limit the egress
bandwidth. EDT-bpf can be used to limit egress only, but can't be used
to limit ingress. Some of our users also want to limit the ingress
bandwidth. But after applying EDT-bpf, which is based on clsact qdisc,
it is impossible to limit the ingress bandwidth currently, due to some
reasons,
1). We can't add ingress qdisc
The ingress qdisc can't coexist with clsact qdisc as clsact has both
ingress and egress handler. So our traditional method to limit ingress
bandwidth can't work any more.
2). We can't redirect ingress packet to ifb with bpf
By trying to analyze if it is possible to redirect the ingress packet to
ifb with a bpf program, we find that the ifb device is not supported by
bpf redirect yet.

This patch tries to resolve it by supporting redirecting to ifb with bpf
program.

Ingress bandwidth limit is useful in some scenarios, for example, for the
TCP-based service, there may be lots of clients connecting it, so it is
not wise to limit the clients' egress. After limiting the server-side's
ingress, it will lower the send rate of the client by lowering the TCP
cwnd if the ingress bandwidth limit is reached. If we don't limit it,
the clients will continue sending requests at a high rate.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Tonghao Zhang <xiangxia.m.yue@gmail.com>
---
 net/core/dev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/dev.c b/net/core/dev.c
index c785319..da6b196 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3956,6 +3956,7 @@ int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *skb)
 		return NULL;
 	case TC_ACT_REDIRECT:
 		/* No need to push/pop skb's mac_header here on egress! */
+		skb_set_redirected(skb, skb->tc_at_ingress);
 		skb_do_redirect(skb);
 		*ret = NET_XMIT_SUCCESS;
 		return NULL;
@@ -5138,6 +5139,7 @@ static __latent_entropy void net_tx_action(struct softirq_action *h)
 		 * redirecting to another netdev
 		 */
 		__skb_push(skb, skb->mac_len);
+		skb_set_redirected(skb, skb->tc_at_ingress);
 		if (skb_do_redirect(skb) == -EAGAIN) {
 			__skb_pull(skb, skb->mac_len);
 			*another = true;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
  2023-04-13  2:53 [PATCH net-next] bpf, net: Support redirecting to ifb with bpf Yafang Shao
@ 2023-04-13 11:47 ` Daniel Borkmann
  2023-04-13 14:20   ` Yafang Shao
  2023-04-13 14:43   ` Toke Høiland-Jørgensen
  0 siblings, 2 replies; 14+ messages in thread
From: Daniel Borkmann @ 2023-04-13 11:47 UTC (permalink / raw)
  To: Yafang Shao, davem, edumazet, kuba, pabeni, ast, hawk,
	john.fastabend
  Cc: netdev, bpf, Jesper Dangaard Brouer, Tonghao Zhang, martin.lau,
	toke

On 4/13/23 4:53 AM, Yafang Shao wrote:
> In our container environment, we are using EDT-bpf to limit the egress
> bandwidth. EDT-bpf can be used to limit egress only, but can't be used
> to limit ingress. Some of our users also want to limit the ingress
> bandwidth. But after applying EDT-bpf, which is based on clsact qdisc,
> it is impossible to limit the ingress bandwidth currently, due to some
> reasons,
> 1). We can't add ingress qdisc
> The ingress qdisc can't coexist with clsact qdisc as clsact has both
> ingress and egress handler. So our traditional method to limit ingress
> bandwidth can't work any more.

I'm not following, the latter is a super set of the former, why do you
need it to co-exist?

> 2). We can't redirect ingress packet to ifb with bpf
> By trying to analyze if it is possible to redirect the ingress packet to
> ifb with a bpf program, we find that the ifb device is not supported by
> bpf redirect yet.

You actually can: Just let BPF program return TC_ACT_UNSPEC for this
case and then add a matchall with higher prio (so it runs after bpf)
that contains an action with mirred egress redirect that pushes to ifb
dev - there is no change needed.

> This patch tries to resolve it by supporting redirecting to ifb with bpf
> program.
> 
> Ingress bandwidth limit is useful in some scenarios, for example, for the
> TCP-based service, there may be lots of clients connecting it, so it is
> not wise to limit the clients' egress. After limiting the server-side's
> ingress, it will lower the send rate of the client by lowering the TCP
> cwnd if the ingress bandwidth limit is reached. If we don't limit it,
> the clients will continue sending requests at a high rate.

Adding artificial queueing for the inbound traffic, aren't you worried
about DoS'ing your node? If you need to tell the sender to slow down,
have you looked at hbm (https://lpc.events/event/4/contributions/486/,
samples/bpf/hbm_out_kern.c) which uses ECN CE marking to tell the TCP
sender to slow down? (Fwiw, for UDP https://github.com/cloudflare/rakelimit
would be an option.)

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
  2023-04-13 11:47 ` Daniel Borkmann
@ 2023-04-13 14:20   ` Yafang Shao
  2023-04-13 14:43   ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 14+ messages in thread
From: Yafang Shao @ 2023-04-13 14:20 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: davem, edumazet, kuba, pabeni, ast, hawk, john.fastabend, netdev,
	bpf, Jesper Dangaard Brouer, Tonghao Zhang, martin.lau, toke

On Thu, Apr 13, 2023 at 7:47 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 4/13/23 4:53 AM, Yafang Shao wrote:
> > In our container environment, we are using EDT-bpf to limit the egress
> > bandwidth. EDT-bpf can be used to limit egress only, but can't be used
> > to limit ingress. Some of our users also want to limit the ingress
> > bandwidth. But after applying EDT-bpf, which is based on clsact qdisc,
> > it is impossible to limit the ingress bandwidth currently, due to some
> > reasons,
> > 1). We can't add ingress qdisc
> > The ingress qdisc can't coexist with clsact qdisc as clsact has both
> > ingress and egress handler. So our traditional method to limit ingress
> > bandwidth can't work any more.
>
> I'm not following, the latter is a super set of the former, why do you
> need it to co-exist?
>

It seems that I have a misunderstanding.

> > 2). We can't redirect ingress packet to ifb with bpf
> > By trying to analyze if it is possible to redirect the ingress packet to
> > ifb with a bpf program, we find that the ifb device is not supported by
> > bpf redirect yet.
>
> You actually can: Just let BPF program return TC_ACT_UNSPEC for this
> case and then add a matchall with higher prio (so it runs after bpf)
> that contains an action with mirred egress redirect that pushes to ifb
> dev - there is no change needed.
>

Ah, indeed, it works.
Many thanks for your help.

> > This patch tries to resolve it by supporting redirecting to ifb with bpf
> > program.
> >
> > Ingress bandwidth limit is useful in some scenarios, for example, for the
> > TCP-based service, there may be lots of clients connecting it, so it is
> > not wise to limit the clients' egress. After limiting the server-side's
> > ingress, it will lower the send rate of the client by lowering the TCP
> > cwnd if the ingress bandwidth limit is reached. If we don't limit it,
> > the clients will continue sending requests at a high rate.
>
> Adding artificial queueing for the inbound traffic, aren't you worried
> about DoS'ing your node?

Yes, we worried about it, but we haven't observed it in our data center.

> If you need to tell the sender to slow down,
> have you looked at hbm (https://lpc.events/event/4/contributions/486/,
> samples/bpf/hbm_out_kern.c) which uses ECN CE marking to tell the TCP
> sender to slow down? (Fwiw, for UDP https://github.com/cloudflare/rakelimit
> would be an option.)
>

We're considering using ECN. Thanks for your information, I will analyze it.

-- 
Regards
Yafang

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
  2023-04-13 11:47 ` Daniel Borkmann
  2023-04-13 14:20   ` Yafang Shao
@ 2023-04-13 14:43   ` Toke Høiland-Jørgensen
  2023-04-14  9:34     ` Daniel Borkmann
  1 sibling, 1 reply; 14+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-04-13 14:43 UTC (permalink / raw)
  To: Daniel Borkmann, Yafang Shao, davem, edumazet, kuba, pabeni, ast,
	hawk, john.fastabend
  Cc: netdev, bpf, Jesper Dangaard Brouer, Tonghao Zhang, martin.lau

Daniel Borkmann <daniel@iogearbox.net> writes:

>> 2). We can't redirect ingress packet to ifb with bpf
>> By trying to analyze if it is possible to redirect the ingress packet to
>> ifb with a bpf program, we find that the ifb device is not supported by
>> bpf redirect yet.
>
> You actually can: Just let BPF program return TC_ACT_UNSPEC for this
> case and then add a matchall with higher prio (so it runs after bpf)
> that contains an action with mirred egress redirect that pushes to ifb
> dev - there is no change needed.

I wasn't aware that BPF couldn't redirect directly to an IFB; any reason
why we shouldn't merge this patch in any case?

>> This patch tries to resolve it by supporting redirecting to ifb with bpf
>> program.
>> 
>> Ingress bandwidth limit is useful in some scenarios, for example, for the
>> TCP-based service, there may be lots of clients connecting it, so it is
>> not wise to limit the clients' egress. After limiting the server-side's
>> ingress, it will lower the send rate of the client by lowering the TCP
>> cwnd if the ingress bandwidth limit is reached. If we don't limit it,
>> the clients will continue sending requests at a high rate.
>
> Adding artificial queueing for the inbound traffic, aren't you worried
> about DoS'ing your node?

Just as an aside, the ingress filter -> ifb -> qdisc on the ifb
interface does work surprisingly well, and we've been using that over in
OpenWrt land for years[0]. It does have some overhead associated with it,
but I wouldn't expect it to be a source of self-DoS in itself (assuming
well-behaved TCP traffic).

-Toke

[0] https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
  2023-04-13 14:43   ` Toke Høiland-Jørgensen
@ 2023-04-14  9:34     ` Daniel Borkmann
  2023-04-14 16:07       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2023-04-14  9:34 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Yafang Shao, davem, edumazet,
	kuba, pabeni, ast, hawk, john.fastabend
  Cc: netdev, bpf, Jesper Dangaard Brouer, Tonghao Zhang, martin.lau

On 4/13/23 4:43 PM, Toke Høiland-Jørgensen wrote:
> Daniel Borkmann <daniel@iogearbox.net> writes:
> 
>>> 2). We can't redirect ingress packet to ifb with bpf
>>> By trying to analyze if it is possible to redirect the ingress packet to
>>> ifb with a bpf program, we find that the ifb device is not supported by
>>> bpf redirect yet.
>>
>> You actually can: Just let BPF program return TC_ACT_UNSPEC for this
>> case and then add a matchall with higher prio (so it runs after bpf)
>> that contains an action with mirred egress redirect that pushes to ifb
>> dev - there is no change needed.
> 
> I wasn't aware that BPF couldn't redirect directly to an IFB; any reason
> why we shouldn't merge this patch in any case?
> 
>>> This patch tries to resolve it by supporting redirecting to ifb with bpf
>>> program.
>>>
>>> Ingress bandwidth limit is useful in some scenarios, for example, for the
>>> TCP-based service, there may be lots of clients connecting it, so it is
>>> not wise to limit the clients' egress. After limiting the server-side's
>>> ingress, it will lower the send rate of the client by lowering the TCP
>>> cwnd if the ingress bandwidth limit is reached. If we don't limit it,
>>> the clients will continue sending requests at a high rate.
>>
>> Adding artificial queueing for the inbound traffic, aren't you worried
>> about DoS'ing your node?
> 
> Just as an aside, the ingress filter -> ifb -> qdisc on the ifb
> interface does work surprisingly well, and we've been using that over in
> OpenWrt land for years[0]. It does have some overhead associated with it,
> but I wouldn't expect it to be a source of self-DoS in itself (assuming
> well-behaved TCP traffic).

Out of curiosity, wrt OpenWrt case, can you elaborate on the use case to why
choosing to do this on ingress via ifb rather than on the egress side? I
presume in this case it's regular router, so pkts would be forwarded anyway,
and in your case traversing qdisc layer / queuing twice (ingress phys dev ->
ifb, egress phys dev), right? What is the rationale that would justify such
setup aka why it cannot be solved differently?

Thanks,
Daniel

> -Toke
> 
> [0] https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
  2023-04-14  9:34     ` Daniel Borkmann
@ 2023-04-14 16:07       ` Toke Høiland-Jørgensen
  2023-04-14 22:57         ` Daniel Borkmann
  0 siblings, 1 reply; 14+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-04-14 16:07 UTC (permalink / raw)
  To: Daniel Borkmann, Yafang Shao, davem, edumazet, kuba, pabeni, ast,
	hawk, john.fastabend
  Cc: netdev, bpf, Jesper Dangaard Brouer, Tonghao Zhang, martin.lau

Daniel Borkmann <daniel@iogearbox.net> writes:

> On 4/13/23 4:43 PM, Toke Høiland-Jørgensen wrote:
>> Daniel Borkmann <daniel@iogearbox.net> writes:
>> 
>>>> 2). We can't redirect ingress packet to ifb with bpf
>>>> By trying to analyze if it is possible to redirect the ingress packet to
>>>> ifb with a bpf program, we find that the ifb device is not supported by
>>>> bpf redirect yet.
>>>
>>> You actually can: Just let BPF program return TC_ACT_UNSPEC for this
>>> case and then add a matchall with higher prio (so it runs after bpf)
>>> that contains an action with mirred egress redirect that pushes to ifb
>>> dev - there is no change needed.
>> 
>> I wasn't aware that BPF couldn't redirect directly to an IFB; any reason
>> why we shouldn't merge this patch in any case?
>> 
>>>> This patch tries to resolve it by supporting redirecting to ifb with bpf
>>>> program.
>>>>
>>>> Ingress bandwidth limit is useful in some scenarios, for example, for the
>>>> TCP-based service, there may be lots of clients connecting it, so it is
>>>> not wise to limit the clients' egress. After limiting the server-side's
>>>> ingress, it will lower the send rate of the client by lowering the TCP
>>>> cwnd if the ingress bandwidth limit is reached. If we don't limit it,
>>>> the clients will continue sending requests at a high rate.
>>>
>>> Adding artificial queueing for the inbound traffic, aren't you worried
>>> about DoS'ing your node?
>> 
>> Just as an aside, the ingress filter -> ifb -> qdisc on the ifb
>> interface does work surprisingly well, and we've been using that over in
>> OpenWrt land for years[0]. It does have some overhead associated with it,
>> but I wouldn't expect it to be a source of self-DoS in itself (assuming
>> well-behaved TCP traffic).
>
> Out of curiosity, wrt OpenWrt case, can you elaborate on the use case to why
> choosing to do this on ingress via ifb rather than on the egress side? I
> presume in this case it's regular router, so pkts would be forwarded anyway,
> and in your case traversing qdisc layer / queuing twice (ingress phys dev ->
> ifb, egress phys dev), right? What is the rationale that would justify such
> setup aka why it cannot be solved differently?

Because there's not always a single egress on the other side. These are
mainly home routers, which tend to have one or more WiFi devices bridged
to one or more ethernet ports on the LAN side, and a single upstream WAN
port. And the objective is to control the total amount of traffic going
over the WAN link (in both directions), to deal with bufferbloat in the
ISP network (which is sadly still all too prevalent).

In this setup, the traffic can be split arbitrarily between the links on
the LAN side, and the only "single bottleneck" is the WAN link. So we
install both egress and ingress shapers on this, configured to something
like 95-98% of the true link bandwidth, thus moving the queues into the
qdisc layer in the router. It's usually necessary to set the ingress
bandwidth shaper a bit lower than the egress due to being "downstream"
of the bottleneck link, but it does work surprisingly well.

We usually use something like a matchall filter to put all ingress
traffic on the ifb, so doing the redirect from BPF has not been an
immediate requirement thus far. However, it does seem a bit odd that
this is not possible, and we do have a BPF-based filter that layers on
top of this kind of setup, which currently uses u32 as the ingress
filter and so it could presumably be improved to use BPF instead if that
was available:
https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README

-Toke

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
  2023-04-14 16:07       ` Toke Høiland-Jørgensen
@ 2023-04-14 22:57         ` Daniel Borkmann
  2023-04-15  0:43           ` [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb kernel test robot
                             ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Daniel Borkmann @ 2023-04-14 22:57 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Yafang Shao, davem, edumazet,
	kuba, pabeni, ast, hawk, john.fastabend
  Cc: netdev, bpf, Jesper Dangaard Brouer, Tonghao Zhang, martin.lau

On 4/14/23 6:07 PM, Toke Høiland-Jørgensen wrote:
> Daniel Borkmann <daniel@iogearbox.net> writes:
[...]
> https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README

Thanks for the explanation, that sounds reasonable and this should ideally
be part of the commit msg! Yafang, Toke, how about we craft it the following
way then to support this case:

 From f6c83e5e55c5eb9da8acd19369c688acf53951db Mon Sep 17 00:00:00 2001
Message-Id: <f6c83e5e55c5eb9da8acd19369c688acf53951db.1681512637.git.daniel@iogearbox.net>
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Sat, 15 Apr 2023 00:30:27 +0200
Subject: [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

There are some use-cases where it is desirable to use bpf_redirect()
in combination with ifb device, which currently is not supported, for
example, around filtering inbound traffic with BPF to then push it to
ifb which holds the qdisc for shaping in contrast to doing that on the
egress device.

Toke mentions the following case related to OpenWrt:

   Because there's not always a single egress on the other side. These are
   mainly home routers, which tend to have one or more WiFi devices bridged
   to one or more ethernet ports on the LAN side, and a single upstream WAN
   port. And the objective is to control the total amount of traffic going
   over the WAN link (in both directions), to deal with bufferbloat in the
   ISP network (which is sadly still all too prevalent).

   In this setup, the traffic can be split arbitrarily between the links
   on the LAN side, and the only "single bottleneck" is the WAN link. So we
   install both egress and ingress shapers on this, configured to something
   like 95-98% of the true link bandwidth, thus moving the queues into the
   qdisc layer in the router. It's usually necessary to set the ingress
   bandwidth shaper a bit lower than the egress due to being "downstream"
   of the bottleneck link, but it does work surprisingly well.

   We usually use something like a matchall filter to put all ingress
   traffic on the ifb, so doing the redirect from BPF has not been an
   immediate requirement thus far. However, it does seem a bit odd that this
   is not possible, and we do have a BPF-based filter that layers on top of
   this kind of setup, which currently uses u32 as the ingress filter and
   so it could presumably be improved to use BPF instead if that was
   available.

Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reported-by: Yafang Shao <laoar.shao@gmail.com>
Reported-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README
Link: https://lore.kernel.org/bpf/875y9yzbuy.fsf@toke.dk
---
  include/linux/skbuff.h | 9 +++++++++
  net/core/filter.c      | 1 +
  2 files changed, 10 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ff7ad331fb82..2bbf9245640a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -5049,6 +5049,15 @@ static inline void skb_reset_redirect(struct sk_buff *skb)
  	skb->redirected = 0;
  }

+static inline void skb_set_redirected_noclear(struct sk_buff *skb,
+					      bool from_ingress)
+{
+	skb->redirected = 1;
+#ifdef CONFIG_NET_REDIRECT
+	skb->from_ingress = from_ingress;
+#endif
+}
+
  static inline bool skb_csum_is_sctp(struct sk_buff *skb)
  {
  	return skb->csum_not_inet;
diff --git a/net/core/filter.c b/net/core/filter.c
index 1d6f165923bf..27ba616aaa1a 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2111,6 +2111,7 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
  	}

  	skb->dev = dev;
+	skb_set_redirected_noclear(skb, skb->tc_at_ingress);
  	skb_clear_tstamp(skb);

  	dev_xmit_recursion_inc();
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
  2023-04-14 22:57         ` Daniel Borkmann
@ 2023-04-15  0:43           ` kernel test robot
  2023-04-15  1:04           ` kernel test robot
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2023-04-15  0:43 UTC (permalink / raw)
  To: Daniel Borkmann, Toke Høiland-Jørgensen, Yafang Shao,
	davem, edumazet, kuba, pabeni, ast, hawk, john.fastabend
  Cc: llvm, oe-kbuild-all, netdev, bpf, Jesper Dangaard Brouer,
	Tonghao Zhang, martin.lau

Hi Daniel,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Borkmann/bpf-Set-skb-redirect-and-from_ingress-info-in-__bpf_tx_skb/20230415-065912
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/c68bf723-3406-d177-49b4-6d5b485048de%40iogearbox.net
patch subject: [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
config: x86_64-randconfig-a002-20230410 (https://download.01.org/0day-ci/archive/20230415/202304150814.os34v8BI-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/91c0d1d0b071c23d95bf9747b89daf5ae378ad1a
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Daniel-Borkmann/bpf-Set-skb-redirect-and-from_ingress-info-in-__bpf_tx_skb/20230415-065912
        git checkout 91c0d1d0b071c23d95bf9747b89daf5ae378ad1a
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash net/core/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304150814.os34v8BI-lkp@intel.com/

All errors (new ones prefixed by >>):

>> net/core/filter.c:2125:39: error: no member named 'tc_at_ingress' in 'struct sk_buff'
           skb_set_redirected_noclear(skb, skb->tc_at_ingress);
                                           ~~~  ^
   1 error generated.


vim +2125 net/core/filter.c

  2113	
  2114	static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
  2115	{
  2116		int ret;
  2117	
  2118		if (dev_xmit_recursion()) {
  2119			net_crit_ratelimited("bpf: recursion limit reached on datapath, buggy bpf program?\n");
  2120			kfree_skb(skb);
  2121			return -ENETDOWN;
  2122		}
  2123	
  2124		skb->dev = dev;
> 2125		skb_set_redirected_noclear(skb, skb->tc_at_ingress);
  2126		skb_clear_tstamp(skb);
  2127	
  2128		dev_xmit_recursion_inc();
  2129		ret = dev_queue_xmit(skb);
  2130		dev_xmit_recursion_dec();
  2131	
  2132		return ret;
  2133	}
  2134	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
  2023-04-14 22:57         ` Daniel Borkmann
  2023-04-15  0:43           ` [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb kernel test robot
@ 2023-04-15  1:04           ` kernel test robot
  2023-04-15 13:38           ` [PATCH net-next] bpf, net: Support redirecting to ifb with bpf Yafang Shao
  2023-04-15 14:00           ` Toke Høiland-Jørgensen
  3 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2023-04-15  1:04 UTC (permalink / raw)
  To: Daniel Borkmann, Toke Høiland-Jørgensen, Yafang Shao,
	davem, edumazet, kuba, pabeni, ast, hawk, john.fastabend
  Cc: oe-kbuild-all, netdev, bpf, Jesper Dangaard Brouer, Tonghao Zhang,
	martin.lau

Hi Daniel,

kernel test robot noticed the following build errors:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Daniel-Borkmann/bpf-Set-skb-redirect-and-from_ingress-info-in-__bpf_tx_skb/20230415-065912
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/c68bf723-3406-d177-49b4-6d5b485048de%40iogearbox.net
patch subject: [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
config: x86_64-randconfig-a014-20230410 (https://download.01.org/0day-ci/archive/20230415/202304150811.bzx9niRq-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/91c0d1d0b071c23d95bf9747b89daf5ae378ad1a
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Daniel-Borkmann/bpf-Set-skb-redirect-and-from_ingress-info-in-__bpf_tx_skb/20230415-065912
        git checkout 91c0d1d0b071c23d95bf9747b89daf5ae378ad1a
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 olddefconfig
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash net/core/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202304150811.bzx9niRq-lkp@intel.com/

All errors (new ones prefixed by >>):

   net/core/filter.c: In function '__bpf_tx_skb':
>> net/core/filter.c:2125:44: error: 'struct sk_buff' has no member named 'tc_at_ingress'
    2125 |         skb_set_redirected_noclear(skb, skb->tc_at_ingress);
         |                                            ^~


vim +2125 net/core/filter.c

  2113	
  2114	static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
  2115	{
  2116		int ret;
  2117	
  2118		if (dev_xmit_recursion()) {
  2119			net_crit_ratelimited("bpf: recursion limit reached on datapath, buggy bpf program?\n");
  2120			kfree_skb(skb);
  2121			return -ENETDOWN;
  2122		}
  2123	
  2124		skb->dev = dev;
> 2125		skb_set_redirected_noclear(skb, skb->tc_at_ingress);
  2126		skb_clear_tstamp(skb);
  2127	
  2128		dev_xmit_recursion_inc();
  2129		ret = dev_queue_xmit(skb);
  2130		dev_xmit_recursion_dec();
  2131	
  2132		return ret;
  2133	}
  2134	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
  2023-04-14 22:57         ` Daniel Borkmann
  2023-04-15  0:43           ` [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb kernel test robot
  2023-04-15  1:04           ` kernel test robot
@ 2023-04-15 13:38           ` Yafang Shao
  2023-04-15 14:00           ` Toke Høiland-Jørgensen
  3 siblings, 0 replies; 14+ messages in thread
From: Yafang Shao @ 2023-04-15 13:38 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Toke Høiland-Jørgensen, davem, edumazet, kuba, pabeni,
	ast, hawk, john.fastabend, netdev, bpf, Jesper Dangaard Brouer,
	Tonghao Zhang, martin.lau

On Sat, Apr 15, 2023 at 6:57 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 4/14/23 6:07 PM, Toke Høiland-Jørgensen wrote:
> > Daniel Borkmann <daniel@iogearbox.net> writes:
> [...]
> > https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README
>
> Thanks for the explanation, that sounds reasonable and this should ideally
> be part of the commit msg! Yafang, Toke, how about we craft it the following
> way then to support this case:
>

LGTM. With the issue reported by kernel test robot [1] fixed,

Acked-by: Yafang Shao <laoar.shao@gmail.com>

[1]. https://lore.kernel.org/bpf/202304150811.bzx9niRq-lkp@intel.com/

>  From f6c83e5e55c5eb9da8acd19369c688acf53951db Mon Sep 17 00:00:00 2001
> Message-Id: <f6c83e5e55c5eb9da8acd19369c688acf53951db.1681512637.git.daniel@iogearbox.net>
> From: Daniel Borkmann <daniel@iogearbox.net>
> Date: Sat, 15 Apr 2023 00:30:27 +0200
> Subject: [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> There are some use-cases where it is desirable to use bpf_redirect()
> in combination with ifb device, which currently is not supported, for
> example, around filtering inbound traffic with BPF to then push it to
> ifb which holds the qdisc for shaping in contrast to doing that on the
> egress device.
>
> Toke mentions the following case related to OpenWrt:
>
>    Because there's not always a single egress on the other side. These are
>    mainly home routers, which tend to have one or more WiFi devices bridged
>    to one or more ethernet ports on the LAN side, and a single upstream WAN
>    port. And the objective is to control the total amount of traffic going
>    over the WAN link (in both directions), to deal with bufferbloat in the
>    ISP network (which is sadly still all too prevalent).
>
>    In this setup, the traffic can be split arbitrarily between the links
>    on the LAN side, and the only "single bottleneck" is the WAN link. So we
>    install both egress and ingress shapers on this, configured to something
>    like 95-98% of the true link bandwidth, thus moving the queues into the
>    qdisc layer in the router. It's usually necessary to set the ingress
>    bandwidth shaper a bit lower than the egress due to being "downstream"
>    of the bottleneck link, but it does work surprisingly well.
>
>    We usually use something like a matchall filter to put all ingress
>    traffic on the ifb, so doing the redirect from BPF has not been an
>    immediate requirement thus far. However, it does seem a bit odd that this
>    is not possible, and we do have a BPF-based filter that layers on top of
>    this kind of setup, which currently uses u32 as the ingress filter and
>    so it could presumably be improved to use BPF instead if that was
>    available.
>
> Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
> Reported-by: Yafang Shao <laoar.shao@gmail.com>
> Reported-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Link: https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README
> Link: https://lore.kernel.org/bpf/875y9yzbuy.fsf@toke.dk
> ---
>   include/linux/skbuff.h | 9 +++++++++
>   net/core/filter.c      | 1 +
>   2 files changed, 10 insertions(+)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index ff7ad331fb82..2bbf9245640a 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -5049,6 +5049,15 @@ static inline void skb_reset_redirect(struct sk_buff *skb)
>         skb->redirected = 0;
>   }
>
> +static inline void skb_set_redirected_noclear(struct sk_buff *skb,
> +                                             bool from_ingress)
> +{
> +       skb->redirected = 1;
> +#ifdef CONFIG_NET_REDIRECT
> +       skb->from_ingress = from_ingress;
> +#endif
> +}
> +
>   static inline bool skb_csum_is_sctp(struct sk_buff *skb)
>   {
>         return skb->csum_not_inet;
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 1d6f165923bf..27ba616aaa1a 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2111,6 +2111,7 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
>         }
>
>         skb->dev = dev;
> +       skb_set_redirected_noclear(skb, skb->tc_at_ingress);
>         skb_clear_tstamp(skb);
>
>         dev_xmit_recursion_inc();
> --
> 2.21.0



-- 
Regards
Yafang

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
  2023-04-14 22:57         ` Daniel Borkmann
                             ` (2 preceding siblings ...)
  2023-04-15 13:38           ` [PATCH net-next] bpf, net: Support redirecting to ifb with bpf Yafang Shao
@ 2023-04-15 14:00           ` Toke Høiland-Jørgensen
  2023-04-17 13:50             ` Daniel Borkmann
  3 siblings, 1 reply; 14+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-04-15 14:00 UTC (permalink / raw)
  To: Daniel Borkmann, Yafang Shao, davem, edumazet, kuba, pabeni, ast,
	hawk, john.fastabend
  Cc: netdev, bpf, Jesper Dangaard Brouer, Tonghao Zhang, martin.lau

Daniel Borkmann <daniel@iogearbox.net> writes:

> On 4/14/23 6:07 PM, Toke Høiland-Jørgensen wrote:
>> Daniel Borkmann <daniel@iogearbox.net> writes:
> [...]
>> https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README
>
> Thanks for the explanation, that sounds reasonable and this should ideally
> be part of the commit msg! Yafang, Toke, how about we craft it the following
> way then to support this case:

SGTM! With the kbot complaint fixed:

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH net-next] bpf, net: Support redirecting to ifb with bpf
  2023-04-15 14:00           ` Toke Høiland-Jørgensen
@ 2023-04-17 13:50             ` Daniel Borkmann
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Borkmann @ 2023-04-17 13:50 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Yafang Shao, davem, edumazet,
	kuba, pabeni, ast, hawk, john.fastabend
  Cc: netdev, bpf, Jesper Dangaard Brouer, Tonghao Zhang, martin.lau

On 4/15/23 4:00 PM, Toke Høiland-Jørgensen wrote:
> Daniel Borkmann <daniel@iogearbox.net> writes:
> 
>> On 4/14/23 6:07 PM, Toke Høiland-Jørgensen wrote:
>>> Daniel Borkmann <daniel@iogearbox.net> writes:
>> [...]
>>> https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README
>>
>> Thanks for the explanation, that sounds reasonable and this should ideally
>> be part of the commit msg! Yafang, Toke, how about we craft it the following
>> way then to support this case:
> 
> SGTM! With the kbot complaint fixed:

Sounds good, thanks both; sent out now.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
@ 2023-04-17 13:49 Daniel Borkmann
  2023-04-17 20:30 ` patchwork-bot+netdevbpf
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Borkmann @ 2023-04-17 13:49 UTC (permalink / raw)
  To: bpf
  Cc: john.fastabend, Daniel Borkmann, Toke Høiland-Jørgensen,
	Yafang Shao, Tonghao Zhang

There are some use-cases where it is desirable to use bpf_redirect()
in combination with ifb device, which currently is not supported, for
example, around filtering inbound traffic with BPF to then push it to
ifb which holds the qdisc for shaping in contrast to doing that on the
egress device.

Toke mentions the following case related to OpenWrt:

   Because there's not always a single egress on the other side. These are
   mainly home routers, which tend to have one or more WiFi devices bridged
   to one or more ethernet ports on the LAN side, and a single upstream WAN
   port. And the objective is to control the total amount of traffic going
   over the WAN link (in both directions), to deal with bufferbloat in the
   ISP network (which is sadly still all too prevalent).

   In this setup, the traffic can be split arbitrarily between the links
   on the LAN side, and the only "single bottleneck" is the WAN link. So we
   install both egress and ingress shapers on this, configured to something
   like 95-98% of the true link bandwidth, thus moving the queues into the
   qdisc layer in the router. It's usually necessary to set the ingress
   bandwidth shaper a bit lower than the egress due to being "downstream"
   of the bottleneck link, but it does work surprisingly well.

   We usually use something like a matchall filter to put all ingress
   traffic on the ifb, so doing the redirect from BPF has not been an
   immediate requirement thus far. However, it does seem a bit odd that
   this is not possible, and we do have a BPF-based filter that layers on
   top of this kind of setup, which currently uses u32 as the ingress
   filter and so it could presumably be improved to use BPF instead if
   that was available.

Reported-by: Toke Høiland-Jørgensen <toke@redhat.com>
Reported-by: Yafang Shao <laoar.shao@gmail.com>
Reported-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yafang Shao <laoar.shao@gmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://git.openwrt.org/?p=project/qosify.git;a=blob;f=README
Link: https://lore.kernel.org/bpf/875y9yzbuy.fsf@toke.dk
---
 include/linux/skbuff.h | 9 +++++++++
 net/core/filter.c      | 1 +
 2 files changed, 10 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 494a23a976b0..9ff2e3d57329 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -5041,6 +5041,15 @@ static inline void skb_reset_redirect(struct sk_buff *skb)
 	skb->redirected = 0;
 }

+static inline void skb_set_redirected_noclear(struct sk_buff *skb,
+					      bool from_ingress)
+{
+	skb->redirected = 1;
+#ifdef CONFIG_NET_REDIRECT
+	skb->from_ingress = from_ingress;
+#endif
+}
+
 static inline bool skb_csum_is_sctp(struct sk_buff *skb)
 {
 	return skb->csum_not_inet;
diff --git a/net/core/filter.c b/net/core/filter.c
index df0df59814ae..44fb997434ad 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2122,6 +2122,7 @@ static inline int __bpf_tx_skb(struct net_device *dev, struct sk_buff *skb)
 	}

 	skb->dev = dev;
+	skb_set_redirected_noclear(skb, skb_at_tc_ingress(skb));
 	skb_clear_tstamp(skb);

 	dev_xmit_recursion_inc();
-- 
2.21.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
  2023-04-17 13:49 [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb Daniel Borkmann
@ 2023-04-17 20:30 ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-04-17 20:30 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: bpf, john.fastabend, toke, laoar.shao, xiangxia.m.yue

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Mon, 17 Apr 2023 15:49:15 +0200 you wrote:
> There are some use-cases where it is desirable to use bpf_redirect()
> in combination with ifb device, which currently is not supported, for
> example, around filtering inbound traffic with BPF to then push it to
> ifb which holds the qdisc for shaping in contrast to doing that on the
> egress device.
> 
> Toke mentions the following case related to OpenWrt:
> 
> [...]

Here is the summary with links:
  - [bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb
    https://git.kernel.org/bpf/bpf-next/c/59e498a3289f

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2023-04-17 20:30 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-13  2:53 [PATCH net-next] bpf, net: Support redirecting to ifb with bpf Yafang Shao
2023-04-13 11:47 ` Daniel Borkmann
2023-04-13 14:20   ` Yafang Shao
2023-04-13 14:43   ` Toke Høiland-Jørgensen
2023-04-14  9:34     ` Daniel Borkmann
2023-04-14 16:07       ` Toke Høiland-Jørgensen
2023-04-14 22:57         ` Daniel Borkmann
2023-04-15  0:43           ` [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb kernel test robot
2023-04-15  1:04           ` kernel test robot
2023-04-15 13:38           ` [PATCH net-next] bpf, net: Support redirecting to ifb with bpf Yafang Shao
2023-04-15 14:00           ` Toke Høiland-Jørgensen
2023-04-17 13:50             ` Daniel Borkmann
  -- strict thread matches above, loose matches on Subject: below --
2023-04-17 13:49 [PATCH bpf-next] bpf: Set skb redirect and from_ingress info in __bpf_tx_skb Daniel Borkmann
2023-04-17 20:30 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox