netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jesper Dangaard Brouer <hawk@kernel.org>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>, netdev@vger.kernel.org
Cc: hawk@kernel.org, pabeni@redhat.com, kuba@kernel.org,
	davem@davemloft.net, lorenzo@kernel.org,
	Ilias Apalodimas <ilias.apalodimas@linaro.org>,
	mtahhan@redhat.com, huangjie.albert@bytedance.com,
	Yunsheng Lin <linyunsheng@huawei.com>,
	edumazet@google.com, Liang Chen <liangchen.linux@gmail.com>
Subject: Re: [PATCH net-next RFC v1 2/4] veth: use generic-XDP functions when dealing with SKBs
Date: Tue, 29 Aug 2023 16:37:28 +0200	[thread overview]
Message-ID: <d7d3e320-9b2c-fbc4-7d2d-866741b10cf7@kernel.org> (raw)
In-Reply-To: <87msyg91gl.fsf@toke.dk>



On 24/08/2023 12.30, Toke Høiland-Jørgensen wrote:
> Jesper Dangaard Brouer <hawk@kernel.org> writes:
> 
>> The root-cause the realloc issue is that veth_xdp_rcv_skb() code path (that
>> handles SKBs like generic-XDP) is calling a native-XDP function
>> xdp_do_redirect(), instead of simply using xdp_do_generic_redirect() that can
>> handle SKBs.
>>
>> The existing code tries to steal the packet-data from the SKB (and frees the SKB
>> itself). This cause issues as SKBs can have different memory models that are
>> incompatible with native-XDP call xdp_do_redirect(). For this reason the checks
>> in veth_convert_skb_to_xdp_buff() becomes more strict. This in turn makes this a
>> bad approach. Simply leveraging generic-XDP helpers e.g. generic_xdp_tx() and
>> xdp_do_generic_redirect() as this resolves the issue given netstack can handle
>> these different SKB memory models.
> 
> While this does solve the memory issue, it's also a subtle change of
> semantics. For one thing, generic_xdp_tx() has this comment above it:
> 
> /* When doing generic XDP we have to bypass the qdisc layer and the
>   * network taps in order to match in-driver-XDP behavior. This also means
>   * that XDP packets are able to starve other packets going through a qdisc,
>   * and DDOS attacks will be more effective. In-driver-XDP use dedicated TX
>   * queues, so they do not have this starvation issue.
>   */
> 
> Also, more generally, this means that if you have a setup with
> XDP_REDIRECT-based forwarding in on a host with a mix of physical and
> veth devices, all the traffic originating from the veth devices will go
> on different TXQs than that originating from a physical NIC. Or if a
> veth device has a mix of xdp_frame-backed packets and skb-backed
> packets, those will also go on different queues, potentially leading to
> reordering.
> 

Mixing xdp_frame-backed packets and skb-backed packet (towards veth)
will naturally come from two different data paths, and the BPF-developer
that redirected the xdp_frame (into veth) will have taken this choice,
including the chance of reordering (given the two data/code paths).

I will claim that (for SKBs) current code cause reordering on TXQs (as
you explain), and my code changes actually fix this problem.

Consider a userspace app (inside namespace) sending packets out (to veth
peer).  Routing (or bridging) will make netstack send out device A
(maybe a physical device).  On veth peer we have XDP-prog running, that
will XDP-redirect every 2nd packet to device A.  With current code TXQ
reordering will occur, as calling "native" xdp_do_redirect() will select
TXQ based on current-running CPU, while normal SKBs will use
netdev_core_pick_tx().  After my change, using
xdp_do_generic_redirect(), the code end-up using generic_xdp_tx() which
(looking at the code) also use netdev_core_pick_tx() to select the TXQ.
Thus, I will claim it is more correct (even-though XDP in general
doesn't give this guarantee).

> I'm not sure exactly how much of an issue this is in practice, but at
> least from a conceptual PoV it's a change in behaviour that I don't
> think we should be making lightly. WDYT?

As desc above, I think this patchset is an improvement.  It might even
fix/address the concern that was raised.


[Outside the scope of this patchset]

The single XDP BPF-prog getting attached to (RX-side) on a veth device,
actually needs to handle *both* xdp_frame-backed packets and SKB-backed
packets, and it cannot tell them apart. (Easy fix: implement a kfunc
RX-metadata hint to expose this?).

For the use-case[1] of implementing NFV (Network Function Virt) chaining
via veth device, where each veth-pairs XDP BPF-prog implement a network
"function" and redirect/chain to the next veth/container NFV.  For this
use-case, I would like the ability to either skip SKB-backed packet or
turn off BPF-prog seeing any SKB-backed packets. There is a huge
performance advantage when XDP-redirecting an xdp_frame into veth
devices in this way, approx 6Mpps for traversing 4 veth devices as
benchmarked in [1]. (p.s. I was going to improve this performance
further, but I got distracted by other work).

  [1] 
https://github.com/xdp-project/xdp-project/blob/master/areas/core/xdp_frame03_overhead.org

The veth-NFV like use-cases are hampered by the SKB-based XDP code-path
causing a significant slowdown for normal netstack packets.  Plus, it
need to parse-and-filter those SKB-based packets too.  This, patchset
"just" significantly reduce the overhead of the SKB-based XDP code path,
which IMHO is a good first step.  Then we can discuss if should have a
switch to turn off the SKB-based XDP code-path in veth, afterwards.

--Jesper


  reply	other threads:[~2023-08-29 14:37 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-22 17:59 [PATCH net-next RFC v1 0/4] veth: reduce reallocations of SKBs when XDP bpf-prog is loaded Jesper Dangaard Brouer
2023-08-22 17:59 ` [PATCH net-next RFC v1 1/4] veth: use same bpf_xdp_adjust_head check as generic-XDP Jesper Dangaard Brouer
2023-08-22 17:59 ` [PATCH net-next RFC v1 2/4] veth: use generic-XDP functions when dealing with SKBs Jesper Dangaard Brouer
2023-08-24 10:30   ` Toke Høiland-Jørgensen
2023-08-29 14:37     ` Jesper Dangaard Brouer [this message]
2023-09-01 13:32       ` Toke Høiland-Jørgensen
2023-08-22 17:59 ` [PATCH net-next RFC v1 3/4] veth: lift skb_head_is_locked restriction for SKB based XDP Jesper Dangaard Brouer
2023-08-22 17:59 ` [PATCH net-next RFC v1 4/4] veth: when XDP is loaded increase needed_headroom Jesper Dangaard Brouer
2023-08-24  4:27 ` [PATCH net-next RFC v1 0/4] veth: reduce reallocations of SKBs when XDP bpf-prog is loaded Liang Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d7d3e320-9b2c-fbc4-7d2d-866741b10cf7@kernel.org \
    --to=hawk@kernel.org \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=huangjie.albert@bytedance.com \
    --cc=ilias.apalodimas@linaro.org \
    --cc=kuba@kernel.org \
    --cc=liangchen.linux@gmail.com \
    --cc=linyunsheng@huawei.com \
    --cc=lorenzo@kernel.org \
    --cc=mtahhan@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=toke@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).