netfilter-devel.vger.kernel.org archive mirror
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: netfilter-devel@vger.kernel.org
Cc: davem@davemloft.net, netdev@vger.kernel.org, kuba@kernel.org,
	pabeni@redhat.com, edumazet@google.com
Subject: [PATCH net 01/14] ipvs: align inner_mac_header for encapsulation
Date: Tue, 20 Jun 2023 11:35:29 +0200
Message-ID: <20230620093542.69232-2-pablo@netfilter.org>
In-Reply-To: <20230620093542.69232-1-pablo@netfilter.org>

From: Terin Stock <terin@cloudflare.com>

When using encapsulation the original packet's headers are copied to the
inner headers. This preserves the space for an inner mac header, which
is not used by the inner payloads for the encapsulation types supported
by IPVS. If a packet is using GUE or GRE encapsulation and needs to be
segmented, flow can be passed to __skb_udp_tunnel_segment() which
calculates a negative tunnel header length. A negative tunnel header
length causes pskb_may_pull() to fail, dropping the packet.

This can be observed by attaching probes to ip_vs_in_hook(),
__dev_queue_xmit(), and __skb_udp_tunnel_segment():

    perf probe --add '__dev_queue_xmit skb->inner_mac_header \
    skb->inner_network_header skb->mac_header skb->network_header'
    perf probe --add '__skb_udp_tunnel_segment:7 tnl_hlen'
    perf probe -m ip_vs --add 'ip_vs_in_hook skb->inner_mac_header \
    skb->inner_network_header skb->mac_header skb->network_header'

These probes show the headers and tunnel header length for packets which
traverse the IPVS encapsulation path. A TCP packet can be forced into
the segmentation path by being smaller than a calculated clamped MSS,
but larger than the advertised MSS.

    probe:ip_vs_in_hook: inner_mac_header=0x0 inner_network_header=0x0 mac_header=0x44 network_header=0x52
    probe:ip_vs_in_hook: inner_mac_header=0x44 inner_network_header=0x52 mac_header=0x44 network_header=0x32
    probe:dev_queue_xmit: inner_mac_header=0x44 inner_network_header=0x52 mac_header=0x44 network_header=0x32
    probe:__skb_udp_tunnel_segment_L7: tnl_hlen=-2

When using veth-based encapsulation, the interfaces are set to be
mac-less, which does not preserve space for an inner mac header. This
prevents this issue from occurring.

In our real-world testing of sending a 32KB file we observed operation
time increasing from ~75ms for veth-based encapsulation to over 1.5s
using IPVS encapsulation due to retries from dropped packets.

This changeset modifies the packet on the encapsulation path in
ip_vs_tunnel_xmit() and ip_vs_tunnel_xmit_v6() to remove the inner mac
header offset. This fixes UDP segmentation for both encapsulation types,
and corrects the inner headers for any IPIP flows that may use it.

Fixes: 84c0d5e96f3a ("ipvs: allow tunneling with gue encapsulation")
Signed-off-by: Terin Stock <terin@cloudflare.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/ipvs/ip_vs_xmit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index feb1d7fcb09f..a80b960223e1 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -1207,6 +1207,7 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp,
 	skb->transport_header = skb->network_header;
 
 	skb_set_inner_ipproto(skb, next_protocol);
+	skb_set_inner_mac_header(skb, skb_inner_network_offset(skb));
 
 	if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GUE) {
 		bool check = false;
@@ -1349,6 +1350,7 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct ip_vs_conn *cp,
 	skb->transport_header = skb->network_header;
 
 	skb_set_inner_ipproto(skb, next_protocol);
+	skb_set_inner_mac_header(skb, skb_inner_network_offset(skb));
 
 	if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GUE) {
 		bool check = false;
-- 
2.30.2


Thread overview: 19+ messages
2023-06-20  9:35 [PATCH net 00/14,v2] Netfilter/IPVS fixes for net Pablo Neira Ayuso
2023-06-20  9:35 ` Pablo Neira Ayuso [this message]
2023-06-20  9:35 ` [PATCH net 02/14] netfilter: nf_tables: fix chain binding transaction logic Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 03/14] netfilter: nf_tables: add NFT_TRANS_PREPARE_ERROR to deal with bound set/chain Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 04/14] netfilter: nf_tables: drop map element references from preparation phase Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 05/14] netfilter: nft_set_pipapo: .walk does not deal with generations Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 06/14] netfilter: nf_tables: fix underflow in object reference counter Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 07/14] netfilter: nf_tables: disallow element updates of bound anonymous sets Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 08/14] netfilter: nf_tables: reject unbound anonymous set before commit phase Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 09/14] netfilter: nf_tables: reject unbound chain " Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 10/14] netfilter: nf_tables: disallow updates of anonymous sets Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 11/14] netfilter: nf_tables: disallow timeout for " Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 12/14] netfilter: nf_tables: drop module reference after updating chain Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 13/14] netfilter: nfnetlink_osf: fix module autoload Pablo Neira Ayuso
2023-06-20  9:35 ` [PATCH net 14/14] netfilter: nf_tables: Fix for deleting base chains with payload Pablo Neira Ayuso
2023-06-20 17:00 ` [PATCH net 00/14,v2] Netfilter/IPVS fixes for net Pablo Neira Ayuso
  -- strict thread matches above, loose matches on Subject: below --
2023-06-21 10:07 [PATCH net,v3 00/14] " Pablo Neira Ayuso
2023-06-21 10:07 ` [PATCH net 01/14] ipvs: align inner_mac_header for encapsulation Pablo Neira Ayuso
2023-06-22 14:30   ` patchwork-bot+netdevbpf
2023-06-19 14:57 [PATCH net 00/14] Netfilter/IPVS fixes for net Pablo Neira Ayuso
2023-06-19 14:57 ` [PATCH net 01/14] ipvs: align inner_mac_header for encapsulation Pablo Neira Ayuso
