From: Leon Romanovsky <leon@kernel.org>
To: Cosmin Ratiu <cratiu@nvidia.com>
Cc: netdev@vger.kernel.org,
Steffen Klassert <steffen.klassert@secunet.com>,
Herbert Xu <herbert@gondor.apana.org.au>,
Paolo Abeni <pabeni@redhat.com>, Simon Horman <horms@kernel.org>,
Eric Dumazet <edumazet@google.com>,
"David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>,
Tariq Toukan <tariqt@nvidia.com>,
linux-kselftest@vger.kernel.org,
Dragos Tatulea <dtatulea@nvidia.com>,
Yael Chemla <ychemla@nvidia.com>
Subject: Re: [PATCH net] xfrm_output: Force software GSO only in tunnel mode
Date: Wed, 19 Feb 2025 16:28:12 +0200 [thread overview]
Message-ID: <20250219142812.GG53094@unreal> (raw)
In-Reply-To: <20250219105248.226962-1-cratiu@nvidia.com>
On Wed, Feb 19, 2025 at 12:52:48PM +0200, Cosmin Ratiu wrote:
> The cited commit fixed a software GSO bug with VXLAN + IPSec in tunnel
> mode. Unfortunately, it is slightly broader than necessary, as it also
> severely affects performance for Geneve + IPSec transport mode over a
> device capable of both HW GSO and IPSec crypto offload. In this case,
> xfrm_output unnecessarily triggers software GSO instead of letting the
> HW do it. In simple iperf3 tests over Geneve + IPSec transport mode over
> a back-2-back pair of NICs with MTU 1500, the performance was observed
> to be up to 6x worse when doing software GSO compared to leaving it to
> the hardware.
>
> This commit makes xfrm_output only trigger software GSO in crypto
> offload cases for already encapsulated packets in tunnel mode, as not
> doing so would then cause the inner tunnel skb->inner_networking_header
> to be overwritten and break software GSO for that packet later if the
> device turns out to not be capable of HW GSO.
>
> Taking a closer look at the conditions for the original bug, to better
> understand the reasons for this change:
> - vxlan_build_skb -> iptunnel_handle_offloads sets inner_protocol and
> inner network header.
> - then, udp_tunnel_xmit_skb -> ip_tunnel_xmit adds outer transport and
> network headers.
> - later in the xmit path, xfrm_output -> xfrm_outer_mode_output ->
> xfrm4_prepare_output -> xfrm4_tunnel_encap_add overwrites the inner
> network header with the one set in ip_tunnel_xmit before adding the
> second outer header.
> - __dev_queue_xmit -> validate_xmit_skb checks whether GSO segmentation
> needs to happen based on dev features. In the original bug, the hw
> couldn't segment the packets, so skb_gso_segment was invoked.
> - deep in the .gso_segment callback machinery, __skb_udp_tunnel_segment
> tries to use the wrong inner network header, expecting the one set in
> iptunnel_handle_offloads but getting the one set by xfrm instead.
> - a bit later, ipv6_gso_segment accesses the wrong memory based on that
> wrong inner network header.
>
> With the new change, the original bug (or similar ones) cannot happen
> again, as xfrm will now trigger software GSO before applying a tunnel.
> This concern doesn't exist in packet offload mode, when the HW adds
> encapsulation headers. For the non-offloaded packets (crypto in SW),
> software GSO is still done unconditionally in the else branch.
>
> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
> Reviewed-by: Yael Chemla <ychemla@nvidia.com>
> Fixes: a204aef9fd77 ("xfrm: call xfrm_output_gso when inner_protocol is set in xfrm_output")
> Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
> ---
> net/xfrm/xfrm_output.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
next prev parent reply other threads:[~2025-02-19 14:28 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-19 10:52 [PATCH net] xfrm_output: Force software GSO only in tunnel mode Cosmin Ratiu
2025-02-19 14:28 ` Leon Romanovsky [this message]
2025-02-25 11:30 ` Steffen Klassert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250219142812.GG53094@unreal \
--to=leon@kernel.org \
--cc=cratiu@nvidia.com \
--cc=davem@davemloft.net \
--cc=dtatulea@nvidia.com \
--cc=edumazet@google.com \
--cc=herbert@gondor.apana.org.au \
--cc=horms@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=steffen.klassert@secunet.com \
--cc=tariqt@nvidia.com \
--cc=ychemla@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).