netdev.vger.kernel.org archive mirror
* [RFC ipsec-next] xfrm: fix tunnel mode TX datapath in packet offload mode
@ 2025-02-05 18:41 Leon Romanovsky
  2025-02-14  9:34 ` Steffen Klassert
  0 siblings, 1 reply; 3+ messages in thread
From: Leon Romanovsky @ 2025-02-05 18:41 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Alexandre Cassen, netdev

From: Alexandre Cassen <acassen@corp.free.fr>

Packets that match the output xfrm policy are delivered to the netstack.
In IPsec packet offload mode with tunnel mode, the HW is responsible for
building the hard header and the outer IP header. In that case, the inner
header may refer to a network that is not directly reachable by the host,
so neighbor resolution fails and the packet is dropped. The xfrm policy
defines the netdevice to use for xmit, so we can send packets directly to it.

This fix also provides a performance improvement for transport mode, since
there is no need to perform neighbor resolution if the HW is already configured
to do so.

Fixes: f8a70afafc17 ("xfrm: add TX datapath support for IPsec packet offload mode")
Signed-off-by: Alexandre Cassen <acassen@corp.free.fr>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
Steffen,

I'm sending this patch as is to get feedback on whether it is the right approach.

Thanks
---
 net/xfrm/xfrm_output.c | 38 ++++++++++++++++++++++++++++++++++++--
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/net/xfrm/xfrm_output.c b/net/xfrm/xfrm_output.c
index 34c8e266641c..4ad83b9ea0e9 100644
--- a/net/xfrm/xfrm_output.c
+++ b/net/xfrm/xfrm_output.c
@@ -495,7 +495,7 @@ static int xfrm_output_one(struct sk_buff *skb, int err)
 	struct xfrm_state *x = dst->xfrm;
 	struct net *net = xs_net(x);
 
-	if (err <= 0 || x->xso.type == XFRM_DEV_OFFLOAD_PACKET)
+	if (err <= 0)
 		goto resume;
 
 	do {
@@ -612,6 +612,40 @@ int xfrm_output_resume(struct sock *sk, struct sk_buff *skb, int err)
 }
 EXPORT_SYMBOL_GPL(xfrm_output_resume);
 
+static int xfrm_dev_direct_output(struct sock *sk, struct xfrm_state *x,
+				  struct sk_buff *skb)
+{
+	struct dst_entry *dst = skb_dst(skb);
+	struct net *net = xs_net(x);
+	int err;
+
+	dst = skb_dst_pop(skb);
+	if (!dst) {
+		XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
+		kfree_skb(skb);
+		return -EHOSTUNREACH;
+	}
+	skb_dst_set(skb, dst);
+	nf_reset_ct(skb);
+
+	err = skb_dst(skb)->ops->local_out(net, sk, skb);
+	if (unlikely(err != 1)) {
+		kfree_skb(skb);
+		return err;
+	}
+
+	/* In transport mode, network destination is
+	 * directly reachable, while in tunnel mode,
+	 * inner packet network may not be. In packet
+	 * offload type, HW is responsible for hard
+	 * header packet mangling so directly xmit skb
+	 * to netdevice.
+	 */
+	skb->dev = x->xso.dev;
+	__skb_push(skb, skb->dev->hard_header_len);
+	return dev_queue_xmit(skb);
+}
+
 static int xfrm_output2(struct net *net, struct sock *sk, struct sk_buff *skb)
 {
 	return xfrm_output_resume(sk, skb, 1);
@@ -735,7 +769,7 @@ int xfrm_output(struct sock *sk, struct sk_buff *skb)
 			return -EHOSTUNREACH;
 		}
 
-		return xfrm_output_resume(sk, skb, 0);
+		return xfrm_dev_direct_output(sk, x, skb);
 	}
 
 	secpath_reset(skb);
-- 
2.48.1



* Re: [RFC ipsec-next] xfrm: fix tunnel mode TX datapath in packet offload mode
  2025-02-05 18:41 [RFC ipsec-next] xfrm: fix tunnel mode TX datapath in packet offload mode Leon Romanovsky
@ 2025-02-14  9:34 ` Steffen Klassert
  2025-02-17 12:27   ` Leon Romanovsky
  0 siblings, 1 reply; 3+ messages in thread
From: Steffen Klassert @ 2025-02-14  9:34 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Alexandre Cassen, netdev

On Wed, Feb 05, 2025 at 08:41:02PM +0200, Leon Romanovsky wrote:
> From: Alexandre Cassen <acassen@corp.free.fr>
> 
> +static int xfrm_dev_direct_output(struct sock *sk, struct xfrm_state *x,
> +				  struct sk_buff *skb)
> +{
> +	struct dst_entry *dst = skb_dst(skb);
> +	struct net *net = xs_net(x);
> +	int err;
> +
> +	dst = skb_dst_pop(skb);
> +	if (!dst) {
> +		XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
> +		kfree_skb(skb);
> +		return -EHOSTUNREACH;
> +	}
> +	skb_dst_set(skb, dst);
> +	nf_reset_ct(skb);
> +
> +	err = skb_dst(skb)->ops->local_out(net, sk, skb);
> +	if (unlikely(err != 1)) {
> +		kfree_skb(skb);
> +		return err;
> +	}
> +
> +	/* In transport mode, network destination is
> +	 * directly reachable, while in tunnel mode,
> +	 * inner packet network may not be. In packet
> +	 * offload type, HW is responsible for hard
> +	 * header packet mangling so directly xmit skb
> +	 * to netdevice.
> +	 */
> +	skb->dev = x->xso.dev;
> +	__skb_push(skb, skb->dev->hard_header_len);
> +	return dev_queue_xmit(skb);

I think this is absolutely the right thing for tunnel mode,
but on transport mode we might bypass some valid netfilter
rules.
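
One way to picture the concern above is to gate the direct-xmit path on the
state's mode, so that only tunnel mode skips the regular output path while
transport mode still goes through it and stays visible to netfilter. The
sketch below is a userspace model of that decision only, not kernel code and
not the follow-up patch: every sketch_* name is hypothetical, and the two
enum values merely stand in for the tunnel/transport distinction and for the
choice between xfrm_dev_direct_output() and xfrm_output_resume().

```c
/* Hypothetical stand-ins for illustration; these are NOT kernel definitions. */
enum sketch_mode {
	SKETCH_MODE_TRANSPORT,
	SKETCH_MODE_TUNNEL,
};

enum sketch_path {
	SKETCH_PATH_RESUME,	/* models the regular xfrm_output_resume() path */
	SKETCH_PATH_DIRECT,	/* models xfrm_dev_direct_output() */
};

struct sketch_state {
	enum sketch_mode mode;
	int packet_offload;	/* non-zero models XFRM_DEV_OFFLOAD_PACKET */
};

/*
 * Pick the TX path for a state. Only packet offload in tunnel mode
 * takes the direct path; everything else keeps the regular path,
 * where netfilter rules still apply.
 */
static enum sketch_path sketch_select_path(const struct sketch_state *x)
{
	if (x->packet_offload && x->mode == SKETCH_MODE_TUNNEL)
		return SKETCH_PATH_DIRECT;

	return SKETCH_PATH_RESUME;
}
```

Under these assumptions, a transport-mode state with packet offload would
still take the regular path, which is the behaviour the comment above asks
to preserve.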


* Re: [RFC ipsec-next] xfrm: fix tunnel mode TX datapath in packet offload mode
  2025-02-14  9:34 ` Steffen Klassert
@ 2025-02-17 12:27   ` Leon Romanovsky
  0 siblings, 0 replies; 3+ messages in thread
From: Leon Romanovsky @ 2025-02-17 12:27 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Alexandre Cassen, netdev

On Fri, Feb 14, 2025 at 10:34:58AM +0100, Steffen Klassert wrote:
> On Wed, Feb 05, 2025 at 08:41:02PM +0200, Leon Romanovsky wrote:
> > From: Alexandre Cassen <acassen@corp.free.fr>
> > 
> > +static int xfrm_dev_direct_output(struct sock *sk, struct xfrm_state *x,
> > +				  struct sk_buff *skb)
> > +{
> > +	struct dst_entry *dst = skb_dst(skb);
> > +	struct net *net = xs_net(x);
> > +	int err;
> > +
> > +	dst = skb_dst_pop(skb);
> > +	if (!dst) {
> > +		XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
> > +		kfree_skb(skb);
> > +		return -EHOSTUNREACH;
> > +	}
> > +	skb_dst_set(skb, dst);
> > +	nf_reset_ct(skb);
> > +
> > +	err = skb_dst(skb)->ops->local_out(net, sk, skb);
> > +	if (unlikely(err != 1)) {
> > +		kfree_skb(skb);
> > +		return err;
> > +	}
> > +
> > +	/* In transport mode, network destination is
> > +	 * directly reachable, while in tunnel mode,
> > +	 * inner packet network may not be. In packet
> > +	 * offload type, HW is responsible for hard
> > +	 * header packet mangling so directly xmit skb
> > +	 * to netdevice.
> > +	 */
> > +	skb->dev = x->xso.dev;
> > +	__skb_push(skb, skb->dev->hard_header_len);
> > +	return dev_queue_xmit(skb);
> 
> I think this is absolutely the right thing for tunnel mode,
> but on transport mode we might bypass some valid netfilter
> rules.

Thanks for the acknowledgment. We will prepare a proper patch.

