public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Steffen Klassert <steffen.klassert@secunet.com>
To: Sabrina Dubroca <sd@queasysnail.net>
Cc: <netdev@vger.kernel.org>, Simon Horman <horms@kernel.org>,
	Tobias Brunner <tobias@strongswan.org>,
	Herbert Xu <herbert@gondor.apana.org.au>, <devel@linux-ipsec.org>
Subject: Re: [PATCH ipsec-next] esp: Consolidate esp4 and esp6.
Date: Thu, 12 Mar 2026 10:37:40 +0100	[thread overview]
Message-ID: <abKJZHIr2THzZk09@secunet.com> (raw)
In-Reply-To: <aaVzz834HRXnR-to@krikkit>

On Mon, Mar 02, 2026 at 12:26:07PM +0100, Sabrina Dubroca wrote:
> 2026-02-26, 08:53:37 +0100, Steffen Klassert wrote:
> > This patch merges common code of esp4.c and esp6.c into
> > xfrm_esp.c. This almost halves the size of the ESP
> > implementation for the price of three indirect calls
> > on UDP/TCP encapsulation. No functional changes.
> > 
> > Changes from the RFC version:
> > 
> > - Fix a typo in the commit mesage.
> 
> the commit "mesage"? :)

It was not AI generated :)

> > - Remove some old comments that don't make sense anymore.
> > 
> > - Let the ->input_encap functions return the needed offsets.
> > 
> > - Remove the IP_MAX_MTU check from UDP/TCP encap.
> >   The IPv4/IPv6 local_out function will do that ceck later.
> > 
> > - The comment on IPv4 ESP offload with UDP encapsulation
> >   is true for IPv4 and IPv6, so remove the IPv4 from the
> >   comment.
> > 
> > Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> > Tested-by: Tobias Brunner <tobias@strongswan.org>
> > ---
> >  include/net/esp.h       |    6 +
> >  include/net/xfrm.h      |    4 +
> >  net/ipv4/esp4.c         | 1065 ++------------------------------------
> >  net/ipv6/esp6.c         | 1081 ++-------------------------------------
> >  net/ipv6/esp6_offload.c |    6 +-
> >  net/xfrm/Makefile       |    1 +
> >  net/xfrm/xfrm_esp.c     | 1017 ++++++++++++++++++++++++++++++++++++
> >  7 files changed, 1132 insertions(+), 2048 deletions(-)
> >  create mode 100644 net/xfrm/xfrm_esp.c
> > 
> > diff --git a/include/net/esp.h b/include/net/esp.h
> > index 322950727dd0..5761c762c69a 100644
> > --- a/include/net/esp.h
> > +++ b/include/net/esp.h
> > @@ -47,4 +47,10 @@ int esp_input_done2(struct sk_buff *skb, int err);
> >  int esp6_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *esp);
> >  int esp6_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *esp);
> >  int esp6_input_done2(struct sk_buff *skb, int err);
> 
> Those 3 are gone now and can be removed.

Done.

> > +int esp_init_aead(struct xfrm_state *x, struct netlink_ext_ack *extack);
> > +int esp_init_authenc(struct xfrm_state *x, struct netlink_ext_ack *extack);
> > +void esp_destroy(struct xfrm_state *x);
> > +int esp_input(struct xfrm_state *x, struct sk_buff *skb);
> > +int esp_output(struct xfrm_state *x, struct sk_buff *skb);
> > +
> >  #endif
> > diff --git a/include/net/xfrm.h b/include/net/xfrm.h
> > index 0a14daaa5dd4..e3217cafe582 100644
> > --- a/include/net/xfrm.h
> > +++ b/include/net/xfrm.h
> > @@ -455,7 +455,11 @@ struct xfrm_type {
> >  					      struct netlink_ext_ack *extack);
> >  	void			(*destructor)(struct xfrm_state *);
> >  	int			(*input)(struct xfrm_state *, struct sk_buff *skb);
> > +	int			(*input_encap)(struct sk_buff *skb, struct xfrm_state *x);
> >  	int			(*output)(struct xfrm_state *, struct sk_buff *pskb);
> > +	struct sock		*(*find_tcp_sk)(struct xfrm_state *x);
> > +	void			(*output_encap_csum)(struct sk_buff *skb);
> > +	int			(*output_tcp_encap)(struct xfrm_state *x, struct sk_buff *skb);
> 
> This CB is unused.

Removed.

> > diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
> > index 2c922afadb8f..6c5568a96f0a 100644
> > --- a/net/ipv4/esp4.c
> > +++ b/net/ipv4/esp4.c
> [...]
> > +static int esp4_input_encap(struct sk_buff *skb, struct xfrm_state *x)
> >  {
> > +	const struct iphdr *iph = ip_hdr(skb);
> > +	int ihl = iph->ihl * 4;
> >  	struct xfrm_encap_tmpl *encap = x->encap;
> [...]
> > +	struct tcphdr *th = (void *)(skb_network_header(skb) + ihl);
> > +	struct udphdr *uh = (void *)(skb_network_header(skb) + ihl);
> > +	int ret = 0;
> > +	__be16 source;
> >  
> > +	switch (x->encap->encap_type) {
> >  	case TCP_ENCAP_ESPINTCP:
> > -		esph = esp_output_tcp_encap(x, skb, esp);
> > +		source = th->source;
> > +		ret -= sizeof(struct tcphdr);
> 
> I'm trying to check how all those offset combine, isn't that
> susceptible to the same problem as 17175d1a27c6 ("xfrm: esp6: fix
> encapsulation header offset computation")?
> 
> Also, since those new ->input_encap should return a negative value on
> success, the "return -1" for error is a bit confusing.

Yes, that was to align with IPv6, ipv6_skip_exthdr() returns a negative
value. I did not expect this and got a crash on my first try.

> Maybe:
>  - make them return +1 on error, without going through "goto out"
>    (return -1 on error should be fine since that's not an expected
>    offset)
>  - make the final return (the one that returns the negative offset)
>    have a DEBUG_NET_WARN_ON_ONCE(ret >= 0)
> 
> Or if my math is correct, the caller's hdr_len after adding
> ->input_encap's return value is the full length of the ipv* header
> including options/extensions, so let ->input_encap return that
> directly instead of doing the "hdr_len += ret" dance (and keep the
> return -1 on error)? That would make things a bit clearer than
> "skb_network_header_len + skb_network_offset + [whatever
> ipv6_skip_exthdr returns]".

I'll try to make this a bit clearer in the next version.

> [...]
> > @@ -1164,13 +194,16 @@ static int esp4_rcv_cb(struct sk_buff *skb, int err)
> >  
> >  static const struct xfrm_type esp_type =
> >  {
> > -	.owner		= THIS_MODULE,
> > -	.proto	     	= IPPROTO_ESP,
> > -	.flags		= XFRM_TYPE_REPLAY_PROT,
> > -	.init_state	= esp_init_state,
> > -	.destructor	= esp_destroy,
> > -	.input		= esp_input,
> > -	.output		= esp_output,
> > +	.owner			= THIS_MODULE,
> > +	.proto	     		= IPPROTO_ESP,
> 
> nit: since you're reworking the whitespace, this is:
> 
> \t.proto\t     \t\t= IPPROTO_ESP,
> 
> (with a few spaces in the middle)

Fixed.

> [...]
> > diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
> > index e75da98f5283..b90e93fc1a70 100644
> > --- a/net/ipv6/esp6.c
> > +++ b/net/ipv6/esp6.c
> [...]
> > -int esp6_input_done2(struct sk_buff *skb, int err)
> > -{
> [...]
> > -		/*
> > -		 * 2) ignore UDP/TCP checksums in case
> > -		 *    of NAT-T in Transport Mode, or
> > -		 *    perform other post-processing fixes
> > -		 *    as per draft-ietf-ipsec-udp-encaps-06,
> > -		 *    section 3.1.2
> > -		 */
> > -		if (x->props.mode == XFRM_MODE_TRANSPORT)
> > -			skb->ip_summed = CHECKSUM_UNNECESSARY;
> >  	}
> >  
> > -	skb_postpull_rcsum(skb, skb_network_header(skb),
> > -			   skb_network_header_len(skb));
> 
> This doesn't get carried over to the new common function
> esp_input_done2. It originally came from commit a9b28c2bf05d ("esp6:
> Fix RX checksum after header pull"), not sure if it's still needed but
> I guess so?

I think this skb_postpull_rcsum() is misplaced here, and the
skb_pull_rcsum() call below as well. esp_input and esp6_input do:

skb->ip_summed = CHECKSUM_NONE;

So both function do nothing because the checksum is recomputed
anyway. The only case where we need this is HW offloding because
this does not call esp_input/esp6_input. So both function calls
skb_postpull_rcsum() and skb_pull_rcsum() should go to 
esp6_input_tail, and maybe even to esp_input_tail. Looks like
esp4 has the same problem, but we do it just for esp6.

I'll move these calls in the next version.

> > -	skb_pull_rcsum(skb, hlen);
> > -	if (x->props.mode == XFRM_MODE_TUNNEL ||
> > -	    x->props.mode == XFRM_MODE_IPTFS)
> > -		skb_reset_transport_header(skb);
> > -	else
> > -		skb_set_transport_header(skb, -hdr_len);
> > -
> > -	/* RFC4303: Drop dummy packets without any error */
> > -	if (err == IPPROTO_NONE)
> > -		err = -EINVAL;
> > -
> > -out:
> > -	return err;
> > -}
> > -EXPORT_SYMBOL_GPL(esp6_input_done2);
> 
> [...]
> > diff --git a/net/xfrm/xfrm_esp.c b/net/xfrm/xfrm_esp.c
> > new file mode 100644
> > index 000000000000..f277dc114ba6
> > --- /dev/null
> > +++ b/net/xfrm/xfrm_esp.c
> [...]
> > +
> > +
> > +#ifdef CONFIG_INET_ESPINTCP
> > +static struct ip_esp_hdr *esp_output_tcp_encap(struct xfrm_state *x,
> > +						    struct sk_buff *skb,
> > +						    struct esp_info *esp)
> 
> nit: alignment
> 
> (checkpatch complains about this and a few other whitespace things in
> this new file, probably all carried over from the existing code)

Fixed.

> [...]
> > +EXPORT_SYMBOL_GPL(esp_init_authenc);
> > +
> > +MODULE_LICENSE("GPL");
> > +MODULE_DESCRIPTION("Generic ESP");
> > +
> 
> nit: git complains about the empty line

Fixed.

Thank a lot for your review Sabrina!

      reply	other threads:[~2026-03-12  9:37 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-26  7:53 [PATCH ipsec-next] esp: Consolidate esp4 and esp6 Steffen Klassert
2026-03-01 17:03 ` Simon Horman
2026-03-02 11:26 ` Sabrina Dubroca
2026-03-12  9:37   ` Steffen Klassert [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=abKJZHIr2THzZk09@secunet.com \
    --to=steffen.klassert@secunet.com \
    --cc=devel@linux-ipsec.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=horms@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=sd@queasysnail.net \
    --cc=tobias@strongswan.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox