public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
From: Steffen Klassert <steffen.klassert@secunet.com>
To: Sabrina Dubroca <sd@queasysnail.net>
Cc: <netdev@vger.kernel.org>, <devel@linux-ipsec.org>
Subject: Re: [PATCH RFC ipsec-next] esp: Consolidate esp4 and esp6.
Date: Mon, 12 Jan 2026 13:23:51 +0100	[thread overview]
Message-ID: <aWTn12LAh-KnSoeM@secunet.com> (raw)
In-Reply-To: <aP-jXvmys9D37Hp6@krikkit>

On Mon, Oct 27, 2025 at 05:52:46PM +0100, Sabrina Dubroca wrote:
> 2025-10-22, 08:03:07 +0200, Steffen Klassert wrote:
> > This patch merges common code of esp4.c and esp6.c into
> > xfrm_esp.c. This almost halves the size of the ESP
> > implementation for the prize of three indirect calls
> > on UDP/TCP encapsulation.
> 
> If that turns out to be a problem, maybe we can figure out a way to
> use the INDIRECT_CALL* helpers here as well (but currently they don't
> work for modules, I played with that a while back and could look into
> it again).

I have no indication that it could be a problem so far,
but would be nice to have the INDIRECT_CALL* helpers
for modules anyway!

> > -int esp_input_done2(struct sk_buff *skb, int err)
> > -{
> [...]
> > +	if (iph->saddr != x->props.saddr.a4 ||
> > +	    source != encap->encap_sport) {
> > +		xfrm_address_t ipaddr;
> > +
> > +		ipaddr.a4 = iph->saddr;
> > +		km_new_mapping(x, &ipaddr, source);
> > +
> > +		/* XXX: perhaps add an extra
> > +		 * policy check here, to see
> > +		 * if we should allow or
> > +		 * reject a packet from a
> > +		 * different source
> > +		 * address/port.
> >  		 */
> 
> Maybe we can get rid of those "XXX" comments? Unless you think the
> suggestion still makes sense. But the comments (here and in
> esp6_input_done2) have been here a long time and it doesn't seem to
> bother users.

Good point, I'll remove them in the next version.

> 
> [...]
> >  static int esp_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack)
> 
> Looking at esp_init_state and esp6_init_state, they also have a lot in
> common (pretty much everything except some x->props.header_len
> adjustments) that could be extracted into the new file.

Yes, true. This patch is just a start, there is much more possible.

> [...]
> > +static int esp6_input_encap(struct sk_buff *skb, struct xfrm_state *x)
> >  {
> > +	const struct ipv6hdr *ip6h = ipv6_hdr(skb);
> > +	int offset = skb_network_offset(skb) + sizeof(*ip6h);
> > +	int hdr_len = skb_network_header_len(skb);
> 
> As Simon mentioned, it's set/incremented but not used. The current
> code in esp6_input_done2 would then use that value in
> skb_set_transport_header, but now the common esp_input_done2 calls
> ->input_encap and doesn't get that increased value back. I think we'll
> need to have esp_input_done2 pass &hdr_len when it calls the
> ->input_encap to get the correct value and take into account ipv6
> exthdrs.

Either this, or ->input_encap returns the offset of the ipv6
exthdrs and then add this value to hdr_len in esp_input_done2().

> [...]
> > +static void esp_output_done(void *data, int err)
> > +{
> > +	struct sk_buff *skb = data;
> > +	struct xfrm_offload *xo = xfrm_offload(skb);
> > +	void *tmp;
> > +	struct xfrm_state *x;
> > +
> > +	if (xo && (xo->flags & XFRM_DEV_RESUME)) {
> > +		struct sec_path *sp = skb_sec_path(skb);
> > +
> > +		x = sp->xvec[sp->len - 1];
> > +	} else {
> > +		x = skb_dst(skb)->xfrm;
> > +	}
> > +
> > +	tmp = ESP_SKB_CB(skb)->tmp;
> > +	esp_ssg_unref(x, tmp, skb);
> > +	kfree(tmp);
> > +
> > +	x->type->output_encap_csum(skb);
> 
> Since the ipv4 variant doesn't do anything, maybe
> 
> if (x->type->output_encap_csum)
> 	x->type->output_encap_csum(skb);
> 
> and don't set it in esp_type?
> 
> OTOH it's nice to have a consistent "all CBs are always defined" and
> just call them unconditionally.

Let's keeep the CBs always defined.

> > +static struct ip_esp_hdr *esp_output_udp_encap(struct sk_buff *skb,
> > +					       int encap_type,
> > +					       struct esp_info *esp,
> > +					       __be16 sport,
> > +					       __be16 dport)
> > +{
> > +	struct udphdr *uh;
> > +	unsigned int len;
> > +	struct xfrm_offload *xo = xfrm_offload(skb);
> > +
> > +	len = skb->len + esp->tailen - skb_transport_offset(skb);
> > +	if (len + sizeof(struct iphdr) > IP_MAX_MTU)
> 
> This is now used by both IPv4 and IPv6, and esp6_output_udp_encap
> didn't have any adjustment in this test.

I wonder whether this check makes sense here at all. We add ESP
headers and trailers, and just in case of UDP/TCP encapsulation,
we check the size of the resulting packet. If we need such a check,
it should be on a more generic place.

Another thing I noticed when looking into this. The code
above is from ipv4 udpencap. All other encap functions do:

len = skb->len + esp->tailen - skb_transport_offset(skb);
if (len > IP_MAX_MTU)
	return ERR_PTR(-EMSGSIZE);

I think this is 'kind of' correct for ipv6, but not for ipv4 tcpencap:

#define IP_MAX_MTU      0xFFFFU
#define IP6_MAX_MTU (0xFFFF + sizeof(struct ipv6hdr))

> > +		return ERR_PTR(-EMSGSIZE);
> > +
> > +	uh = (struct udphdr *)esp->esph;
> > +	uh->source = sport;
> > +	uh->dest = dport;
> > +	uh->len = htons(len);
> > +	uh->check = 0;
> > +
> > +	/* For IPv4 ESP with UDP encapsulation, if xo is not null, the skb is in the crypto offload
> > +	 * data path, which means that esp_output_udp_encap is called outside of the XFRM stack.
> > +	 * In this case, the mac header doesn't point to the IPv4 protocol field, so don't set it.
> > +	 */
> > +	if (!xo || encap_type != UDP_ENCAP_ESPINUDP)
> > +		*skb_mac_header(skb) = IPPROTO_UDP;
> 
> That was absent in esp6_output_udp_encap, commit 447bc4b1906f ("xfrm:
> Support crypto offload for outbound IPv4 UDP-encapsulated ESP packet")
> only took care of IPv4. I'm not sure adding this to the IPv6 code
> without adapting the rest of 447bc4b1906f in the esp6_offload code is
> correct.

Hm, that would mean ipv6 udpencap offload can be configured,
but does not work in the current codebase. Maybe this needs
a separate fix.

> > +
> > +	return (struct ip_esp_hdr *)(uh + 1);
> > +}
> 
> [...]
> > + int esp_output(struct xfrm_state *x, struct sk_buff *skb)
> 
> nit: ' ' at the start of the line

Fixed.

Thanks!

      parent reply	other threads:[~2026-01-12 12:24 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-22  6:03 [PATCH RFC ipsec-next] esp: Consolidate esp4 and esp6 Steffen Klassert
2025-10-27 15:48 ` Simon Horman
2025-10-27 16:52 ` Sabrina Dubroca
2025-10-27 17:14   ` [devel-ipsec] " Paul Wouters
2026-01-12 12:23   ` Steffen Klassert [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aWTn12LAh-KnSoeM@secunet.com \
    --to=steffen.klassert@secunet.com \
    --cc=devel@linux-ipsec.org \
    --cc=netdev@vger.kernel.org \
    --cc=sd@queasysnail.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox