All of lore.kernel.org
 help / color / mirror / Atom feed
From: Steffen Klassert <steffen.klassert@secunet.com>
To: Sabrina Dubroca <sd@queasysnail.net>
Cc: <netdev@vger.kernel.org>, <devel@linux-ipsec.org>
Subject: Re: [PATCH RFC ipsec-next] esp: Consolidate esp4 and esp6.
Date: Mon, 12 Jan 2026 13:23:51 +0100	[thread overview]
Message-ID: <aWTn12LAh-KnSoeM@secunet.com> (raw)
In-Reply-To: <aP-jXvmys9D37Hp6@krikkit>

On Mon, Oct 27, 2025 at 05:52:46PM +0100, Sabrina Dubroca wrote:
> 2025-10-22, 08:03:07 +0200, Steffen Klassert wrote:
> > This patch merges common code of esp4.c and esp6.c into
> > xfrm_esp.c. This almost halves the size of the ESP
> > implementation for the prize of three indirect calls
> > on UDP/TCP encapsulation.
> 
> If that turns out to be a problem, maybe we can figure out a way to
> use the INDIRECT_CALL* helpers here as well (but currently they don't
> work for modules, I played with that a while back and could look into
> it again).

I have no indication that it could be a problem so far,
but would be nice to have the INDIRECT_CALL* helpers
for modules anyway!

> > -int esp_input_done2(struct sk_buff *skb, int err)
> > -{
> [...]
> > +	if (iph->saddr != x->props.saddr.a4 ||
> > +	    source != encap->encap_sport) {
> > +		xfrm_address_t ipaddr;
> > +
> > +		ipaddr.a4 = iph->saddr;
> > +		km_new_mapping(x, &ipaddr, source);
> > +
> > +		/* XXX: perhaps add an extra
> > +		 * policy check here, to see
> > +		 * if we should allow or
> > +		 * reject a packet from a
> > +		 * different source
> > +		 * address/port.
> >  		 */
> 
> Maybe we can get rid of those "XXX" comments? Unless you think the
> suggestion still makes sense. But the comments (here and in
> esp6_input_done2) have been here a long time and it doesn't seem to
> bother users.

Good point, I'll remove them in the next version.

> 
> [...]
> >  static int esp_init_state(struct xfrm_state *x, struct netlink_ext_ack *extack)
> 
> Looking at esp_init_state and esp6_init_state, they also have a lot in
> common (pretty much everything except some x->props.header_len
> adjustments) that could be extracted into the new file.

Yes, true. This patch is just a start, there is much more possible.

> [...]
> > +static int esp6_input_encap(struct sk_buff *skb, struct xfrm_state *x)
> >  {
> > +	const struct ipv6hdr *ip6h = ipv6_hdr(skb);
> > +	int offset = skb_network_offset(skb) + sizeof(*ip6h);
> > +	int hdr_len = skb_network_header_len(skb);
> 
> As Simon mentioned, it's set/incremented but not used. The current
> code in esp6_input_done2 would then use that value in
> skb_set_transport_header, but now the common esp_input_done2 calls
> ->input_encap and doesn't get that increased value back. I think we'll
> need to have esp_input_done2 pass &hdr_len when it calls the
> ->input_encap to get the correct value and take into account ipv6
> exthdrs.

Either this, or ->input_encap returns the offset of the ipv6
exthdrs and then add this value to hdr_len in esp_input_done2().

> [...]
> > +static void esp_output_done(void *data, int err)
> > +{
> > +	struct sk_buff *skb = data;
> > +	struct xfrm_offload *xo = xfrm_offload(skb);
> > +	void *tmp;
> > +	struct xfrm_state *x;
> > +
> > +	if (xo && (xo->flags & XFRM_DEV_RESUME)) {
> > +		struct sec_path *sp = skb_sec_path(skb);
> > +
> > +		x = sp->xvec[sp->len - 1];
> > +	} else {
> > +		x = skb_dst(skb)->xfrm;
> > +	}
> > +
> > +	tmp = ESP_SKB_CB(skb)->tmp;
> > +	esp_ssg_unref(x, tmp, skb);
> > +	kfree(tmp);
> > +
> > +	x->type->output_encap_csum(skb);
> 
> Since the ipv4 variant doesn't do anything, maybe
> 
> if (x->type->output_encap_csum)
> 	x->type->output_encap_csum(skb);
> 
> and don't set it in esp_type?
> 
> OTOH it's nice to have a consistent "all CBs are always defined" and
> just call them unconditionally.

Let's keeep the CBs always defined.

> > +static struct ip_esp_hdr *esp_output_udp_encap(struct sk_buff *skb,
> > +					       int encap_type,
> > +					       struct esp_info *esp,
> > +					       __be16 sport,
> > +					       __be16 dport)
> > +{
> > +	struct udphdr *uh;
> > +	unsigned int len;
> > +	struct xfrm_offload *xo = xfrm_offload(skb);
> > +
> > +	len = skb->len + esp->tailen - skb_transport_offset(skb);
> > +	if (len + sizeof(struct iphdr) > IP_MAX_MTU)
> 
> This is now used by both IPv4 and IPv6, and esp6_output_udp_encap
> didn't have any adjustment in this test.

I wonder whether this check makes sense here at all. We add ESP
headers and trailers, and just in case of UDP/TCP encapsulation,
we check the size of the resulting packet. If we need such a check,
it should be on a more generic place.

Another thing I noticed when looking into this. The code
above is from ipv4 udpencap. All other encap functions do:

len = skb->len + esp->tailen - skb_transport_offset(skb);
if (len > IP_MAX_MTU)
	return ERR_PTR(-EMSGSIZE);

I think this is 'kind of' correct for ipv6, but not for ipv4 tcpencap:

#define IP_MAX_MTU      0xFFFFU
#define IP6_MAX_MTU (0xFFFF + sizeof(struct ipv6hdr))

> > +		return ERR_PTR(-EMSGSIZE);
> > +
> > +	uh = (struct udphdr *)esp->esph;
> > +	uh->source = sport;
> > +	uh->dest = dport;
> > +	uh->len = htons(len);
> > +	uh->check = 0;
> > +
> > +	/* For IPv4 ESP with UDP encapsulation, if xo is not null, the skb is in the crypto offload
> > +	 * data path, which means that esp_output_udp_encap is called outside of the XFRM stack.
> > +	 * In this case, the mac header doesn't point to the IPv4 protocol field, so don't set it.
> > +	 */
> > +	if (!xo || encap_type != UDP_ENCAP_ESPINUDP)
> > +		*skb_mac_header(skb) = IPPROTO_UDP;
> 
> That was absent in esp6_output_udp_encap, commit 447bc4b1906f ("xfrm:
> Support crypto offload for outbound IPv4 UDP-encapsulated ESP packet")
> only took care of IPv4. I'm not sure adding this to the IPv6 code
> without adapting the rest of 447bc4b1906f in the esp6_offload code is
> correct.

Hm, that would mean ipv6 udpencap offload can be configured,
but does not work in the current codebase. Maybe this needs
a separate fix.

> > +
> > +	return (struct ip_esp_hdr *)(uh + 1);
> > +}
> 
> [...]
> > + int esp_output(struct xfrm_state *x, struct sk_buff *skb)
> 
> nit: ' ' at the start of the line

Fixed.

Thanks!

      parent reply	other threads:[~2026-01-12 12:24 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-22  6:03 [PATCH RFC ipsec-next] esp: Consolidate esp4 and esp6 Steffen Klassert
2025-10-22 23:32 ` kernel test robot
2025-10-27 15:48 ` Simon Horman
2025-10-27 16:52 ` Sabrina Dubroca
2025-10-27 17:14   ` [devel-ipsec] " Paul Wouters
2026-01-12 12:23   ` Steffen Klassert [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aWTn12LAh-KnSoeM@secunet.com \
    --to=steffen.klassert@secunet.com \
    --cc=devel@linux-ipsec.org \
    --cc=netdev@vger.kernel.org \
    --cc=sd@queasysnail.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.