netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Simon Horman <horms@kernel.org>
To: Christian Hopps <chopps@chopps.org>
Cc: devel@linux-ipsec.org,
	Steffen Klassert <steffen.klassert@secunet.com>,
	netdev@vger.kernel.org, Christian Hopps <chopps@labn.net>
Subject: Re: [PATCH ipsec-next v5 09/17] xfrm: iptfs: add user packet (tunnel ingress) handling
Date: Mon, 15 Jul 2024 13:55:55 +0100	[thread overview]
Message-ID: <20240715125555.GC45692@kernel.org> (raw)
In-Reply-To: <20240714202246.1573817-10-chopps@chopps.org>

On Sun, Jul 14, 2024 at 04:22:37PM -0400, Christian Hopps wrote:
> From: Christian Hopps <chopps@labn.net>
> 
> Add tunnel packet output functionality. This is code handles
> the ingress to the tunnel.
> 
> Signed-off-by: Christian Hopps <chopps@labn.net>
> ---
>  net/xfrm/xfrm_iptfs.c | 535 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 532 insertions(+), 3 deletions(-)
> 
> diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c

...

> +static int iptfs_get_cur_pmtu(struct xfrm_state *x,
> +			      struct xfrm_iptfs_data *xtfs, struct sk_buff *skb)
> +{
> +	struct xfrm_dst *xdst = (struct xfrm_dst *)skb_dst(skb);
> +	u32 payload_mtu = xtfs->payload_mtu;
> +	u32 pmtu = iptfs_get_inner_mtu(x, xdst->child_mtu_cached);

Hi Christian,

Please consider arranging local variable declarations in Networking
code in reverse xmas tree order - longest line to shortest.
I think that in this case that would involve separating the
declaration and assignment of pmtu.

Edward Cree's tool can be helpful here:
https://github.com/ecree-solarflare/xmastree

> +
> +	if (payload_mtu && payload_mtu < pmtu)
> +		pmtu = payload_mtu;
> +
> +	return pmtu;
> +}

...

> +/* IPv4/IPv6 packet ingress to IPTFS tunnel, arrange to send in IPTFS payload
> + * (i.e., aggregating or fragmenting as appropriate).
> + * This is set in dst->output for an SA.
> + */
> +static int iptfs_output_collect(struct net *net, struct sock *sk,
> +				struct sk_buff *skb)
> +{
> +	struct dst_entry *dst = skb_dst(skb);
> +	struct xfrm_state *x = dst->xfrm;
> +	struct xfrm_iptfs_data *xtfs = x->mode_data;
> +	struct sk_buff *segs, *nskb;
> +	u32 pmtu = 0;
> +	bool ok = true;
> +	bool was_gso;
> +
> +	/* We have hooked into dst_entry->output which means we have skipped the
> +	 * protocol specific netfilter (see xfrm4_output, xfrm6_output).
> +	 * when our timer runs we will end up calling xfrm_output directly on
> +	 * the encapsulated traffic.
> +	 *
> +	 * For both cases this is the NF_INET_POST_ROUTING hook which allows
> +	 * changing the skb->dst entry which then may not be xfrm based anymore
> +	 * in which case a REROUTED flag is set. and dst_output is called.
> +	 *
> +	 * For IPv6 we are also skipping fragmentation handling for local
> +	 * sockets, which may or may not be good depending on our tunnel DF
> +	 * setting. Normally with fragmentation supported we want to skip this
> +	 * fragmentation.
> +	 */
> +
> +	BUG_ON(!xtfs);
> +
> +	pmtu = iptfs_get_cur_pmtu(x, xtfs, skb);
> +
> +	/* Break apart GSO skbs. If the queue is nearing full then we want the
> +	 * accounting and queuing to be based on the individual packets not on the
> +	 * aggregate GSO buffer.
> +	 */
> +	was_gso = skb_is_gso(skb);
> +	if (!was_gso) {
> +		segs = skb;
> +	} else {
> +		segs = skb_gso_segment(skb, 0);
> +		if (IS_ERR_OR_NULL(segs)) {
> +			XFRM_INC_STATS(net, LINUX_MIB_XFRMOUTERROR);
> +			kfree_skb(skb);
> +			return PTR_ERR(segs);

This will return 0 is skb_gso_segment returns NULL,
which occurs if skb doesn't require segmentation.
Is that intentional?

If so, I wonder if it would be slightly nicer
to use PTR_ERR_OR_ZERO() instead of PTR_ERR().

Flagged by Smatch (suggestion is my own).
 (suggestion is my own)
> +		}
> +		consume_skb(skb);
> +		skb = NULL;
> +	}
> +
> +	/* We can be running on multiple cores and from the network softirq or
> +	 * from user context depending on where the packet is coming from.
> +	 */
> +	spin_lock_bh(&x->lock);
> +
> +	skb_list_walk_safe(segs, skb, nskb)
> +	{
> +		skb_mark_not_on_list(skb);
> +
> +		/* Once we drop due to no queue space we continue to drop the
> +		 * rest of the packets from that GRO.
> +		 */
> +		if (!ok) {
> +nospace:
> +			if (skb->dev)
> +				XFRM_INC_STATS(dev_net(skb->dev),
> +					       LINUX_MIB_XFRMOUTNOQSPACE);
> +			kfree_skb_reason(skb, SKB_DROP_REASON_FULL_RING);
> +			continue;
> +		}
> +
> +		/* Fragmenting handled in following commits. */
> +		if (iptfs_is_too_big(sk, skb, pmtu)) {
> +			kfree_skb_reason(skb, SKB_DROP_REASON_PKT_TOO_BIG);
> +			continue;
> +		}
> +
> +		/* Enqueue to send in tunnel */
> +		ok = iptfs_enqueue(xtfs, skb);
> +		if (!ok)
> +			goto nospace;
> +	}
> +
> +	/* Start a delay timer if we don't have one yet */
> +	if (!hrtimer_is_queued(&xtfs->iptfs_timer)) {
> +		hrtimer_start(&xtfs->iptfs_timer, xtfs->init_delay_ns,
> +			      IPTFS_HRTIMER_MODE);
> +		xtfs->iptfs_settime = ktime_get_raw_fast_ns();
> +	}
> +
> +	spin_unlock_bh(&x->lock);
> +	return 0;
> +}

...

> +static enum hrtimer_restart iptfs_delay_timer(struct hrtimer *me)
> +{
> +	struct sk_buff_head list;
> +	struct xfrm_iptfs_data *xtfs;
> +	struct xfrm_state *x;
> +	time64_t settime;
> +
> +	xtfs = container_of(me, typeof(*xtfs), iptfs_timer);
> +	x = xtfs->x;
> +
> +	/* Process all the queued packets
> +	 *
> +	 * softirq execution order: timer > tasklet > hrtimer
> +	 *
> +	 * Network rx will have run before us giving one last chance to queue
> +	 * ingress packets for us to process and transmit.
> +	 */
> +
> +	spin_lock(&x->lock);
> +	__skb_queue_head_init(&list);
> +	skb_queue_splice_init(&xtfs->queue, &list);
> +	xtfs->queue_size = 0;
> +	settime = xtfs->iptfs_settime;

nit: settime is set but otherwise unused in this function.

     Flagged by W=1 x86_64 allmodconfig builds with gcc-14 and clang-18.

> +	spin_unlock(&x->lock);
> +
> +	/* After the above unlock, packets can begin queuing again, and the
> +	 * timer can be set again, from another CPU either in softirq or user
> +	 * context (not from this one since we are running at softirq level
> +	 * already).
> +	 */
> +
> +	iptfs_output_queued(x, &list);
> +
> +	return HRTIMER_NORESTART;
> +}

...

> @@ -98,10 +607,23 @@ static int iptfs_copy_to_user(struct xfrm_state *x, struct sk_buff *skb)
>  {
>  	struct xfrm_iptfs_data *xtfs = x->mode_data;
>  	struct xfrm_iptfs_config *xc = &xtfs->cfg;
> -	int ret = 0;
> +	int ret;
> +	u64 q;
> +
> +	if (x->dir == XFRM_SA_DIR_OUT) {
> +		q = xtfs->init_delay_ns;
> +		(void)do_div(q, NSECS_IN_USEC);
> +		ret = nla_put_u32(skb, XFRMA_IPTFS_INIT_DELAY, q);
> +		if (ret)
> +			return ret;
> +
> +		ret = nla_put_u32(skb, XFRMA_IPTFS_MAX_QSIZE,
> +				  xc->max_queue_size);
> +		if (ret)
> +			return ret;
>  
> -	if (x->dir == XFRM_SA_DIR_OUT)
>  		ret = nla_put_u32(skb, XFRMA_IPTFS_PKT_SIZE, xc->pkt_size);
> +	}

ret will be used uninitialised here unless the if condition above is true.

Flagged by W=1 x86_64 allmodconfig build with clang-18, and Smatch.

>  
>  	return ret;
>  }
...

  reply	other threads:[~2024-07-15 12:55 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-14 20:22 [PATCH ipsec-next v5 00/17] Add IP-TFS mode to xfrm Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 01/17] xfrm: config: add CONFIG_XFRM_IPTFS Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 02/17] include: uapi: add ip_tfs_*_hdr packet formats Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 03/17] include: uapi: add IPPROTO_AGGFRAG for AGGFRAG in ESP Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 04/17] xfrm: sysctl: allow configuration of global default values Christian Hopps
2024-07-18 13:13   ` Sabrina Dubroca
2024-07-18 13:15     ` Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 05/17] xfrm: netlink: add config (netlink) options Christian Hopps
2024-07-18 13:45   ` Sabrina Dubroca
2024-07-18 14:08     ` [devel-ipsec] " Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 06/17] xfrm: add mode_cbs module functionality Christian Hopps
2024-07-19  6:14   ` [devel-ipsec] " Florian Westphal
2024-07-20  0:06     ` Christian Hopps
2024-07-25 13:32   ` Sabrina Dubroca
2024-07-30 21:29     ` Christian Hopps
2024-07-31 17:10       ` Sabrina Dubroca
2024-07-31 18:32         ` Christian Hopps
2024-07-31 18:41           ` Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 07/17] xfrm: add generic iptfs defines and functionality Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 08/17] xfrm: iptfs: add new iptfs xfrm mode impl Christian Hopps
2024-07-15 12:39   ` Simon Horman
2024-07-18  5:56     ` [devel-ipsec] " Christian Hopps
2024-07-19  6:16   ` Florian Westphal
2024-07-20  0:30     ` Christian Hopps
2024-07-25 13:33   ` Sabrina Dubroca
2024-07-25 14:56     ` Christian Hopps
2024-07-31 16:29     ` Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 09/17] xfrm: iptfs: add user packet (tunnel ingress) handling Christian Hopps
2024-07-15 12:55   ` Simon Horman [this message]
2024-07-18  5:35     ` [devel-ipsec] " Christian Hopps
2024-07-18  6:32       ` Christian Hopps
2024-07-16 14:00   ` kernel test robot
2024-07-14 20:22 ` [PATCH ipsec-next v5 10/17] xfrm: iptfs: share page fragments of inner packets Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 11/17] xfrm: iptfs: add fragmenting of larger than MTU user packets Christian Hopps
2024-07-15 13:12   ` Simon Horman
2024-07-18  5:57     ` [devel-ipsec] " Christian Hopps
2024-07-18  6:56       ` Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 12/17] xfrm: iptfs: add basic receive packet (tunnel egress) handling Christian Hopps
2024-07-15 13:16   ` Simon Horman
2024-07-18  6:14     ` [devel-ipsec] " Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 13/17] xfrm: iptfs: handle received fragmented inner packets Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 14/17] xfrm: iptfs: add reusing received skb for the tunnel egress packet Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 15/17] xfrm: iptfs: add skb-fragment sharing code Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 16/17] xfrm: iptfs: handle reordering of received packets Christian Hopps
2024-07-14 20:22 ` [PATCH ipsec-next v5 17/17] xfrm: iptfs: add tracepoint functionality Christian Hopps
2024-07-15 11:53   ` kernel test robot
2024-07-17  5:21   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240715125555.GC45692@kernel.org \
    --to=horms@kernel.org \
    --cc=chopps@chopps.org \
    --cc=chopps@labn.net \
    --cc=devel@linux-ipsec.org \
    --cc=netdev@vger.kernel.org \
    --cc=steffen.klassert@secunet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).