From: Sabrina Dubroca <sd@queasysnail.net>
To: Christian Hopps <chopps@chopps.org>
Cc: devel@linux-ipsec.org,
Steffen Klassert <steffen.klassert@secunet.com>,
netdev@vger.kernel.org, Christian Hopps <chopps@labn.net>
Subject: Re: [PATCH ipsec-next v8 10/16] xfrm: iptfs: add fragmenting of larger than MTU user packets
Date: Mon, 5 Aug 2024 00:25:57 +0200 [thread overview]
Message-ID: <Zq__9Z4ckXNdR-Ec@hog> (raw)
In-Reply-To: <20240804203346.3654426-11-chopps@chopps.org>
Please CC the reviewers from previous versions of the patchset. It's
really hard to keep track of discussions and reposts otherwise.
2024-08-04, 16:33:39 -0400, Christian Hopps wrote:
> From: Christian Hopps <chopps@labn.net>
>
> Add support for tunneling user (inner) packets that are larger than the
> tunnel's path MTU (outer) using IP-TFS fragmentation.
>
> Signed-off-by: Christian Hopps <chopps@labn.net>
> ---
> net/xfrm/xfrm_iptfs.c | 407 +++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 381 insertions(+), 26 deletions(-)
>
> diff --git a/net/xfrm/xfrm_iptfs.c b/net/xfrm/xfrm_iptfs.c
> index 20c19894720e..38735e2d64c3 100644
> --- a/net/xfrm/xfrm_iptfs.c
> +++ b/net/xfrm/xfrm_iptfs.c
> @@ -46,12 +46,23 @@
> */
> #define IPTFS_DEFAULT_MAX_QUEUE_SIZE (1024 * 10240)
>
> +/* 1) skb->head should be cache aligned.
> + * 2) when resv is for L2 headers (i.e., ethernet) we want the cacheline to
> + * start -16 from data.
> + * 3) when resv is for L3+L2 headers IOW skb->data points at the IPTFS payload
> + * we want data to be cache line aligned so all the pushed headers will be in
> + * another cacheline.
> + */
> +#define XFRM_IPTFS_MIN_L3HEADROOM 128
> +#define XFRM_IPTFS_MIN_L2HEADROOM (64 + 16)
How did you pick those values?
> +static struct sk_buff *iptfs_alloc_skb(struct sk_buff *tpl, u32 len,
> + bool l3resv)
> +{
> + struct sk_buff *skb;
> + u32 resv;
> +
> + if (!l3resv) {
> + resv = XFRM_IPTFS_MIN_L2HEADROOM;
> + } else {
> + resv = skb_headroom(tpl);
> + if (resv < XFRM_IPTFS_MIN_L3HEADROOM)
> + resv = XFRM_IPTFS_MIN_L3HEADROOM;
> + }
> +
> + skb = alloc_skb(len + resv, GFP_ATOMIC);
> + if (!skb) {
> + XFRM_INC_STATS(dev_net(tpl->dev), LINUX_MIB_XFRMNOSKBERROR);
Hmpf, so we've gone from incrementing the wrong counter to
incrementing a new counter that doesn't have a precise meaning.
> + return NULL;
> + }
> +
> + skb_reserve(skb, resv);
> +
> + /* We do not want any of the tpl->headers copied over, so we do
> + * not use `skb_copy_header()`.
> + */
This is a bit of a bad sign for the implementation. It also worries
me, as this may not be updated when changes are made to
__copy_skb_header().
(c/p'd from v1 review since this was still not answered)
> +/**
> + * skb_copy_bits_seq - copy bits from a skb_seq_state to kernel buffer
> + * @st: source skb_seq_state
> + * @offset: offset in source
> + * @to: destination buffer
> + * @len: number of bytes to copy
> + *
> + * Copy @len bytes from @offset bytes into the source @st to the destination
> + * buffer @to. `offset` should increase (or be unchanged) with each subsequent
> + * call to this function. If offset needs to decrease from the previous use `st`
> + * should be reset first.
> + *
> + * Return: 0 on success or a negative error code on failure
> + */
> +static int skb_copy_bits_seq(struct skb_seq_state *st, int offset, void *to,
> + int len)
Probably belongs in net/core/skbuff.c, although I'm really not
convinced copying data around is the right way to implement the type
of packet splitting IPTFS does (which sounds a bit like a kind of
GSO). And there are helpers in net/core/skbuff.c (such as
pskb_carve/pskb_extract) that seem to do similar things to what you
need here, without as much data copying.
> +static int iptfs_first_skb(struct sk_buff **skbp, struct xfrm_iptfs_data *xtfs,
> + u32 mtu)
> +{
> + struct sk_buff *skb = *skbp;
> + int err;
> +
> + /* Classic ESP skips the don't fragment ICMP error if DF is clear on
> + * the inner packet or ignore_df is set. Otherwise it will send an ICMP
> + * or local error if the inner packet won't fit it's MTU.
> + *
> + * With IPTFS we do not care about the inner packet DF bit. If the
> + * tunnel is configured to "don't fragment" we error back if things
> + * don't fit in our max packet size. Otherwise we iptfs-fragment as
> + * normal.
> + */
> +
> + /* The opportunity for HW offload has ended */
> + if (skb->ip_summed == CHECKSUM_PARTIAL) {
> + err = skb_checksum_help(skb);
> + if (err)
> + return err;
> + }
> +
> + /* We've split these up before queuing */
> + BUG_ON(skb_is_gso(skb));
As I've said previously, I don't think that's a valid reason to
crash. BUG_ON should be used very rarely:
https://elixir.bootlin.com/linux/v6.10/source/Documentation/process/coding-style.rst#L1230
Dropping a bogus packet is an easy way to recover from this situation,
so we should not crash here (and probably in all of IPTFS).
--
Sabrina
next prev parent reply other threads:[~2024-08-04 22:27 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-04 20:33 [PATCH ipsec-next v8 00/16] Add IP-TFS mode to xfrm Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 01/16] xfrm: config: add CONFIG_XFRM_IPTFS Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 02/16] include: uapi: add ip_tfs_*_hdr packet formats Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 03/16] include: uapi: add IPPROTO_AGGFRAG for AGGFRAG in ESP Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 04/16] xfrm: netlink: add config (netlink) options Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 05/16] xfrm: add mode_cbs module functionality Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 06/16] xfrm: add generic iptfs defines and functionality Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 07/16] xfrm: iptfs: add new iptfs xfrm mode impl Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 08/16] xfrm: iptfs: add user packet (tunnel ingress) handling Christian Hopps
2024-08-05 17:10 ` Simon Horman
2024-08-06 10:19 ` [devel-ipsec] " Christian Hopps
2024-08-06 15:24 ` Simon Horman
2024-08-04 20:33 ` [PATCH ipsec-next v8 09/16] xfrm: iptfs: share page fragments of inner packets Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 10/16] xfrm: iptfs: add fragmenting of larger than MTU user packets Christian Hopps
2024-08-04 22:25 ` Sabrina Dubroca [this message]
2024-08-05 2:33 ` Christian Hopps
2024-08-05 4:19 ` Christian Hopps
2024-08-06 8:47 ` Sabrina Dubroca
2024-08-06 8:54 ` Christian Hopps
2024-08-06 10:03 ` Florian Westphal
2024-08-06 10:05 ` Christian Hopps
2024-08-06 11:05 ` Sabrina Dubroca
2024-08-06 11:07 ` Christian Hopps
2024-08-08 11:30 ` Christian Hopps
2024-08-08 13:28 ` Sabrina Dubroca
2024-08-08 13:35 ` Christian Hopps
2024-08-08 14:01 ` Sabrina Dubroca
2024-08-08 21:42 ` Christian Hopps
2024-08-06 11:07 ` Steffen Klassert
2024-08-07 16:23 ` Christian Hopps
2024-08-06 11:32 ` Steffen Klassert
2024-08-07 19:40 ` Christian Hopps
2024-08-08 9:26 ` Sabrina Dubroca
2024-08-08 11:23 ` Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 11/16] xfrm: iptfs: add basic receive packet (tunnel egress) handling Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 12/16] xfrm: iptfs: handle received fragmented inner packets Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 13/16] xfrm: iptfs: add reusing received skb for the tunnel egress packet Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 14/16] xfrm: iptfs: add skb-fragment sharing code Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 15/16] xfrm: iptfs: handle reordering of received packets Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 16/16] xfrm: iptfs: add tracepoint functionality Christian Hopps
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zq__9Z4ckXNdR-Ec@hog \
--to=sd@queasysnail.net \
--cc=chopps@chopps.org \
--cc=chopps@labn.net \
--cc=devel@linux-ipsec.org \
--cc=netdev@vger.kernel.org \
--cc=steffen.klassert@secunet.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).