netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sabrina Dubroca <sd@queasysnail.net>
To: Christian Hopps <chopps@chopps.org>
Cc: devel@linux-ipsec.org,
	Steffen Klassert <steffen.klassert@secunet.com>,
	netdev@vger.kernel.org, Christian Hopps <chopps@labn.net>
Subject: Re: [PATCH ipsec-next v8 10/16] xfrm: iptfs: add fragmenting of larger than MTU user packets
Date: Tue, 6 Aug 2024 10:47:03 +0200	[thread overview]
Message-ID: <ZrHjByjZnnDgjvfo@hog> (raw)
In-Reply-To: <m2a5hr7iek.fsf@ja-home.int.chopps.org>

2024-08-04, 22:33:05 -0400, Christian Hopps wrote:
> > > +/* 1) skb->head should be cache aligned.
> > > + * 2) when resv is for L2 headers (i.e., ethernet) we want the cacheline to
> > > + * start -16 from data.
> > > + * 3) when resv is for L3+L2 headers IOW skb->data points at the IPTFS payload
> > > + * we want data to be cache line aligned so all the pushed headers will be in
> > > + * another cacheline.
> > > + */
> > > +#define XFRM_IPTFS_MIN_L3HEADROOM 128
> > > +#define XFRM_IPTFS_MIN_L2HEADROOM (64 + 16)
> > 
> > How did you pick those values?
> 
> That's what the comment is talking to. When reserving space for L2 headers we pick 64 + 16 (a 2^(<=6) cacheline + 16 bytes so the the cacheline should start -16 from where skb->data will point at.

Hard-coding the x86 cacheline size is not a good idea. And what's the
16B for? You don't know that it's enough for the actual L2 headers.

> For L3 we reserve double the power of 2 space we reserved for L2 only.

But that's the core of my question. Why is that correct/enough?

> 
> We have to reserve some amount of space for pushed headers, so the above made sense to me for good performance/cache locality.
> 
> > > +static struct sk_buff *iptfs_alloc_skb(struct sk_buff *tpl, u32 len,
> > > +				       bool l3resv)
> > > +{
> > > +	struct sk_buff *skb;
> > > +	u32 resv;
> > > +
> > > +	if (!l3resv) {
> > > +		resv = XFRM_IPTFS_MIN_L2HEADROOM;
> > > +	} else {
> > > +		resv = skb_headroom(tpl);
> > > +		if (resv < XFRM_IPTFS_MIN_L3HEADROOM)
> > > +			resv = XFRM_IPTFS_MIN_L3HEADROOM;
> > > +	}
> > > +
> > > +	skb = alloc_skb(len + resv, GFP_ATOMIC);
> > > +	if (!skb) {
> > > +		XFRM_INC_STATS(dev_net(tpl->dev), LINUX_MIB_XFRMNOSKBERROR);
> > 
> > Hmpf, so we've gone from incrementing the wrong counter to
> > incrementing a new counter that doesn't have a precise meaning.
> 
> The new "No SKB" counter is supposed to mean "couldn't get an SKB",
> given plenty of other errors are logged under "OutErr" or "InErr"
> i'm not sure what level of precision you're looking for here. :)

OutErr and InErr would be better than that new counter IMO.


> > > +		return NULL;
> > > +	}
> > > +
> > > +	skb_reserve(skb, resv);
> > > +
> > > +	/* We do not want any of the tpl->headers copied over, so we do
> > > +	 * not use `skb_copy_header()`.
> > > +	 */
> > 
> > This is a bit of a bad sign for the implementation. It also worries
> > me, as this may not be updated when changes are made to
> > __copy_skb_header().
> > (c/p'd from v1 review since this was still not answered)
> 
> I don't agree that this is a bad design at all, I'm curious what you think a good design to be.

Strange skb manipulations hiding in a protocol module is not good
design.

c/p bits of core code into a module (where they will never get fixed
up when the core code gets updated) is always a bad idea.


> I did specifically state why we are not re-using
> skb_copy_header(). The functionality is different. We are not trying
> to make a copy of an skb we are using an skb as a template for new
> skbs.

I saw that. That doesn't mean it's a good thing to do.

> > > +/**
> > > + * skb_copy_bits_seq - copy bits from a skb_seq_state to kernel buffer
> > > + * @st: source skb_seq_state
> > > + * @offset: offset in source
> > > + * @to: destination buffer
> > > + * @len: number of bytes to copy
> > > + *
> > > + * Copy @len bytes from @offset bytes into the source @st to the destination
> > > + * buffer @to. `offset` should increase (or be unchanged) with each subsequent
> > > + * call to this function. If offset needs to decrease from the previous use `st`
> > > + * should be reset first.
> > > + *
> > > + * Return: 0 on success or a negative error code on failure
> > > + */
> > > +static int skb_copy_bits_seq(struct skb_seq_state *st, int offset, void *to,
> > > +			     int len)
> > 
> > Probably belongs in net/core/skbuff.c, although I'm really not
> > convinced copying data around is the right way to implement the type
> > of packet splitting IPTFS does (which sounds a bit like a kind of
> > GSO). And there are helpers in net/core/skbuff.c (such as
> > pskb_carve/pskb_extract) that seem to do similar things to what you
> > need here, without as much data copying.
> 
> I don't have an issue with moving more general skb functionality
> into skbuff.c; however, I do not want to gate IP-TFS on this change
> to the general net infra, it is appropriate for a patchset of it's
> own.

If you need helpers that don't exist, it's part of your job to make
the core changes that are required to implement the functionality.

> Re copying: Let's be clear here, we are not always copying data,
> there are sharing code paths as well; however, there are times when
> it is the best (and even fastest) way to accomplish things (e.g.,
> b/c the packet is small or the data is arranged in skbs in a way to
> make sharing ridiculously complex and thus slow).

I'm not finding the sharing code. You mean iptfs_first_should_copy
returning false?

-- 
Sabrina


  parent reply	other threads:[~2024-08-06  8:47 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-04 20:33 [PATCH ipsec-next v8 00/16] Add IP-TFS mode to xfrm Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 01/16] xfrm: config: add CONFIG_XFRM_IPTFS Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 02/16] include: uapi: add ip_tfs_*_hdr packet formats Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 03/16] include: uapi: add IPPROTO_AGGFRAG for AGGFRAG in ESP Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 04/16] xfrm: netlink: add config (netlink) options Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 05/16] xfrm: add mode_cbs module functionality Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 06/16] xfrm: add generic iptfs defines and functionality Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 07/16] xfrm: iptfs: add new iptfs xfrm mode impl Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 08/16] xfrm: iptfs: add user packet (tunnel ingress) handling Christian Hopps
2024-08-05 17:10   ` Simon Horman
2024-08-06 10:19     ` [devel-ipsec] " Christian Hopps
2024-08-06 15:24       ` Simon Horman
2024-08-04 20:33 ` [PATCH ipsec-next v8 09/16] xfrm: iptfs: share page fragments of inner packets Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 10/16] xfrm: iptfs: add fragmenting of larger than MTU user packets Christian Hopps
2024-08-04 22:25   ` Sabrina Dubroca
2024-08-05  2:33     ` Christian Hopps
2024-08-05  4:19       ` Christian Hopps
2024-08-06  8:47       ` Sabrina Dubroca [this message]
2024-08-06  8:54         ` Christian Hopps
2024-08-06 10:03           ` Florian Westphal
2024-08-06 10:05             ` Christian Hopps
2024-08-06 11:05           ` Sabrina Dubroca
2024-08-06 11:07             ` Christian Hopps
2024-08-08 11:30             ` Christian Hopps
2024-08-08 13:28               ` Sabrina Dubroca
2024-08-08 13:35                 ` Christian Hopps
2024-08-08 14:01                   ` Sabrina Dubroca
2024-08-08 21:42                     ` Christian Hopps
2024-08-06 11:07           ` Steffen Klassert
2024-08-07 16:23             ` Christian Hopps
2024-08-06 11:32     ` Steffen Klassert
2024-08-07 19:40       ` Christian Hopps
2024-08-08  9:26         ` Sabrina Dubroca
2024-08-08 11:23           ` Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 11/16] xfrm: iptfs: add basic receive packet (tunnel egress) handling Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 12/16] xfrm: iptfs: handle received fragmented inner packets Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 13/16] xfrm: iptfs: add reusing received skb for the tunnel egress packet Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 14/16] xfrm: iptfs: add skb-fragment sharing code Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 15/16] xfrm: iptfs: handle reordering of received packets Christian Hopps
2024-08-04 20:33 ` [PATCH ipsec-next v8 16/16] xfrm: iptfs: add tracepoint functionality Christian Hopps

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZrHjByjZnnDgjvfo@hog \
    --to=sd@queasysnail.net \
    --cc=chopps@chopps.org \
    --cc=chopps@labn.net \
    --cc=devel@linux-ipsec.org \
    --cc=netdev@vger.kernel.org \
    --cc=steffen.klassert@secunet.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).