netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data.
@ 2018-08-07 18:09 Doron Roberts-Kedes
  2018-08-08 19:14 ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Doron Roberts-Kedes @ 2018-08-07 18:09 UTC (permalink / raw)
  To: David S . Miller
  Cc: Dave Watson, Vakul Garg, Boris Pismenny, Aviad Yehezkel, netdev,
	Doron Roberts-Kedes

decrypt_skb fails if the number of sg elements required to map is
greater than MAX_SKB_FRAGS. nsg must always be calculated, but 
skb_cow_data adds unnecessary memcpy's for the zerocopy case.

The new function skb_nsg calculates the number of scatterlist elements
required to map the skb without the extra overhead of skb_cow_data. This
function mimics the structure of skb_to_sgvec.

Reported-by: Vakul Garg <Vakul.garg@nxp.com>
Reviewed-by: Vakul Garg <Vakul.garg@nxp.com>
Tested-by: Vakul Garg <Vakul.garg@nxp.com>
Fixes: c46234ebb4d1 ("tls: RX path for ktls")
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
---
 net/tls/tls_sw.c | 96 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 93 insertions(+), 3 deletions(-)

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index ff3a6904a722..eb87f931a0d6 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -43,6 +43,80 @@
 
 #define MAX_IV_SIZE	TLS_CIPHER_AES_GCM_128_IV_SIZE
 
+static int __skb_nsg(struct sk_buff *skb, int offset, int len,
+		     unsigned int recursion_level)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	struct sk_buff *frag_iter;
+	int elt = 0;
+
+	if (unlikely(recursion_level >= 24))
+		return -EMSGSIZE;
+
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		elt++;
+		len -= copy;
+		if (len == 0)
+			return elt;
+		offset += copy;
+	}
+
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+
+		WARN_ON(start > offset + len);
+
+		end = start + skb_frag_size(&skb_shinfo(skb)->frags[i]);
+		copy = end - offset;
+		if (copy > 0) {
+			if (copy > len)
+				copy = len;
+			elt++;
+			len -= copy;
+			if (len == 0)
+				return elt;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	skb_walk_frags(skb, frag_iter) {
+		int end, ret;
+
+		WARN_ON(start > offset + len);
+
+		end = start + frag_iter->len;
+		copy = end - offset;
+		if (copy > 0) {
+			if (copy > len)
+				copy = len;
+			ret = __skb_nsg(frag_iter, offset - start, copy,
+					recursion_level + 1);
+			if (unlikely(ret < 0))
+				return ret;
+			elt += ret;
+			len -= copy;
+			if (len == 0)
+				return elt;
+			offset += copy;
+		}
+		start = end;
+	}
+	BUG_ON(len);
+	return elt;
+}
+
+/* Return the number of scatterlist elements required to completely map the
+ * skb, or -EMSGSIZE if the recursion depth is exceeded.
+ */
+static int skb_nsg(struct sk_buff *skb, int offset, int len)
+{
+	return __skb_nsg(skb, offset, len, 0);
+}
+
 static int tls_do_decryption(struct sock *sk,
 			     struct scatterlist *sgin,
 			     struct scatterlist *sgout,
@@ -693,7 +767,7 @@ int decrypt_skb(struct sock *sk, struct sk_buff *skb,
 	struct scatterlist sgin_arr[MAX_SKB_FRAGS + 2];
 	struct scatterlist *sgin = &sgin_arr[0];
 	struct strp_msg *rxm = strp_msg(skb);
-	int ret, nsg = ARRAY_SIZE(sgin_arr);
+	int ret, nsg;
 	struct sk_buff *unused;
 
 	ret = skb_copy_bits(skb, rxm->offset + TLS_HEADER_SIZE,
@@ -704,11 +778,27 @@ int decrypt_skb(struct sock *sk, struct sk_buff *skb,
 
 	memcpy(iv, tls_ctx->rx.iv, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
 	if (!sgout) {
-		nsg = skb_cow_data(skb, 0, &unused) + 1;
+		nsg = skb_cow_data(skb, 0, &unused);
+	} else {
+		nsg = skb_nsg(skb,
+			      rxm->offset + tls_ctx->rx.prepend_size,
+			      rxm->full_len - tls_ctx->rx.prepend_size);
+		if (nsg <= 0)
+			return nsg;
+	}
+
+	// We need one extra for ctx->rx_aad_ciphertext
+	nsg++;
+
+	if (nsg > ARRAY_SIZE(sgin_arr)) {
 		sgin = kmalloc_array(nsg, sizeof(*sgin), sk->sk_allocation);
-		sgout = sgin;
+		if (!sgin)
+			return -ENOMEM;
 	}
 
+	if (!sgout)
+		sgout = sgin;
+
 	sg_init_table(sgin, nsg);
 	sg_set_buf(&sgin[0], ctx->rx_aad_ciphertext, TLS_AAD_SPACE_SIZE);
 
-- 
2.17.1

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data.
  2018-08-07 18:09 [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data Doron Roberts-Kedes
@ 2018-08-08 19:14 ` David Miller
  2018-08-09 22:43   ` Doron Roberts-Kedes
  0 siblings, 1 reply; 6+ messages in thread
From: David Miller @ 2018-08-08 19:14 UTC (permalink / raw)
  To: doronrk; +Cc: davejwatson, vakul.garg, borisp, aviadye, netdev

From: Doron Roberts-Kedes <doronrk@fb.com>
Date: Tue, 7 Aug 2018 11:09:39 -0700

> +static int __skb_nsg(struct sk_buff *skb, int offset, int len,
> +		     unsigned int recursion_level)
> +{
> +	int start = skb_headlen(skb);
> +	int i, copy = start - offset;
> +	struct sk_buff *frag_iter;
> +	int elt = 0;
> +
> +	if (unlikely(recursion_level >= 24))
> +		return -EMSGSIZE;

This recursion is kinda crazy.

Even skb_cow_data() doesn't recurse like this (of course because it copies
into linear buffers).

There has to be a way to simplify this.  Fragment lists are such a rarely
used SKB geometry, and few if any devices support it for transmission
(so the fraglist will get undone at transmit time anyways).

> +	// We need one extra for ctx->rx_aad_ciphertext

Please do not use C++ style comments in code.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data.
  2018-08-08 19:14 ` David Miller
@ 2018-08-09 22:43   ` Doron Roberts-Kedes
  2018-08-11 18:54     ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Doron Roberts-Kedes @ 2018-08-09 22:43 UTC (permalink / raw)
  To: David Miller; +Cc: davejwatson, vakul.garg, borisp, aviadye, netdev

On Wed, Aug 08, 2018 at 12:14:30PM -0700, David Miller wrote:
> From: Doron Roberts-Kedes <doronrk@fb.com>
> Date: Tue, 7 Aug 2018 11:09:39 -0700
> 
> > +static int __skb_nsg(struct sk_buff *skb, int offset, int len,
> > +		     unsigned int recursion_level)
> > +{
> > +	int start = skb_headlen(skb);
> > +	int i, copy = start - offset;
> > +	struct sk_buff *frag_iter;
> > +	int elt = 0;
> > +
> > +	if (unlikely(recursion_level >= 24))
> > +		return -EMSGSIZE;
> 
> This recursion is kinda crazy.
> 
> Even skb_cow_data() doesn't recurse like this (of course because it copies
> into linear buffers).
> 
> There has to be a way to simplify this.  Fragment lists are such a rarely
> used SKB geometry, and few if any devices support it for transmission
> (so the fraglist will get undone at transmit time anyways).
> 

Interesting. Just wanted to clarify whether the issue is the use of
recursion or the fact that the function is handling the frag_list at
all. This is the rx path, so my understanding was that we need to handle
the frag_list. Please let me know if I'm misunderstanding your point
about the rare use of fragment lists.

If the issue is the recursion, I can rewrite the function to not use
recursion, but skb_to_sgvec uses a similar pattern and is invoked
immediately afterwards.

Taking a step back, is there an existing solution for what this function
is trying to do? I was surprised to find that there did not seem to
exist a function for determining the number of scatterlist elements
required to map an skb without COW. 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data.
  2018-08-09 22:43   ` Doron Roberts-Kedes
@ 2018-08-11 18:54     ` David Miller
  2018-08-21  0:27       ` Doron Roberts-Kedes
  0 siblings, 1 reply; 6+ messages in thread
From: David Miller @ 2018-08-11 18:54 UTC (permalink / raw)
  To: doronrk; +Cc: davejwatson, vakul.garg, borisp, aviadye, netdev

From: Doron Roberts-Kedes <doronrk@fb.com>
Date: Thu, 9 Aug 2018 15:43:44 -0700

> Taking a step back, is there an existing solution for what this function
> is trying to do? I was surprised to find that there did not seem to
> exist a function for determining the number of scatterlist elements
> required to map an skb without COW. 

The reason is that we usually never need to "map" an SKB on receive,
and on transmit the SKB geometry is normalized by the destination
device features which means no frag_list.

And the other existing receive path doing SW crypto, IPSEC, always
COWs the data so get the number of SG array entries needed from
skb_cow_data().

Frag lists on RX should be pretty rare.  They occur when GRO
segmentation encouters poorly arranged incoming SKBs.  For example
this happens if the incoming frames use lots of tiny SKB page frags,
and therefore prevent accumulation into the page frag array of the
head GRO skb.

So yes it can happen, and we have to account for it.

So I guess you really do need to accomodate this situation.  Why
don't you try to rearrange your new function with some likely()
and unlikely() tests so that the straight code path optimizes for
the non-frag-list case?

Thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data.
  2018-08-11 18:54     ` David Miller
@ 2018-08-21  0:27       ` Doron Roberts-Kedes
  2018-08-21  0:30         ` David Miller
  0 siblings, 1 reply; 6+ messages in thread
From: Doron Roberts-Kedes @ 2018-08-21  0:27 UTC (permalink / raw)
  To: David Miller; +Cc: davejwatson, vakul.garg, borisp, aviadye, netdev

On Sat, Aug 11, 2018 at 11:54:53AM -0700, David Miller wrote:
> From: Doron Roberts-Kedes <doronrk@fb.com>
> Date: Thu, 9 Aug 2018 15:43:44 -0700
> 
> The reason is that we usually never need to "map" an SKB on receive,
> and on transmit the SKB geometry is normalized by the destination
> device features which means no frag_list.
> 
> And the other existing receive path doing SW crypto, IPSEC, always
> COWs the data so get the number of SG array entries needed from
> skb_cow_data().

Makes sense, thanks! 

> Frag lists on RX should be pretty rare.  They occur when GRO
> segmentation encouters poorly arranged incoming SKBs.  For example
> this happens if the incoming frames use lots of tiny SKB page frags,
> and therefore prevent accumulation into the page frag array of the
> head GRO skb.
> 
> So yes it can happen, and we have to account for it.
> 
> So I guess you really do need to accomodate this situation.  Why
> don't you try to rearrange your new function with some likely()
> and unlikely() tests so that the straight code path optimizes for
> the non-frag-list case?

So I did some poking around and found that basically 100% of skb's
recieved by ktls have a frag_list because of how strparser is
implemented. skb's from TCP that do not a have a frag_list are
accumulated by strparser using frag_list of a new skb. skb's from TCP
that do have a frag_list can become part of skb's with nested frag_lists
(skb's with non-NULL frag_list that are themselves part of a frag_list).
Unfortunatley, frag_list seems to be the common case, so is probably not a
good candidate for an unlikely() test. 

Regarding nested frag_list's more generally, is a good rule of thumb
that multiple layers of frag_list will not occur except for
post-processing such as in strparser? When should skb-handling code be
prepared for nested frag_lists? 

Given that frag_lists are not unlikely in this case, I believe the only
remaining feedback on the original patch was the recursive
implementation. If you'd like, I can re-submit with an iterative
implementation, but I noticed that goes against the existing recursive
pattern in functions like skb_release_data -> kfree_skb_list -> kfree_skb 
-> __kfree_skb -> skb_release_all -> skb_release_data, as well as
skb_to_sgvec. Let me know whether an iterative implementation is
preferred here, or whether I can simply rebase and resubmit a patch
similar to the original (modulo some variable renaming improvements). 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data.
  2018-08-21  0:27       ` Doron Roberts-Kedes
@ 2018-08-21  0:30         ` David Miller
  0 siblings, 0 replies; 6+ messages in thread
From: David Miller @ 2018-08-21  0:30 UTC (permalink / raw)
  To: doronrk; +Cc: davejwatson, vakul.garg, borisp, aviadye, netdev

From: Doron Roberts-Kedes <doronrk@fb.com>
Date: Mon, 20 Aug 2018 17:27:23 -0700

> Given that frag_lists are not unlikely in this case, I believe the only
> remaining feedback on the original patch was the recursive
> implementation. If you'd like, I can re-submit with an iterative
> implementation, but I noticed that goes against the existing recursive
> pattern in functions like skb_release_data -> kfree_skb_list -> kfree_skb 
> -> __kfree_skb -> skb_release_all -> skb_release_data, as well as
> skb_to_sgvec. Let me know whether an iterative implementation is
> preferred here, or whether I can simply rebase and resubmit a patch
> similar to the original (modulo some variable renaming improvements). 

Ok, I guess staying with the recursive implementation is fine.

It's a real shame that frag lists are so common in this code path,
especially nested ones :-/

In the long term, perhaps we can do something about that.

In the short term, I guess this means your original change is OK.

Please resubmit when the net-next tree opens back up, thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-08-21  3:47 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-08-07 18:09 [PATCH net-next,v4] net/tls: Calculate nsg for zerocopy path without skb_cow_data Doron Roberts-Kedes
2018-08-08 19:14 ` David Miller
2018-08-09 22:43   ` Doron Roberts-Kedes
2018-08-11 18:54     ` David Miller
2018-08-21  0:27       ` Doron Roberts-Kedes
2018-08-21  0:30         ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).