* [PATCH net] sctp: linearize early if it's not GSO
@ 2016-08-16 22:35 Marcelo Ricardo Leitner
  2016-08-16 22:59 ` Daniel Borkmann
  0 siblings, 1 reply; 3+ messages in thread
From: Marcelo Ricardo Leitner @ 2016-08-16 22:35 UTC (permalink / raw)
  To: netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich

Because otherwise, when crc computation is still needed, it's way more
expensive than on a linear buffer, to the point that it affects
performance.

It's so expensive that netperf test gives a perf output as below:

Overhead  Shared Object        Symbol
  69,44%  [kernel]             [k] gf2_matrix_square
   2,84%  [kernel]             [k] crc32_generic_combine.part.0
   2,78%  [kernel]             [k] _raw_spin_lock_bh

And performance goes from 2Gbit/s to 0.5Gbit/s on this test. Doing the
linearization before checksumming is enough to restore it.

Fixes: 3acb50c18d8d ("sctp: delay as much as possible skb_linearize")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---

Please consider this to -stable too as v4.7 performance is affected.
Thanks

 net/sctp/input.c   | 11 +++++++----
 net/sctp/inqueue.c | 13 -------------
 2 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/net/sctp/input.c b/net/sctp/input.c
index c182db7d691ff44a52923fb36c9170e49c141c04..69444d32ecda6cd1a4924911172feba89c5ae976 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -119,7 +119,13 @@ int sctp_rcv(struct sk_buff *skb)
 		       skb_transport_offset(skb))
 		goto discard_it;
 
-	if (!pskb_may_pull(skb, sizeof(struct sctphdr)))
+	/* If the packet is fragmented and we need to do crc checking,
+	 * it's better to just linearize it otherwise crc computing
+	 * takes longer.
+	 */
+	if ((!(skb_shinfo(skb)->gso_type & SKB_GSO_SCTP) &&
+	     skb_linearize(skb)) ||
+	    !pskb_may_pull(skb, sizeof(struct sctphdr)))
 		goto discard_it;
 
 	/* Pull up the IP header. */
@@ -1177,9 +1183,6 @@ static struct sctp_association *__sctp_rcv_lookup_harder(struct net *net,
 	if ((skb_shinfo(skb)->gso_type & SKB_GSO_SCTP) == SKB_GSO_SCTP)
 		return NULL;
 
-	if (skb_linearize(skb))
-		return NULL;
-
 	ch = (sctp_chunkhdr_t *) skb->data;
 
 	/* The code below will attempt to walk the chunk and extract
diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c
index c30ddb0f31907f57c5ce85b00dbe04260ca1cb2e..6437aa97cfd79f14c633499c2b131389204c435b 100644
--- a/net/sctp/inqueue.c
+++ b/net/sctp/inqueue.c
@@ -170,19 +170,6 @@ next_chunk:
 
 		chunk = list_entry(entry, struct sctp_chunk, list);
 
-		/* Linearize if it's not GSO */
-		if ((skb_shinfo(chunk->skb)->gso_type & SKB_GSO_SCTP) != SKB_GSO_SCTP &&
-		    skb_is_nonlinear(chunk->skb)) {
-			if (skb_linearize(chunk->skb)) {
-				__SCTP_INC_STATS(dev_net(chunk->skb->dev), SCTP_MIB_IN_PKT_DISCARDS);
-				sctp_chunk_free(chunk);
-				goto next_chunk;
-			}
-
-			/* Update sctp_hdr as it probably changed */
-			chunk->sctp_hdr = sctp_hdr(chunk->skb);
-		}
-
 		if ((skb_shinfo(chunk->skb)->gso_type & SKB_GSO_SCTP) == SKB_GSO_SCTP) {
 			/* GSO-marked skbs but without frags, handle
 			 * them normally
-- 
2.7.4


* Re: [PATCH net] sctp: linearize early if it's not GSO
  2016-08-16 22:35 [PATCH net] sctp: linearize early if it's not GSO Marcelo Ricardo Leitner
@ 2016-08-16 22:59 ` Daniel Borkmann
  2016-08-17  0:49   ` Marcelo Ricardo Leitner
  0 siblings, 1 reply; 3+ messages in thread
From: Daniel Borkmann @ 2016-08-16 22:59 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner, netdev; +Cc: linux-sctp, Neil Horman, Vlad Yasevich

On 08/17/2016 12:35 AM, Marcelo Ricardo Leitner wrote:
> Because otherwise, when crc computation is still needed, it's way more
> expensive than on a linear buffer, to the point that it affects
> performance.
>
> It's so expensive that netperf test gives a perf output as below:
>
> Overhead  Shared Object        Symbol
>    69,44%  [kernel]             [k] gf2_matrix_square
>     2,84%  [kernel]             [k] crc32_generic_combine.part.0
>     2,78%  [kernel]             [k] _raw_spin_lock_bh

What kernel is this, seems not net kernel?

$ git grep -n gf2_matrix_square
$ git grep -n crc32_generic_combine
$

Maybe RHEL? Did you consider backporting 6d514b4e7737 et al?

> And performance goes from 2Gbit/s to 0.5Gbit/s on this test. Doing the
> linearization before checksumming is enough to restore it.
>
> Fixes: 3acb50c18d8d ("sctp: delay as much as possible skb_linearize")
> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>


* Re: [PATCH net] sctp: linearize early if it's not GSO
  2016-08-16 22:59 ` Daniel Borkmann
@ 2016-08-17  0:49   ` Marcelo Ricardo Leitner
  0 siblings, 0 replies; 3+ messages in thread
From: Marcelo Ricardo Leitner @ 2016-08-17  0:49 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: netdev, linux-sctp, Neil Horman, Vlad Yasevich

On Wed, Aug 17, 2016 at 12:59:17AM +0200, Daniel Borkmann wrote:
> On 08/17/2016 12:35 AM, Marcelo Ricardo Leitner wrote:
> > Because otherwise, when crc computation is still needed, it's way more
> > expensive than on a linear buffer, to the point that it affects
> > performance.
> > 
> > It's so expensive that netperf test gives a perf output as below:
> > 
> > Overhead  Shared Object        Symbol
> >    69,44%  [kernel]             [k] gf2_matrix_square
> >     2,84%  [kernel]             [k] crc32_generic_combine.part.0
> >     2,78%  [kernel]             [k] _raw_spin_lock_bh
> 
> What kernel is this, seems not net kernel?
> 
> $ git grep -n gf2_matrix_square
> $ git grep -n crc32_generic_combine
> $
> 
> Maybe RHEL? Did you consider backporting 6d514b4e7737 et al?

Damn, correct. I'll post a v2 later with a proper changelog.
No, I hadn't considered backporting that commit.

Now from a different environment, upstream kernel, without the patch, using
mlx4 and perf record -a -- sleep 5 during netperf (Xeon E5-2690 v3, 24 cpus):

Overhead  Command         Shared Object       Symbol
  16,85%  netserver       [kernel.vmlinux]    [k] crc32_generic_shift
   3,46%  swapper         [kernel.vmlinux]    [k] intel_idle
   2,00%  netserver       [kernel.vmlinux]    [k] __pskb_pull_tail
   1,73%  netserver       [kernel.vmlinux]    [k] copy_user_enhanced_fast_string
   1,72%  swapper         [kernel.vmlinux]    [k] crc32_generic_shift
   1,64%  swapper         [kernel.vmlinux]    [k] poll_idle
   1,59%  netserver       [kernel.vmlinux]    [k] memcpy_erms
   1,57%  netserver       [kernel.vmlinux]    [k] fib_table_lookup
   1,47%  netserver       [kernel.vmlinux]    [k] _raw_spin_lock
   1,37%  netserver       [kernel.vmlinux]    [k] __slab_free
   1,32%  netserver       [sctp]              [k] sctp_packet_transmit
   1,18%  netserver       [kernel.vmlinux]    [k] skb_copy_datagram_iter

With the patch:
Overhead  Command         Shared Object        Symbol
   4,71%  swapper         [kernel.vmlinux]     [k] intel_idle
   2,11%  netserver       [kernel.vmlinux]     [k] copy_user_enhanced_fast_string
   1,45%  netserver       [kernel.vmlinux]     [k] memcpy_erms
   1,29%  swapper         [kernel.vmlinux]     [k] memcpy_erms
   1,28%  netserver       [kernel.vmlinux]     [k] fib_table_lookup
   1,27%  netserver       [kernel.vmlinux]     [k] __slab_free
   1,27%  swapper         [kernel.vmlinux]     [k] fib_table_lookup
   1,26%  netserver       [kernel.vmlinux]     [k] kmem_cache_free
   1,14%  netserver       [kernel.vmlinux]     [k] _raw_spin_lock
   1,07%  netserver       [kernel.vmlinux]     [k] __pskb_pull_tail
   1,06%  netserver       [kernel.vmlinux]     [k] skb_copy_datagram_iter
   1,04%  netserver       [sctp]               [k] sctp_packet_transmit
   1,04%  swapper         [kernel.vmlinux]     [k] __pskb_pull_tail
   1,01%  swapper         [mlx4_en]            [k] mlx4_en_process_rx_cq
   0,99%  swapper         [kernel.vmlinux]     [k] native_queued_spin_lock_slowpath
   0,96%  swapper         [kernel.vmlinux]     [k] _raw_spin_lock
   0,89%  swapper         [sctp]               [k] sctp_packet_transmit

Without the patch:
# netperf -H 192.168.10.1 -l 10 -t SCTP_STREAM -cC -- -m 12000
SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  12000    10.00      2896.13   3.34     3.88     2.267   2.635


With the patch:
# netperf -H 192.168.10.1 -l 10 -t SCTP_STREAM -cC -- -m 12000
SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.10.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

212992 212992  12000    10.00      3444.89   3.88     3.02     2.216   1.721

And without the patch, netperf fluctuates more, as there are more packet
drops and netserver is constantly at 100% CPU usage.

Thanks,
Marcelo

