netdev.vger.kernel.org archive mirror
* [PATCH v2 net-next 0/6] net: Checksum offload changes - Part V
@ 2014-08-22 20:33 Tom Herbert
  2014-08-25  1:24 ` David Miller
  0 siblings, 1 reply; 2+ messages in thread
From: Tom Herbert @ 2014-08-22 20:33 UTC
  To: davem, netdev

I am working on overhauling RX checksum offload. Goals of this effort
are:

- Specify exactly what it means when a driver returns CHECKSUM_UNNECESSARY
- Preserve CHECKSUM_COMPLETE through encapsulation layers
- Don't do skb_checksum more than once per packet
- Unify GRO and non-GRO csum verification as much as possible
- Unify the checksum functions (checksum_init)
- Simplify code
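
To illustrate why a full-packet sum can in principle be carried through
decapsulation without summing the data again, here is a minimal userspace
sketch of the underlying ones' complement arithmetic. It is not kernel code:
the function names (csum_fold32, csum_partial_be, csum_sub16, csum_selfcheck)
and the sample bytes are made up for this example, and it assumes the outer
header has an even length so that 16-bit word alignment is preserved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into 16 bits with end-around carry. */
static uint16_t csum_fold32(uint32_t sum)
{
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/*
 * Ones' complement sum of a buffer read as big-endian 16-bit words.
 * Assumes len is small enough (at most 64 KiB) that the 32-bit
 * accumulator cannot overflow before folding.
 */
static uint16_t csum_partial_be(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	if (i < len)			/* odd trailing byte, zero padded */
		sum += (uint32_t)buf[i] << 8;
	return csum_fold32(sum);
}

/* Ones' complement subtraction: a - b is a + ~b with end-around carry. */
static uint16_t csum_sub16(uint16_t a, uint16_t b)
{
	return csum_fold32((uint32_t)a + (uint16_t)~b);
}

/*
 * Hypothetical packet: 8 bytes of "outer header" followed by 6 bytes of
 * "inner" data. Returns 1 if removing the outer header's sum from the
 * full-packet sum gives the same value as summing the inner bytes directly.
 */
static int csum_selfcheck(void)
{
	static const uint8_t pkt[] = {
		0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf0,	/* outer */
		0x00, 0x01, 0xf2, 0x03, 0xf4, 0xf5,		/* inner */
	};
	const size_t outer_len = 8;
	uint16_t full = csum_partial_be(pkt, sizeof(pkt));
	uint16_t outer = csum_partial_be(pkt, outer_len);
	uint16_t inner = csum_partial_be(pkt + outer_len,
					 sizeof(pkt) - outer_len);

	return csum_sub16(full, outer) == inner;
}
```

Only the outer header bytes are summed a second time here; the payload is
summed once. Ones' complement has two encodings of zero (0x0000 and 0xffff),
so a general comparison would have to treat those as equal; the sample data
avoids that case.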

What is in this fifth patch set:

- Added GRO checksum validation functions
- Call the GRO validation functions from TCP and GRE gro_receive
- Perform checksum verification in the UDP gro_receive path using
  GRO functions and add support for gro_receive in UDP6

Changes in V2:

- Set ip_summed to CHECKSUM_UNNECESSARY after GRO checksum validation
  instead of converting it to CHECKSUM_COMPLETE. This avoids the
  performance penalty of checksumming the bytes that precede the header
  GRO is currently processing.
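
The trade-off in this change can be sketched with a small userspace model.
This is only an illustration under stated assumptions, not the code in the
patches: every name here (model_pkt, model_sum, model_validate_to_complete,
model_validate_to_unnecessary, model_selfcheck) is hypothetical, the
pseudo-header is left out, and the GRO offset is assumed to be even. The
model counts how many bytes each strategy has to sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum model_ip_summed {
	MODEL_CSUM_NONE,
	MODEL_CSUM_UNNECESSARY,
	MODEL_CSUM_COMPLETE,
};

struct model_pkt {
	const uint8_t *data;
	size_t len;
	size_t gro_offset;	/* start of the header GRO is looking at */
	enum model_ip_summed ip_summed;
	uint16_t csum;		/* meaningful only for MODEL_CSUM_COMPLETE */
	size_t bytes_summed;	/* cost counter for this illustration */
};

/* Folded ones' complement sum of data[from, to), counting the bytes read. */
static uint16_t model_sum(struct model_pkt *p, size_t from, size_t to)
{
	uint32_t sum = 0;
	size_t i;

	for (i = from; i + 1 < to; i += 2)
		sum += ((uint32_t)p->data[i] << 8) | p->data[i + 1];
	if (i < to)
		sum += (uint32_t)p->data[i] << 8;
	p->bytes_summed += to - from;
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}

/* Validate, then record a sum covering the whole packet. */
static int model_validate_to_complete(struct model_pkt *p)
{
	uint16_t inner = model_sum(p, p->gro_offset, p->len);
	uint32_t full;

	if (inner != 0xffff)	/* valid data sums to all ones */
		return -1;
	/* A whole-packet sum also needs the bytes before the GRO offset. */
	full = (uint32_t)model_sum(p, 0, p->gro_offset) + inner;
	while (full >> 16)
		full = (full & 0xffff) + (full >> 16);
	p->csum = (uint16_t)full;
	p->ip_summed = MODEL_CSUM_COMPLETE;
	return 0;
}

/* Validate, then only record that verification already happened. */
static int model_validate_to_unnecessary(struct model_pkt *p)
{
	if (model_sum(p, p->gro_offset, p->len) != 0xffff)
		return -1;
	p->ip_summed = MODEL_CSUM_UNNECESSARY;
	return 0;
}

/* Returns 1 if both strategies accept the packet with the expected cost. */
static int model_selfcheck(void)
{
	/* 8 bytes before the GRO offset, then 6 bytes whose sum is 0xffff. */
	static const uint8_t pkt[] = {
		0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf0,
		0x00, 0x01, 0xf2, 0x03, 0x0d, 0xfb,
	};
	struct model_pkt a = { pkt, sizeof(pkt), 8, MODEL_CSUM_NONE, 0, 0 };
	struct model_pkt b = a;

	if (model_validate_to_complete(&a) || model_validate_to_unnecessary(&b))
		return 0;
	return a.bytes_summed == 14 && b.bytes_summed == 6 &&
	       a.ip_summed == MODEL_CSUM_COMPLETE &&
	       b.ip_summed == MODEL_CSUM_UNNECESSARY;
}
```

In this model the second strategy touches only the bytes from the GRO offset
onward, which is the saving the V2 note describes.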

Please review carefully and test if possible; mucking with basic
checksum functions is always a little precarious :-)

----

Test results with this patch set are below. I did not notice any
performance regression.

Tests run:
   TCP_STREAM: super_netperf with 200 streams
   TCP_RR: super_netperf with 200 streams and -r 1,1

Device bnx2x (10Gbps):
   No GRE RSS hash (RX interrupts occur on one core)
   UDP RSS port hashing enabled.

* GRE with checksum with IPv4 encapsulated packets
  With fix:
    TCP_STREAM
        9.91% CPU utilization
        5163.78 Mbps
    TCP_RR
        50.64% CPU utilization
        219/347/502 90/95/99% latencies
        834103 tps
  Without fix:
    TCP_STREAM
        10.05% CPU utilization
        5186.22 Mbps
    TCP_RR
        49.70% CPU utilization
        227/338/486 90/95/99% latencies
        813450 tps

* GRE without checksum with IPv4 encapsulated packets
  With fix:
    TCP_STREAM
        10.18% CPU utilization
        5159 Mbps
    TCP_RR
        51.86% CPU utilization
        214/325/471 90/95/99% latencies
        865943 tps
  Without fix:
    TCP_STREAM
        10.26% CPU utilization
        5307.87 Mbps
    TCP_RR
        50.59% CPU utilization
        224/325/476 90/95/99% latencies
        846429 tps

*** Simulating a device that returns CHECKSUM_COMPLETE

* VXLAN with checksum
  With fix:
    TCP_STREAM
        13.03% CPU utilization
        9093.9 Mbps
    TCP_RR
        95.96% CPU utilization
        161/259/474 90/95/99% latencies
        1.14806e+06 tps
  Without fix:
    TCP_STREAM
        13.59% CPU utilization
        9093.97 Mbps
    TCP_RR
        93.95% CPU utilization
        160/259/484 90/95/99% latencies
        1.10262e+06 tps

* VXLAN without checksum
  With fix:
    TCP_STREAM
        13.28% CPU utilization
        9093.87 Mbps
    TCP_RR
        95.04% CPU utilization
        155/246/439 90/95/99% latencies
        1.15e+06 tps
  Without fix:
    TCP_STREAM
        13.37% CPU utilization
        9178.45 Mbps
    TCP_RR
        93.74% CPU utilization
        161/257/469 90/95/99% latencies
        1.1068e+06 tps

* Re: [PATCH v2 net-next 0/6] net: Checksum offload changes - Part V
  2014-08-22 20:33 [PATCH v2 net-next 0/6] net: Checksum offload changes - Part V Tom Herbert
@ 2014-08-25  1:24 ` David Miller
  0 siblings, 0 replies; 2+ messages in thread
From: David Miller @ 2014-08-25  1:24 UTC
  To: therbert; +Cc: netdev

From: Tom Herbert <therbert@google.com>
Date: Fri, 22 Aug 2014 13:33:38 -0700 (PDT)

> I am working on overhauling RX checksum offload. Goals of this effort
> are:
> 
> - Specify what exactly it means when driver returns CHECKSUM_UNNECESSARY
> - Preserve CHECKSUM_COMPLETE through encapsulation layers
> - Don't do skb_checksum more than once per packet
> - Unify GRO and non-GRO csum verification as much as possible
> - Unify the checksum functions (checksum_init)
> - Simplify code
> 
> What is in this fifth patch set:
> 
> - Added GRO checksum validation functions
> - Call the GRO validation functions from TCP and GRE gro_receive
> - Perform checksum verification in the UDP gro_receive path using
>   GRO functions and add support for gro_receive in UDP6
> 
> Changes in V2:
> 
> - Change ip_summed to CHECKSUM_UNNECESSARY instead of moving it
>   to CHECKSUM_COMPLETE from GRO checksum validation. This avoids
>   performance penalty in checksumming bytes which are before the header
>   GRO is at.
> 
> Please review carefully and test if possible, mucking with basic
> checksum functions is always a little precarious :-)

Looks good, applied, thanks Tom.
