netdev.vger.kernel.org archive mirror
* tx-nocache-copy performance
@ 2014-01-06 20:27 Benjamin Poirier
  2014-01-06 20:57 ` Eric Dumazet
  2014-01-06 20:59 ` tx-nocache-copy performance Tom Herbert
  0 siblings, 2 replies; 9+ messages in thread
From: Benjamin Poirier @ 2014-01-06 20:27 UTC
  To: Tom Herbert; +Cc: netdev

Hi Tom,

In commit "c6e1a0d net: Allow no-cache copy from user on transmit
(v3.0-rc1)" you introduced the tx-nocache-copy performance optimization
and set it to on by default. I've tried to reproduce your testcase, as
well as a few more, but I did not find any performance improvement from
turning on tx-nocache-copy. Do you think tx-nocache-copy is still a
worthwhile optimization and it should remain on by default? In which
situations does it help?

I've run latency tests similar to the ones you described in the commit
log. I've also tested how the option affects single stream throughput
tests. According to the results I obtained, it seems that
tx-nocache-copy has either no impact (in the latency test) or a negative
impact (in the throughput test).

My test results follow. I tested using 3.12.6 on one Intel Xeon W3565
and one i7 920 connected by ixgbe adapters. The results are from the
Xeon, but they're similar on the i7. All numbers report the mean±stddev
over 10 runs of 10s.

1) latency tests similar to what you described
There is no statistically significant difference between tx-nocache-copy
on/off.
nic irqs spread out (one queue per cpu)

200x netperf -r 1400,1
tx-nocache-copy off
        692000±1000 tps
        50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
tx-nocache-copy on
        693000±1000 tps
        50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7

200x netperf -r 14000,14000
tx-nocache-copy off
        86450±80 tps
        50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
tx-nocache-copy on
        86110±60 tps
        50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20

2) single stream throughput tests
tx-nocache-copy leads to higher service demand

                        throughput  cpu0        cpu1        demand
                        (Mb/s)      (Gcycle)    (Gcycle)    (cycle/B)

nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)

tx-nocache-copy off     9402±5      9.4±0.2                 0.80±0.01
tx-nocache-copy on      9403±3      9.85±0.04               0.838±0.004

nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)

tx-nocache-copy off     9401±5      5.83±0.03   5.0±0.1     0.923±0.007
tx-nocache-copy on      9404±2      5.74±0.03   5.523±0.009 0.958±0.002
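The demand column can be reproduced from the other columns. A minimal sketch, assuming the throughput figures are megabits per second (the only reading consistent with a 10 Gb/s link), that each run lasts 10 s, and that the cycle counts are totals over the run. The function name is illustrative and does not come from netperf or any other tool:

```python
def service_demand(throughput_mbps, gcycles, duration_s=10.0):
    """Return CPU cycles spent per payload byte sent.

    throughput_mbps: throughput in 10^6 bits/s (assumed unit)
    gcycles: per-CPU cycle counts for the run, in units of 10^9 cycles
    duration_s: run length in seconds (10 s per the test description)
    """
    bytes_sent = throughput_mbps * 1e6 / 8 * duration_s
    return sum(gcycles) * 1e9 / bytes_sent


# nic irqs and netperf on cpu0
print("off:", round(service_demand(9402, [9.4]), 3))
print("on: ", round(service_demand(9403, [9.85]), 3))

# nic irqs on cpu0, netperf on cpu1
print("off:", round(service_demand(9401, [5.83, 5.0]), 3))
print("on: ", round(service_demand(9404, [5.74, 5.523]), 3))
```

Each result agrees with the demand column to within its stated standard deviation, which supports reading the cycle columns as totals over the 10 s run.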

-Benjamin


* Re: tx-nocache-copy performance
  2014-01-06 20:27 tx-nocache-copy performance Benjamin Poirier
@ 2014-01-06 20:57 ` Eric Dumazet
  2014-01-06 21:13   ` Eric Dumazet
  2014-01-06 23:20   ` [PATCH] net: Do not enable tx-nocache-copy by default Benjamin Poirier
  2014-01-06 20:59 ` tx-nocache-copy performance Tom Herbert
  1 sibling, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2014-01-06 20:57 UTC
  To: Benjamin Poirier; +Cc: Tom Herbert, netdev

On Mon, 2014-01-06 at 15:27 -0500, Benjamin Poirier wrote:
> Hi Tom,
> 
> In commit "c6e1a0d net: Allow no-cache copy from user on transmit
> (v3.0-rc1)" you introduced the tx-nocache-copy performance optimization
> and set it to on by default. I've tried to reproduce your testcase, as
> well as a few more, but I did not find any performance improvement from
> turning on tx-nocache-copy. Do you think tx-nocache-copy is still a
> worthwhile optimization and it should remain on by default? In which
> situations does it help?
> 
> I've run latency tests similar to the ones you described in the commit
> log. I've also tested how the option affects single stream throughput
> tests. According to the results I obtained, it seems that
> tx-nocache-copy has either no impact (in the latency test) or a negative
> impact (in the throughput test).
> 
> My test results follow. I tested using 3.12.6 on one Intel Xeon W3565
> and one i7 920 connected by ixgbe adapters. The results are from the
> Xeon, but they're similar on the i7. All numbers report the mean±stddev
> over 10 runs of 10s.
> 
> 1) latency tests similar to what you described
> There is no statistically significant difference between tx-nocache-copy
> on/off.
> nic irqs spread out (one queue per cpu)
> 
> 200x netperf -r 1400,1
> tx-nocache-copy off
>         692000±1000 tps
>         50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
> tx-nocache-copy on
>         693000±1000 tps
>         50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7
> 
> 200x netperf -r 14000,14000
> tx-nocache-copy off
>         86450±80 tps
>         50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
> tx-nocache-copy on
>         86110±60 tps
>         50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20
> 
> 2) single stream throughput tests
> tx-nocache-copy leads to higher service demand
> 
>                         throughput  cpu0        cpu1        demand
>                         (Mb/s)      (Gcycle)    (Gcycle)    (cycle/B)
> 
> nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)
> 
> tx-nocache-copy off     9402±5      9.4±0.2                 0.80±0.01
> tx-nocache-copy on      9403±3      9.85±0.04               0.838±0.004
> 
> nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)
> 
> tx-nocache-copy off     9401±5      5.83±0.03   5.0±0.1     0.923±0.007
> tx-nocache-copy on      9404±2      5.74±0.03   5.523±0.009 0.958±0.002
> 
> -Benjamin

We raised this point last year, for the same reasons.

http://www.spinics.net/lists/netdev/msg226501.html

I agree we should revert the default.

I'll post new perf data using latest net-next kernel and a bnx2x NIC.


* Re: tx-nocache-copy performance
  2014-01-06 20:27 tx-nocache-copy performance Benjamin Poirier
  2014-01-06 20:57 ` Eric Dumazet
@ 2014-01-06 20:59 ` Tom Herbert
  2014-01-06 21:20   ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Tom Herbert @ 2014-01-06 20:59 UTC
  To: Benjamin Poirier; +Cc: Linux Netdev List

On Mon, Jan 6, 2014 at 12:27 PM, Benjamin Poirier <bpoirier@suse.de> wrote:
> Hi Tom,
>
> In commit "c6e1a0d net: Allow no-cache copy from user on transmit
> (v3.0-rc1)" you introduced the tx-nocache-copy performance optimization
> and set it to on by default. I've tried to reproduce your testcase, as
> well as a few more, but I did not find any performance improvement from
> turning on tx-nocache-copy. Do you think tx-nocache-copy is still a
> worthwhile optimization and it should remain on by default? In which
> situations does it help?
>
Unfortunately, I think this is probably not a worthwhile optimization
at this point. The benefits should manifest themselves under high
networking load and high CPU load, where we are getting a lot of
pressure on the cache; the non-temporal copy should alleviate that
case. In reality, I suspect that rep movsq is more efficient than
movntq's, so the advantages of skipping the cache might be wiped out.
It would be nice if Intel had a movntsq instruction!
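
The cache-pressure argument can be illustrated with a toy model. This is only a sketch under stated assumptions: a single fully associative LRU cache, made-up sizes, and a copy that either passes through the cache or bypasses it entirely. It says nothing about the per-instruction cost of rep movsq versus movntq, which is the effect suspected above of cancelling the benefit.

```python
from collections import OrderedDict


def hot_set_misses(bypass_cache, capacity=64, hot_lines=48,
                   copy_lines=64, rounds=10):
    """Count misses on a hot working set interleaved with large copies.

    All sizes are in cache lines and are illustrative, not measured.
    """
    cache = OrderedDict()  # least recently used line first

    def touch(line):
        hit = line in cache
        if hit:
            cache.move_to_end(line)
        else:
            cache[line] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the LRU line
        return hit

    for line in range(hot_lines):  # warm the cache with the hot set
        touch(line)

    misses = 0
    copy_base = 1_000_000  # copy buffers never overlap the hot set
    for _ in range(rounds):
        if not bypass_cache:
            # A cached copy pulls every destination line into the cache.
            for offset in range(copy_lines):
                touch(copy_base + offset)
        copy_base += copy_lines
        # Re-touch the hot set and count how many lines were evicted.
        misses += sum(not touch(line) for line in range(hot_lines))
    return misses


print("cached copy:      ", hot_set_misses(bypass_cache=False))
print("non-temporal copy:", hot_set_misses(bypass_cache=True))
```

In this model the bypassing copy leaves the hot set untouched, while the cached copy evicts it on every round. Whether that saving outweighs a slower copy loop is exactly what the measurements in this thread call into question.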

btw, I still believe it would be a win if we could use vmsplice to
eliminate the copy altogether; unfortunately, no one has yet come up
with an interface to reliably reclaim buffers :-(.

> I've run latency tests similar to the ones you described in the commit
> log. I've also tested how the option affects single stream throughput
> tests. According to the results I obtained, it seems that
> tx-nocache-copy has either no impact (in the latency test) or a negative
> impact (in the throughput test).
>
> My test results follow. I tested using 3.12.6 on one Intel Xeon W3565
> and one i7 920 connected by ixgbe adapters. The results are from the
> Xeon, but they're similar on the i7. All numbers report the mean±stddev
> over 10 runs of 10s.
>
> 1) latency tests similar to what you described
> There is no statistically significant difference between tx-nocache-copy
> on/off.
> nic irqs spread out (one queue per cpu)
>
> 200x netperf -r 1400,1
> tx-nocache-copy off
>         692000±1000 tps
>         50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
> tx-nocache-copy on
>         693000±1000 tps
>         50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7
>
> 200x netperf -r 14000,14000
> tx-nocache-copy off
>         86450±80 tps
>         50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
> tx-nocache-copy on
>         86110±60 tps
>         50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20
>
> 2) single stream throughput tests
> tx-nocache-copy leads to higher service demand
>
>                         throughput  cpu0        cpu1        demand
>                         (Mb/s)      (Gcycle)    (Gcycle)    (cycle/B)
>
> nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)
>
> tx-nocache-copy off     9402±5      9.4±0.2                 0.80±0.01
> tx-nocache-copy on      9403±3      9.85±0.04               0.838±0.004
>
> nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)
>
> tx-nocache-copy off     9401±5      5.83±0.03   5.0±0.1     0.923±0.007
> tx-nocache-copy on      9404±2      5.74±0.03   5.523±0.009 0.958±0.002
>
> -Benjamin


* Re: tx-nocache-copy performance
  2014-01-06 20:57 ` Eric Dumazet
@ 2014-01-06 21:13   ` Eric Dumazet
  2014-01-06 23:20   ` [PATCH] net: Do not enable tx-nocache-copy by default Benjamin Poirier
  1 sibling, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2014-01-06 21:13 UTC
  To: Benjamin Poirier; +Cc: Tom Herbert, netdev

On Mon, 2014-01-06 at 12:57 -0800, Eric Dumazet wrote:

> We raised this point last year, for the same reasons.
> 
> http://www.spinics.net/lists/netdev/msg226501.html
> 
> I agree we should revert the default.
> 
> I'll post new perf data using latest net-next kernel and a bnx2x NIC.


Updated results with latest net-next :

(cpu is Intel(R) Xeon(R) CPU X5660  @ 2.80GHz)

lpq83:~# ./ethtool -K eth0 tx-nocache-copy on 
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9407.44   2.50     -1.00    0.522   -1.000 

 Performance counter stats for './netperf -H lpq84 -c':

       4282.648396 task-clock                #    0.423 CPUs utilized          
             9,348 context-switches          #    0.002 M/sec                  
                88 CPU-migrations            #    0.021 K/sec                  
               355 page-faults               #    0.083 K/sec                  
    11,812,797,651 cycles                    #    2.758 GHz                     [82.79%]
     9,020,522,817 stalled-cycles-frontend   #   76.36% frontend cycles idle    [82.54%]
     4,579,889,681 stalled-cycles-backend    #   38.77% backend  cycles idle    [67.33%]
     6,053,172,792 instructions              #    0.51  insns per cycle        
                                             #    1.49  stalled cycles per insn [83.64%]
       597,275,583 branches                  #  139.464 M/sec                   [83.70%]
         8,960,541 branch-misses             #    1.50% of all branches         [83.65%]

      10.128990264 seconds time elapsed

lpq83:~# ./ethtool -K eth0 tx-nocache-copy off
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9412.45   2.15     -1.00    0.449   -1.000 

 Performance counter stats for './netperf -H lpq84 -c':

       2847.375441 task-clock                #    0.281 CPUs utilized          
            11,632 context-switches          #    0.004 M/sec                  
                49 CPU-migrations            #    0.017 K/sec                  
               354 page-faults               #    0.124 K/sec                  
     7,646,889,749 cycles                    #    2.686 GHz                     [83.34%]
     6,115,050,032 stalled-cycles-frontend   #   79.97% frontend cycles idle    [83.31%]
     1,726,460,071 stalled-cycles-backend    #   22.58% backend  cycles idle    [66.55%]
     2,079,702,453 instructions              #    0.27  insns per cycle        
                                             #    2.94  stalled cycles per insn [83.22%]
       363,773,213 branches                  #  127.757 M/sec                   [83.29%]
         4,242,732 branch-misses             #    1.17% of all branches         [83.51%]

      10.128449949 seconds time elapsed
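
The two perf stat runs are easier to compare as ratios. A minimal sketch, assuming the counter lines keep the "<value> <event-name>" layout shown above. The parser and its names are illustrative and not part of perf; the excerpts are copied from the two runs:

```python
import re

# Leading "<value> <event-name>" part of a perf stat counter line.
COUNTER_RE = re.compile(r"^\s*([\d,]+(?:\.\d+)?)\s+([A-Za-z][\w-]*)")


def parse_perf_stat(text):
    """Map event name -> numeric value for each counter line."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_RE.match(line)
        if match:
            counters[match.group(2)] = float(match.group(1).replace(",", ""))
    return counters


# Excerpts copied from the tx-nocache-copy on and off runs.
NOCACHE_ON = """
       4282.648396 task-clock                #    0.423 CPUs utilized
    11,812,797,651 cycles                    #    2.758 GHz
     6,053,172,792 instructions              #    0.51  insns per cycle
"""
NOCACHE_OFF = """
       2847.375441 task-clock                #    0.281 CPUs utilized
     7,646,889,749 cycles                    #    2.686 GHz
     2,079,702,453 instructions              #    0.27  insns per cycle
"""

on = parse_perf_stat(NOCACHE_ON)
off = parse_perf_stat(NOCACHE_OFF)
for event in ("task-clock", "cycles", "instructions"):
    print(f"{event}: on/off = {on[event] / off[event]:.2f}")
```

With throughput essentially unchanged between the two runs, the on/off ratios show the extra CPU time, cycles, and instructions spent when tx-nocache-copy is enabled.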


* Re: tx-nocache-copy performance
  2014-01-06 20:59 ` tx-nocache-copy performance Tom Herbert
@ 2014-01-06 21:20   ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2014-01-06 21:20 UTC
  To: therbert; +Cc: bpoirier, netdev

From: Tom Herbert <therbert@google.com>
Date: Mon, 6 Jan 2014 12:59:59 -0800

> btw, I still believe it would be a win if we could use vmsplice to
> eliminate the copy altogether; unfortunately, no one has yet come up
> with an interface to reliably reclaim buffers :-(.

Indeed, this issue keeps coming up continually.


* [PATCH] net: Do not enable tx-nocache-copy by default
  2014-01-06 20:57 ` Eric Dumazet
  2014-01-06 21:13   ` Eric Dumazet
@ 2014-01-06 23:20   ` Benjamin Poirier
  2014-01-07  1:00     ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Benjamin Poirier @ 2014-01-06 23:20 UTC
  To: David S. Miller; +Cc: Eric Dumazet, netdev, linux-kernel, Tom Herbert

There are many cases where this feature does not improve performance or even
reduces it. See the following discussion for example perf numbers:
http://thread.gmane.org/gmane.linux.network/298345

CC: Tom Herbert <therbert@google.com>
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
---
 net/core/dev.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 153ee2f..2e242583 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5783,13 +5783,8 @@ int register_netdevice(struct net_device *dev)
 	dev->features |= NETIF_F_SOFT_FEATURES;
 	dev->wanted_features = dev->features & dev->hw_features;
 
-	/* Turn on no cache copy if HW is doing checksum */
 	if (!(dev->flags & IFF_LOOPBACK)) {
 		dev->hw_features |= NETIF_F_NOCACHE_COPY;
-		if (dev->features & NETIF_F_ALL_CSUM) {
-			dev->wanted_features |= NETIF_F_NOCACHE_COPY;
-			dev->features |= NETIF_F_NOCACHE_COPY;
-		}
 	}
 
 	/* Make NETIF_F_HIGHDMA inheritable to VLAN devices.
-- 
1.8.4
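
The effect of the diff on the default feature bits can be modelled outside the kernel. A minimal sketch, assuming illustrative bit values (the real NETIF_F_* constants are defined in the kernel headers, and NETIF_F_ALL_CSUM is a multi-bit mask there). The function is a hypothetical stand-in for the relevant lines of register_netdevice(), not kernel code:

```python
# Illustrative bit values only; not the kernel's real constants.
NETIF_F_ALL_CSUM = 1 << 0
NETIF_F_NOCACHE_COPY = 1 << 1


def register_defaults(features, hw_features, loopback, patched):
    """Model the NOCACHE_COPY handling in register_netdevice().

    Returns (features, hw_features, wanted_features).
    """
    wanted = features & hw_features
    if not loopback:
        # Both before and after the patch the bit stays user-toggleable.
        hw_features |= NETIF_F_NOCACHE_COPY
        # Only the unpatched code turns it on when checksum offload is set.
        if not patched and features & NETIF_F_ALL_CSUM:
            wanted |= NETIF_F_NOCACHE_COPY
            features |= NETIF_F_NOCACHE_COPY
    return features, hw_features, wanted


for patched in (False, True):
    features, hw, wanted = register_defaults(
        NETIF_F_ALL_CSUM, NETIF_F_ALL_CSUM, loopback=False, patched=patched)
    print("patched" if patched else "unpatched",
          "enabled:", bool(features & NETIF_F_NOCACHE_COPY),
          "toggleable:", bool(hw & NETIF_F_NOCACHE_COPY))
```

In this model the patched path leaves the bit in hw_features, so the feature stays available through ethtool -K <dev> tx-nocache-copy on; it is simply no longer enabled at registration.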


* Re: [PATCH] net: Do not enable tx-nocache-copy by default
  2014-01-06 23:20   ` [PATCH] net: Do not enable tx-nocache-copy by default Benjamin Poirier
@ 2014-01-07  1:00     ` David Miller
  2014-01-07 15:11       ` [PATCH v2] " Benjamin Poirier
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2014-01-07  1:00 UTC
  To: bpoirier; +Cc: edumazet, netdev, linux-kernel, therbert

From: Benjamin Poirier <bpoirier@suse.de>
Date: Mon,  6 Jan 2014 18:20:19 -0500

> There are many cases where this feature does not improve performance or even
> reduces it. See the following discussion for example perf numbers:
> http://thread.gmane.org/gmane.linux.network/298345
> 
> CC: Tom Herbert <therbert@google.com>
> Signed-off-by: Benjamin Poirier <bpoirier@suse.de>

Please don't provide data via a URL, which is transient.

Instead, include it inline in the commit message proper.

Thanks.


* [PATCH v2] net: Do not enable tx-nocache-copy by default
  2014-01-07  1:00     ` David Miller
@ 2014-01-07 15:11       ` Benjamin Poirier
  2014-01-07 21:20         ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Benjamin Poirier @ 2014-01-07 15:11 UTC
  To: David S. Miller; +Cc: Eric Dumazet, netdev, linux-kernel, Tom Herbert

There are many cases where this feature does not improve performance or even
reduces it.

For example, here are the results from tests that I've run using 3.12.6 on one
Intel Xeon W3565 and one i7 920 connected by ixgbe adapters. The results are
from the Xeon, but they're similar on the i7. All numbers report the
mean±stddev over 10 runs of 10s.

1) latency tests similar to what is described in "c6e1a0d net: Allow no-cache
copy from user on transmit"
There is no statistically significant difference between tx-nocache-copy
on/off.
nic irqs spread out (one queue per cpu)

200x netperf -r 1400,1
tx-nocache-copy off
        692000±1000 tps
        50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
tx-nocache-copy on
        693000±1000 tps
        50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7

200x netperf -r 14000,14000
tx-nocache-copy off
        86450±80 tps
        50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
tx-nocache-copy on
        86110±60 tps
        50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20

2) single stream throughput tests
tx-nocache-copy leads to higher service demand

                        throughput  cpu0        cpu1        demand
                        (Mb/s)      (Gcycle)    (Gcycle)    (cycle/B)

nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)

tx-nocache-copy off     9402±5      9.4±0.2                 0.80±0.01
tx-nocache-copy on      9403±3      9.85±0.04               0.838±0.004

nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)

tx-nocache-copy off     9401±5      5.83±0.03   5.0±0.1     0.923±0.007
tx-nocache-copy on      9404±2      5.74±0.03   5.523±0.009 0.958±0.002

As a second example, here are some results from Eric Dumazet with latest
net-next.
tx-nocache-copy also leads to higher service demand

(cpu is Intel(R) Xeon(R) CPU X5660  @ 2.80GHz)

lpq83:~# ./ethtool -K eth0 tx-nocache-copy on
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9407.44   2.50     -1.00    0.522   -1.000

 Performance counter stats for './netperf -H lpq84 -c':

       4282.648396 task-clock                #    0.423 CPUs utilized
             9,348 context-switches          #    0.002 M/sec
                88 CPU-migrations            #    0.021 K/sec
               355 page-faults               #    0.083 K/sec
    11,812,797,651 cycles                    #    2.758 GHz                     [82.79%]
     9,020,522,817 stalled-cycles-frontend   #   76.36% frontend cycles idle    [82.54%]
     4,579,889,681 stalled-cycles-backend    #   38.77% backend  cycles idle    [67.33%]
     6,053,172,792 instructions              #    0.51  insns per cycle
                                             #    1.49  stalled cycles per insn [83.64%]
       597,275,583 branches                  #  139.464 M/sec                   [83.70%]
         8,960,541 branch-misses             #    1.50% of all branches         [83.65%]

      10.128990264 seconds time elapsed

lpq83:~# ./ethtool -K eth0 tx-nocache-copy off
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9412.45   2.15     -1.00    0.449   -1.000

 Performance counter stats for './netperf -H lpq84 -c':

       2847.375441 task-clock                #    0.281 CPUs utilized
            11,632 context-switches          #    0.004 M/sec
                49 CPU-migrations            #    0.017 K/sec
               354 page-faults               #    0.124 K/sec
     7,646,889,749 cycles                    #    2.686 GHz                     [83.34%]
     6,115,050,032 stalled-cycles-frontend   #   79.97% frontend cycles idle    [83.31%]
     1,726,460,071 stalled-cycles-backend    #   22.58% backend  cycles idle    [66.55%]
     2,079,702,453 instructions              #    0.27  insns per cycle
                                             #    2.94  stalled cycles per insn [83.22%]
       363,773,213 branches                  #  127.757 M/sec                   [83.29%]
         4,242,732 branch-misses             #    1.17% of all branches         [83.51%]

      10.128449949 seconds time elapsed

CC: Tom Herbert <therbert@google.com>
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
---
 net/core/dev.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 4fc1722..0e82e77 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5831,13 +5831,8 @@ int register_netdevice(struct net_device *dev)
 	dev->features |= NETIF_F_SOFT_FEATURES;
 	dev->wanted_features = dev->features & dev->hw_features;
 
-	/* Turn on no cache copy if HW is doing checksum */
 	if (!(dev->flags & IFF_LOOPBACK)) {
 		dev->hw_features |= NETIF_F_NOCACHE_COPY;
-		if (dev->features & NETIF_F_ALL_CSUM) {
-			dev->wanted_features |= NETIF_F_NOCACHE_COPY;
-			dev->features |= NETIF_F_NOCACHE_COPY;
-		}
 	}
 
 	/* Make NETIF_F_HIGHDMA inheritable to VLAN devices.
-- 
1.8.4


* Re: [PATCH v2] net: Do not enable tx-nocache-copy by default
  2014-01-07 15:11       ` [PATCH v2] " Benjamin Poirier
@ 2014-01-07 21:20         ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2014-01-07 21:20 UTC
  To: bpoirier; +Cc: edumazet, netdev, linux-kernel, therbert

From: Benjamin Poirier <bpoirier@suse.de>
Date: Tue,  7 Jan 2014 10:11:10 -0500

> There are many cases where this feature does not improve performance or even
> reduces it.
> 
> For example, here are the results from tests that I've run using 3.12.6 on one
> Intel Xeon W3565 and one i7 920 connected by ixgbe adapters. The results are
> from the Xeon, but they're similar on the i7. All numbers report the
> mean±stddev over 10 runs of 10s.
> 
> 1) latency tests similar to what is described in "c6e1a0d net: Allow no-cache
> copy from user on transmit"
> There is no statistically significant difference between tx-nocache-copy
> on/off.
> nic irqs spread out (one queue per cpu)
 ...
> CC: Tom Herbert <therbert@google.com>
> Signed-off-by: Benjamin Poirier <bpoirier@suse.de>

Looks good, applied, thanks.

Thread overview: 9+ messages
2014-01-06 20:27 tx-nocache-copy performance Benjamin Poirier
2014-01-06 20:57 ` Eric Dumazet
2014-01-06 21:13   ` Eric Dumazet
2014-01-06 23:20   ` [PATCH] net: Do not enable tx-nocache-copy by default Benjamin Poirier
2014-01-07  1:00     ` David Miller
2014-01-07 15:11       ` [PATCH v2] " Benjamin Poirier
2014-01-07 21:20         ` David Miller
2014-01-06 20:59 ` tx-nocache-copy performance Tom Herbert
2014-01-06 21:20   ` David Miller