From: lizf@kernel.org
To: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Benjamin Poirier <bpoirier@suse.de>,
Tom Herbert <therbert@google.com>,
"David S. Miller" <davem@davemloft.net>,
Zefan Li <lizefan@huawei.com>
Subject: [PATCH 3.4 89/91] net: Do not enable tx-nocache-copy by default
Date: Thu, 27 Nov 2014 16:43:12 +0800 [thread overview]
Message-ID: <1417077794-9299-89-git-send-email-lizf@kernel.org> (raw)
In-Reply-To: <1417077368-9217-1-git-send-email-lizf@kernel.org>
From: Benjamin Poirier <bpoirier@suse.de>
3.4.105-rc1 review patch. If anyone has any objections, please let me know.
------------------
commit cdb3f4a31b64c3a1c6eef40bc01ebc9594c58a8c upstream.
There are many cases where this feature does not improve performance or even
reduces it.
For example, here are the results from tests that I've run using 3.12.6 on one
Intel Xeon W3565 and one i7 920 connected by ixgbe adapters. The results are
from the Xeon, but they're similar on the i7. All numbers report the
mean±stddev over 10 runs of 10s.
1) latency tests similar to what is described in "c6e1a0d net: Allow no-cache
copy from user on transmit"
There is no statistically significant difference between tx-nocache-copy
on/off.
nic irqs spread out (one queue per cpu)
200x netperf -r 1400,1
tx-nocache-copy off
692000±1000 tps
50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
tx-nocache-copy on
693000±1000 tps
50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7
200x netperf -r 14000,14000
tx-nocache-copy off
86450±80 tps
50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
tx-nocache-copy on
86110±60 tps
50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20
2) single stream throughput tests
tx-nocache-copy leads to higher service demand
throughput cpu0 cpu1 demand
(Gb/s) (Gcycle) (Gcycle) (cycle/B)
nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)
tx-nocache-copy off 9402±5 9.4±0.2 0.80±0.01
tx-nocache-copy on 9403±3 9.85±0.04 0.838±0.004
nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)
tx-nocache-copy off 9401±5 5.83±0.03 5.0±0.1 0.923±0.007
tx-nocache-copy on 9404±2 5.74±0.03 5.523±0.009 0.958±0.002
As a second example, here are some results from Eric Dumazet with latest
net-next.
tx-nocache-copy also leads to higher service demand
(cpu is Intel(R) Xeon(R) CPU X5660 @ 2.80GHz)
lpq83:~# ./ethtool -K eth0 tx-nocache-copy on
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % U us/KB us/KB
87380 16384 16384 10.00 9407.44 2.50 -1.00 0.522 -1.000
Performance counter stats for './netperf -H lpq84 -c':
4282.648396 task-clock # 0.423 CPUs utilized
9,348 context-switches # 0.002 M/sec
88 CPU-migrations # 0.021 K/sec
355 page-faults # 0.083 K/sec
11,812,797,651 cycles # 2.758 GHz [82.79%]
9,020,522,817 stalled-cycles-frontend # 76.36% frontend cycles idle [82.54%]
4,579,889,681 stalled-cycles-backend # 38.77% backend cycles idle [67.33%]
6,053,172,792 instructions # 0.51 insns per cycle
# 1.49 stalled cycles per insn [83.64%]
597,275,583 branches # 139.464 M/sec [83.70%]
8,960,541 branch-misses # 1.50% of all branches [83.65%]
10.128990264 seconds time elapsed
lpq83:~# ./ethtool -K eth0 tx-nocache-copy off
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv Send Send Utilization Service Demand
Socket Socket Message Elapsed Send Recv Send Recv
Size Size Size Time Throughput local remote local remote
bytes bytes bytes secs. 10^6bits/s % S % U us/KB us/KB
87380 16384 16384 10.00 9412.45 2.15 -1.00 0.449 -1.000
Performance counter stats for './netperf -H lpq84 -c':
2847.375441 task-clock # 0.281 CPUs utilized
11,632 context-switches # 0.004 M/sec
49 CPU-migrations # 0.017 K/sec
354 page-faults # 0.124 K/sec
7,646,889,749 cycles # 2.686 GHz [83.34%]
6,115,050,032 stalled-cycles-frontend # 79.97% frontend cycles idle [83.31%]
1,726,460,071 stalled-cycles-backend # 22.58% backend cycles idle [66.55%]
2,079,702,453 instructions # 0.27 insns per cycle
# 2.94 stalled cycles per insn [83.22%]
363,773,213 branches # 127.757 M/sec [83.29%]
4,242,732 branch-misses # 1.17% of all branches [83.51%]
10.128449949 seconds time elapsed
CC: Tom Herbert <therbert@google.com>
Signed-off-by: Benjamin Poirier <bpoirier@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zefan Li <lizefan@huawei.com>
---
net/core/dev.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index b47375d..0770364 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5546,13 +5546,8 @@ int register_netdevice(struct net_device *dev)
dev->features |= NETIF_F_SOFT_FEATURES;
dev->wanted_features = dev->features & dev->hw_features;
- /* Turn on no cache copy if HW is doing checksum */
if (!(dev->flags & IFF_LOOPBACK)) {
dev->hw_features |= NETIF_F_NOCACHE_COPY;
- if (dev->features & NETIF_F_ALL_CSUM) {
- dev->wanted_features |= NETIF_F_NOCACHE_COPY;
- dev->features |= NETIF_F_NOCACHE_COPY;
- }
}
/* Make NETIF_F_HIGHDMA inheritable to VLAN devices.
--
1.9.1
next prev parent reply other threads:[~2014-11-27 8:43 UTC|newest]
Thread overview: 98+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-11-27 8:36 [PATCH 3.4 00/91] 3.4.105-rc1 review lizf
2014-11-27 8:41 ` [PATCH 3.4 01/91] KVM: s390: Fix user triggerable bug in dead code lizf
2014-11-27 8:41 ` [PATCH 3.4 02/91] regmap: Fix handling of volatile registers for format_write() chips lizf
2014-11-27 8:41 ` [PATCH 3.4 03/91] drm/i915: Remove bogus __init annotation from DMI callbacks lizf
2014-11-27 8:41 ` [PATCH 3.4 04/91] get rid of propagate_umount() mistakenly treating slaves as busy lizf
2014-11-27 8:41 ` [PATCH 3.4 05/91] drm/vmwgfx: Fix a potential infinite spin waiting for fifo idle lizf
2014-11-27 8:41 ` [PATCH 3.4 06/91] ALSA: hda - Fix COEF setups for ALC1150 codec lizf
2014-11-27 8:41 ` [PATCH 3.4 07/91] ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock lizf
2014-11-27 8:41 ` [PATCH 3.4 08/91] regulatory: add NUL to alpha2 lizf
2014-11-27 8:41 ` [PATCH 3.4 09/91] percpu: fix pcpu_alloc_pages() failure path lizf
2014-11-27 8:41 ` [PATCH 3.4 10/91] percpu: perform tlb flush after pcpu_map_pages() failure lizf
2014-11-27 8:41 ` [PATCH 3.4 11/91] percpu: free percpu allocation info for uniprocessor system lizf
2014-11-27 8:41 ` [PATCH 3.4 12/91] cgroup: reject cgroup names with ' ' lizf
2014-11-27 8:41 ` [PATCH 3.4 13/91] rtlwifi: rtl8192cu: Add new ID lizf
2014-11-27 8:41 ` [PATCH 3.4 14/91] ahci: Add Device IDs for Intel 9 Series PCH lizf
2014-11-27 8:41 ` [PATCH 3.4 15/91] ata_piix: " lizf
2014-11-27 8:41 ` [PATCH 3.4 16/91] USB: ftdi_sio: add support for NOVITUS Bono E thermal printer lizf
2014-11-27 8:42 ` [PATCH 3.4 17/91] USB: sierra: avoid CDC class functions on "68A3" devices lizf
2014-11-27 8:42 ` [PATCH 3.4 18/91] USB: sierra: add 1199:68AA device ID lizf
2014-11-27 8:42 ` [PATCH 3.4 19/91] xen/manage: Always freeze/thaw processes when suspend/resuming lizf
2014-11-27 8:42 ` [PATCH 3.4 20/91] block: Fix dev_t minor allocation lifetime lizf
2014-11-27 8:42 ` [PATCH 3.4 21/91] usb: dwc3: core: fix order of PM runtime calls lizf
2014-11-27 8:42 ` [PATCH 3.4 22/91] ahci: add pcid for Marvel 0x9182 controller lizf
2014-11-27 8:42 ` [PATCH 3.4 23/91] drm/radeon: add connector quirk for fujitsu board lizf
2014-11-27 8:42 ` [PATCH 3.4 24/91] usb: host: xhci: fix compliance mode workaround lizf
2014-11-27 8:42 ` [PATCH 3.4 25/91] Input: elantech - fix detection of touchpad on ASUS s301l lizf
2014-11-27 8:42 ` [PATCH 3.4 26/91] USB: ftdi_sio: Add support for GE Healthcare Nemo Tracker device lizf
2014-11-27 8:42 ` [PATCH 3.4 27/91] uwb: init beacon cache entry before registering uwb device lizf
2014-11-27 8:42 ` [PATCH 3.4 28/91] Input: synaptics - add support for ForcePads lizf
2014-11-27 8:42 ` [PATCH 3.4 29/91] libceph: gracefully handle large reply messages from the mon lizf
2014-11-27 8:42 ` [PATCH 3.4 30/91] libceph: add process_one_ticket() helper lizf
2014-11-27 8:42 ` [PATCH 3.4 31/91] libceph: do not hard code max auth ticket len lizf
2014-11-27 8:42 ` [PATCH 3.4 32/91] Input: serport - add compat handling for SPIOCSTYPE ioctl lizf
2014-11-27 8:42 ` [PATCH 3.4 33/91] usb: hub: take hub->hdev reference when processing from eventlist lizf
2014-11-27 8:42 ` [PATCH 3.4 34/91] storage: Add single-LUN quirk for Jaz USB Adapter lizf
2014-11-27 8:42 ` [PATCH 3.4 35/91] xhci: Fix null pointer dereference if xhci initialization fails lizf
2014-11-27 8:42 ` [PATCH 3.4 36/91] futex: Unlock hb->lock in futex_wait_requeue_pi() error path lizf
2014-11-27 8:42 ` [PATCH 3.4 37/91] alarmtimer: Return relative times in timer_gettime lizf
2014-11-27 8:42 ` [PATCH 3.4 38/91] alarmtimer: Do not signal SIGEV_NONE timers lizf
2014-11-27 8:42 ` [PATCH 3.4 39/91] alarmtimer: Lock k_itimer during timer callback lizf
2014-11-27 8:42 ` [PATCH 3.4 40/91] don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu() lizf
2014-11-27 8:42 ` [PATCH 3.4 41/91] jiffies: Fix timeval conversion to jiffies lizf
2014-11-27 8:42 ` [PATCH 3.4 42/91] MIPS: ZBOOT: add missing <linux/string.h> include lizf
2014-11-27 8:42 ` [PATCH 3.4 43/91] perf: Fix a race condition in perf_remove_from_context() lizf
2014-12-01 18:43 ` Sukadev Bhattiprolu
2014-12-02 1:22 ` Zefan Li
2014-11-27 8:42 ` [PATCH 3.4 44/91] ASoC: samsung-i2s: Check secondary DAI exists before referencing lizf
2014-11-27 8:42 ` [PATCH 3.4 45/91] Input: i8042 - add Fujitsu U574 to no_timeout dmi table lizf
2014-11-27 8:42 ` [PATCH 3.4 46/91] Input: i8042 - add nomux quirk for Avatar AVIU-145A6 lizf
2014-11-27 8:42 ` [PATCH 3.4 47/91] iscsi-target: Fix memory corruption in iscsit_logout_post_handler_diffcid lizf
2014-11-27 8:42 ` [PATCH 3.4 48/91] iscsi-target: avoid NULL pointer in iscsi_copy_param_list failure lizf
2014-11-27 8:42 ` [PATCH 3.4 49/91] NFSv4: Fix another bug in the close/open_downgrade code lizf
2014-11-27 8:42 ` [PATCH 3.4 50/91] libiscsi: fix potential buffer overrun in __iscsi_conn_send_pdu lizf
2014-11-27 8:42 ` [PATCH 3.4 51/91] USB: storage: Add quirk for Adaptec USBConnect 2000 USB-to-SCSI Adapter lizf
2014-11-27 8:42 ` [PATCH 3.4 52/91] USB: storage: Add quirk for Ariston Technologies iConnect USB to SCSI adapter lizf
2014-11-27 8:42 ` [PATCH 3.4 53/91] USB: storage: Add quirks for Entrega/Xircom USB to SCSI converters lizf
2014-11-27 8:42 ` [PATCH 3.4 54/91] can: flexcan: mark TX mailbox as TX_INACTIVE lizf
2014-11-27 8:42 ` [PATCH 3.4 55/91] can: flexcan: correctly initialize mailboxes lizf
2014-11-27 8:42 ` [PATCH 3.4 56/91] can: flexcan: implement workaround for errata ERR005829 lizf
2014-11-27 8:42 ` [PATCH 3.4 57/91] can: flexcan: put TX mailbox into TX_INACTIVE mode after tx-complete lizf
2014-11-27 8:42 ` [PATCH 3.4 58/91] can: at91_can: add missing prepare and unprepare of the clock lizf
2014-11-27 8:42 ` [PATCH 3.4 59/91] ALSA: pcm: fix fifo_size frame calculation lizf
2014-11-27 8:42 ` [PATCH 3.4 60/91] Fix nasty 32-bit overflow bug in buffer i/o code lizf
2014-11-27 8:42 ` [PATCH 3.4 61/91] parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds lizf
2014-11-27 8:42 ` [PATCH 3.4 62/91] sched: Fix unreleased llc_shared_mask bit during CPU hotplug lizf
2014-11-27 8:42 ` [PATCH 3.4 63/91] sched: add macros to define bitops for task atomic flags lizf
2014-11-27 8:42 ` [PATCH 3.4 64/91] cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be " lizf
2014-11-27 8:42 ` [PATCH 3.4 65/91] MIPS: mcount: Adjust stack pointer for static trace in MIPS32 lizf
2014-11-27 8:42 ` [PATCH 3.4 66/91] nilfs2: fix data loss with mmap() lizf
2014-11-27 8:42 ` [PATCH 3.4 67/91] ocfs2/dlm: do not get resource spinlock if lockres is new lizf
2014-11-27 8:42 ` [PATCH 3.4 68/91] shmem: fix nlink for rename overwrite directory lizf
2014-11-27 8:42 ` [PATCH 3.4 69/91] ARM: 8165/1: alignment: don't break misaligned NEON load/store lizf
2014-11-27 8:42 ` [PATCH 3.4 70/91] ASoC: core: fix possible ZERO_SIZE_PTR pointer dereferencing error lizf
2014-11-27 8:42 ` [PATCH 3.4 71/91] mm: migrate: Close race between migration completion and mprotect lizf
2014-11-27 8:42 ` [PATCH 3.4 72/91] perf: fix perf bug in fork() lizf
2014-11-27 8:42 ` [PATCH 3.4 73/91] init/Kconfig: Hide printk log config if CONFIG_PRINTK=n lizf
2014-11-27 8:42 ` [PATCH 3.4 74/91] genhd: fix leftover might_sleep() in blk_free_devt() lizf
2014-11-27 8:42 ` [PATCH 3.4 75/91] nl80211: clear skb cb before passing to netlink lizf
2014-11-27 8:42 ` [PATCH 3.4 76/91] ext4: propagate errors up to ext4_find_entry()'s callers lizf
2014-11-27 8:43 ` [PATCH 3.4 77/91] ext4: avoid trying to kfree an ERR_PTR pointer lizf
2014-11-27 8:43 ` [PATCH 3.4 78/91] NFS: fix stable regression lizf
2014-11-27 8:43 ` [PATCH 3.4 79/91] perf: Handle compat ioctl lizf
2014-11-27 8:43 ` [PATCH 3.4 80/91] bluetooth: hci_ldisc: fix deadlock condition lizf
2014-11-27 8:43 ` [PATCH 3.4 81/91] mnt: Only change user settable mount flags in remount lizf
2014-11-27 8:43 ` [PATCH 3.4 82/91] dm crypt: fix access beyond the end of allocated space lizf
2014-11-27 8:43 ` [PATCH 3.4 83/91] Fix spurious request sense in error handling lizf
2014-11-27 8:43 ` [PATCH 3.4 84/91] ipv4: move route garbage collector to work queue lizf
2014-11-27 8:43 ` [PATCH 3.4 85/91] ipv4: avoid parallel route cache gc executions lizf
2014-11-27 8:43 ` [PATCH 3.4 86/91] ipv4: disable bh while doing route gc lizf
2014-11-27 8:43 ` [PATCH 3.4 87/91] rtl8192ce: Fix null dereference in watchdog lizf
2014-11-27 8:43 ` [PATCH 3.4 88/91] ipv6: reuse ip6_frag_id from ip6_ufo_append_data lizf
2014-11-27 8:43 ` lizf [this message]
2014-11-27 8:43 ` [PATCH 3.4 90/91] ixgbevf: Prevent RX/TX statistics getting reset to zero lizf
2014-11-27 8:43 ` [PATCH 3.4 91/91] l2tp: fix race while getting PMTU on PPP pseudo-wire lizf
2014-11-27 9:07 ` [PATCH 3.4 00/91] 3.4.105-rc1 review Zefan Li
2014-11-27 20:15 ` Guenter Roeck
2014-11-29 5:54 ` Zefan Li
-- strict thread matches above, loose matches on Subject: below --
2015-01-28 4:07 [PATCH 3.4 000/177] 3.4.106-rc1 review lizf
2015-01-28 4:09 ` [PATCH 3.4 89/91] net: Do not enable tx-nocache-copy by default lizf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1417077794-9299-89-git-send-email-lizf@kernel.org \
--to=lizf@kernel.org \
--cc=bpoirier@suse.de \
--cc=davem@davemloft.net \
--cc=linux-kernel@vger.kernel.org \
--cc=lizefan@huawei.com \
--cc=stable@vger.kernel.org \
--cc=therbert@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).