From: Ingo Krabbe
Date: Fri, 6 May 2016 06:34:33 +0200
Subject: Re: [Qemu-devel] TCP Segmentation Offloading
To: qemu-devel@nongnu.org
Cc: mst@redhat.com, jasowang@redhat.com
In-Reply-To: <20160505174203.GC14181@stefanha-x1.localdomain>

> On Sun, May 01, 2016 at 02:31:57PM +0200, Ingo Krabbe wrote:
>> Good Mayday Qemu Developers,
>>
>> Today I tried to find a reference to a networking problem that seems
>> to be of a quite general nature: TCP Segmentation Offloading (TSO) in
>> virtual environments.
>>
>> When I set up a TAP network adapter for a virtual machine and put it
>> into a host bridge, the known best practice is to manually set
>> "tso off gso off" with ethtool: for the guest driver if I use a
>> hardware emulation such as e1000, and/or for the host driver and/or
>> the bridge adapter if I use the virtio driver. Otherwise you
>> (sometimes?) experience performance problems or even lost packets.
>
> I can't parse this sentence. In what cases do you think it's a "known
> best practice" to disable tso and gso? Maybe a table would be a clearer
> way to communicate this.
>
> Can you provide a link to the source claiming tso and gso should be
> disabled?

Sorry for that long sentence. The gist is that it seems most stable to
turn off tso and gso both on host bridges and on the adapters inside
the virtual machines.

One of the most comprehensive collections of arguments is this article:

https://kris.io/2015/10/01/kvm-network-performance-tso-and-gso-turn-it-off/

I also found corresponding documentation for CentOS 6:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/ch10s04.html

The topic is discussed in the Ganeti project wiki as well:

https://code.google.com/p/ganeti/wiki/PerformanceTuning

And of course the same advice is given for Xen machines:

http://cloudnull.io/2012/07/xenserver-network-tuning/

As you can see, there are several links on the internet, and my first
question is: why can't I find this discussion in the qemu wiki space?
I think the bug https://bugs.launchpad.net/bugs/1202289 is related.
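Concretely, the advice in those links boils down to something like the
following sketch (br0, tap0 and eth0 are placeholder names for the
actual host bridge, tap device and guest NIC; the feature names printed
by "ethtool -k" vary a bit between ethtool versions):

    # On the host: disable segmentation offloads on the bridge and on
    # the tap device that backs the guest NIC
    ethtool -K br0 tso off gso off
    ethtool -K tap0 tso off gso off

    # Inside the guest, against the emulated (e1000) or virtio NIC
    ethtool -K eth0 tso off gso off

    # Verify: both offloads should now report "off"
    ethtool -k eth0 | grep segmentation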
>> I haven't found a complete analysis of the background of these
>> problems, but there seem to be some effects on MTU-based
>> fragmentation and UDP checksums.
>>
>> There is a TSO-related bug on Launchpad, but the context of that bug
>> is too narrow for the generality of the problem.
>>
>> It also seems that there is a problem in LXC contexts too (I found
>> such a reference, without a detailed description, in a post about a
>> Xen setup).
>>
>> My question now is: is there a bug in the driver code, and shouldn't
>> this be documented somewhere on wiki.qemu.org? Were there
>> developments on this topic in the past, or is there any
>> planned/ongoing work to do on the qemu drivers?
>>
>> Most problem reports found relate to deprecated CentOS 6 qemu-kvm
>> packages.
>>
>> In our company we have similar or even worse problems with CentOS 7
>> hosts and guest machines.
>
> You haven't explained what problem you are experiencing. If you want
> help with your setup please include your QEMU command-line (ps aux |
> grep qemu), the traffic pattern (ideally how to reproduce it with a
> benchmarking tool), and what observation you are making (e.g. netstat
> counters showing dropped packets).
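For reference, this is roughly how we collect that information on our
side (a sketch; ens3 is the guest NIC from the log further down):

    # QEMU command line of the running guest
    # (the [q] avoids matching the grep process itself)
    ps aux | grep [q]emu

    # Per-interface packet, error and drop counters in the guest
    ip -s link show ens3

    # Protocol-level counters (retransmissions, bad segments, ...)
    netstat -s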
I was quite astonished about the many hints concerning virtio drivers,
as we hit this problem with the e1000 driver in a CentOS 7 guest on a
CentOS 6 host:

e1000 0000:00:03.0 ens3: Detected Tx Unit Hang
  Tx Queue <0>
  TDH <42>
  TDT <42>
  next_to_use <2e>
  next_to_clean <42>
buffer_info[next_to_clean]
  time_stamp <104aff1b8>
  next_to_watch <44>
  jiffies <104b00ee9>
  next_to_watch.status <0>
Apr 25 21:08:48 db03 kernel: ------------[ cut here ]------------
Apr 25 21:08:48 db03 kernel: WARNING: at net/sched/sch_generic.c:297 dev_watchdog+0x270/0x280()
Apr 25 21:08:48 db03 kernel: NETDEV WATCHDOG: ens3 (e1000): transmit queue 0 timed out
Apr 25 21:08:48 db03 kernel: Modules linked in: binfmt_misc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btrfs zlib_deflate raid6_pq xor ext4 mbcache jbd2 crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper i2c_piix4 ppdev cryptd pcspkr virtio_balloon parport_pc parport sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crct10dif_pclmul crct10dif_common ata_piix crc32c_intel virtio_pci e1000 i2c_core virtio_ring libata serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod
Apr 25 21:08:48 db03 kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.0-327.13.1.el7.x86_64 #1
Apr 25 21:08:48 db03 kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
Apr 25 21:08:48 db03 kernel: ffff88126f483d88 685d892e8a452abb ffff88126f483d40 ffffffff8163571c
Apr 25 21:08:48 db03 kernel: ffff88126f483d78 ffffffff8107b200 0000000000000000 ffff881203b9a000
Apr 25 21:08:48 db03 kernel: ffff881201c3e080 0000000000000001 0000000000000002 ffff88126f483de0
Apr 25 21:08:48 db03 kernel: Call Trace:
Apr 25 21:08:48 db03 kernel: [] dump_stack+0x19/0x1b
Apr 25 21:08:48 db03 kernel: [] warn_slowpath_common+0x70/0xb0
Apr 25 21:08:48 db03 kernel: [] warn_slowpath_fmt+0x5c/0x80
Apr 25 21:08:48 db03 kernel: [] dev_watchdog+0x270/0x280
Apr 25 21:08:48 db03 kernel: [] ? dev_graft_qdisc+0x80/0x80
Apr 25 21:08:48 db03 kernel: [] call_timer_fn+0x36/0x110
Apr 25 21:08:48 db03 kernel: [] ? dev_graft_qdisc+0x80/0x80
Apr 25 21:08:48 db03 kernel: [] run_timer_softirq+0x237/0x340
Apr 25 21:08:48 db03 kernel: [] __do_softirq+0xef/0x280
Apr 25 21:08:48 db03 kernel: [] call_softirq+0x1c/0x30
Apr 25 21:08:48 db03 kernel: [] do_softirq+0x65/0xa0
Apr 25 21:08:48 db03 kernel: [] irq_exit+0x115/0x120
Apr 25 21:08:48 db03 kernel: [] smp_apic_timer_interrupt+0x45/0x60
Apr 25 21:08:48 db03 kernel: [] apic_timer_interrupt+0x6d/0x80
Apr 25 21:08:48 db03 kernel: [] ? native_safe_halt+0x6/0x10
Apr 25 21:08:48 db03 kernel: [] default_idle+0x1f/0xc0
Apr 25 21:08:48 db03 kernel: [] arch_cpu_idle+0x26/0x30
Apr 25 21:08:48 db03 kernel: [] cpu_startup_entry+0x245/0x290
Apr 25 21:08:48 db03 kernel: [] start_secondary+0x1ba/0x230
Apr 25 21:08:48 db03 kernel: ---[ end trace 71ac4360272e207e ]---
Apr 25 21:08:48 db03 kernel: e1000 0000:00:03.0 ens3: Reset adapter

I'm still not sure why this happens on host "db03" while db02 and db01
are not affected. All guests are running on different hosts and the
network is controlled by an openvswitch.
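A quick way to check whether the offload settings actually differ
between the three hosts would be something like this (a sketch; it
assumes eth0 is the uplink attached to the openvswitch bridge on each
host):

    # Compare segmentation offload state across the three hosts
    for h in db01 db02 db03; do
        echo "== $h =="
        ssh "$h" "ethtool -k eth0 | grep segmentation"
    done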