* [Qemu-devel] TCP Segmentation Offloading
@ 2016-05-01 12:31 Ingo Krabbe
2016-05-05 17:42 ` Stefan Hajnoczi
0 siblings, 1 reply; 5+ messages in thread
From: Ingo Krabbe @ 2016-05-01 12:31 UTC (permalink / raw)
To: qemu-devel
Good Mayday Qemu Developers,
today I tried to find a reference to a networking problem that seems to be quite general in nature: TCP Segmentation Offloading (TSO) in virtual environments.
When I setup TAP network adapter for a virtual machine and put it into a host bridge, the known best practice is to manually set "tso off gso off" with ethtool, for the guest driver if I use a hardware emulation, such as e1000 and/or "tso off gso off" for the host driver and/or for the bridge adapter, if I use the virtio driver, as otherwise you experience (sometimes?) performance problems or even lost packages.
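For reference, the kind of commands I mean are roughly the following; the bridge, tap and guest interface names are only placeholders, the real device names depend on the setup:

  # on the host: the bridge and the guest's tap device (placeholder names)
  ethtool -K br0 tso off gso off
  ethtool -K vnet0 tso off gso off
  # inside the guest: the emulated or virtio NIC (placeholder name)
  ethtool -K eth0 tso off gso off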
I haven't found a complete analysis of the background of these problems, but there seem to be some effects on MTU-based fragmentation and UDP checksums.
There is a TSO-related bug on Launchpad, but the context of that bug is too narrow for the generality of the problem.
It also seems that there is a problem in LXC contexts (I found such a reference, without a detailed description, in a post about a Xen setup).
My question now is: Is there a bug in the driver code, and shouldn't this be documented somewhere on wiki.qemu.org? Were there developments on this topic in the past, or is there any planned/ongoing work to do on the QEMU drivers?
Most of the problem reports I found relate to deprecated CentOS 6 qemu-kvm packages.
In our company we have similar or even worse problems with CentOS 7 hosts and guest machines.
I'm going to analyze these problems next week anyway and I would be happy to share my observations with you. (Where can I register for the wiki, or to whom should I send my reports about this topic?)
Regards,
Ingo Krabbe
* Re: [Qemu-devel] TCP Segmentation Offloading
2016-05-01 12:31 [Qemu-devel] TCP Segmentation Offloading Ingo Krabbe
@ 2016-05-05 17:42 ` Stefan Hajnoczi
2016-05-06 4:34 ` Ingo Krabbe
0 siblings, 1 reply; 5+ messages in thread
From: Stefan Hajnoczi @ 2016-05-05 17:42 UTC (permalink / raw)
To: Ingo Krabbe; +Cc: qemu-devel, Michael S. Tsirkin, jasowang
On Sun, May 01, 2016 at 02:31:57PM +0200, Ingo Krabbe wrote:
> Good Mayday Qemu Developers,
>
> today I tried to find a reference to a networking problem that seems to be quite general in nature: TCP Segmentation Offloading (TSO) in virtual environments.
>
> When I setup TAP network adapter for a virtual machine and put it into a host bridge, the known best practice is to manually set "tso off gso off" with ethtool, for the guest driver if I use a hardware emulation, such as e1000 and/or "tso off gso off" for the host driver and/or for the bridge adapter, if I use the virtio driver, as otherwise you experience (sometimes?) performance problems or even lost packages.
I can't parse this sentence. In what cases do you think it's a "known
best practice" to disable tso and gso? Maybe a table would be a clearer
way to communicate this.
Can you provide a link to the source claiming tso and gso should be
disabled?
> I haven't found a complete analysis of the background of these problems, but there seem to be some effects on MTU-based fragmentation and UDP checksums.
>
> There is a TSO-related bug on Launchpad, but the context of that bug is too narrow for the generality of the problem.
>
> It also seems that there is a problem in LXC contexts (I found such a reference, without a detailed description, in a post about a Xen setup).
>
> My question now is: Is there a bug in the driver code, and shouldn't this be documented somewhere on wiki.qemu.org? Were there developments on this topic in the past, or is there any planned/ongoing work to do on the QEMU drivers?
>
> Most of the problem reports I found relate to deprecated CentOS 6 qemu-kvm packages.
>
> In our company we have similar or even worse problems with CentOS 7 hosts and guest machines.
You haven't explained what problem you are experiencing. If you want
help with your setup please include your QEMU command-line (ps aux |
grep qemu), the traffic pattern (ideally how to reproduce it with a
benchmarking tool), and what observation you are making (e.g. netstat
counters showing dropped packets).
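For example, something along these lines would be enough; iperf3 is only one possible traffic generator, and the interface name below is a placeholder:

  ps aux | grep qemu              # QEMU command-line on the host
  iperf3 -s                       # traffic receiver
  iperf3 -c <receiver-ip> -t 60   # traffic sender, 60 seconds
  netstat -i                      # per-interface error/drop counters
  netstat -s                      # protocol counters
  ip -s link show eth0            # per-interface statistics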
> I'm going to analyze these problems next week anyway and I would be happy to share my observations with you. (Where can I register for the wiki, or to whom should I send my reports about this topic?)
I have CCed Michael Tsirkin and Jason Wang. They do most of the
virtio-net development.
>
> Regards,
>
> Ingo Krabbe
>
>
* Re: [Qemu-devel] TCP Segmentation Offloading
2016-05-05 17:42 ` Stefan Hajnoczi
@ 2016-05-06 4:34 ` Ingo Krabbe
2016-05-06 16:28 ` Stefan Hajnoczi
0 siblings, 1 reply; 5+ messages in thread
From: Ingo Krabbe @ 2016-05-06 4:34 UTC (permalink / raw)
To: qemu-devel; +Cc: mst, jasowang
> On Sun, May 01, 2016 at 02:31:57PM +0200, Ingo Krabbe wrote:
>> Good Mayday Qemu Developers,
>>
>> today I tried to find a reference to a networking problem that seems to be quite general in nature: TCP Segmentation Offloading (TSO) in virtual environments.
>>
>> When I setup TAP network adapter for a virtual machine and put it into a host bridge, the known best practice is to manually set "tso off gso off" with ethtool, for the guest driver if I use a hardware emulation, such as e1000 and/or "tso off gso off" for the host driver and/or for the bridge adapter, if I use the virtio driver, as otherwise you experience (sometimes?) performance problems or even lost packages.
>
> I can't parse this sentence. In what cases do you think it's a "known
> best practice" to disable tso and gso? Maybe a table would be a clearer
> way to communicate this.
>
> Can you provide a link to the source claiming tso and gso should be
> disabled?
Sorry for that long sentence. The conclusion seems to be that it is most stable to turn off TSO and GSO for host bridges and for adapters in virtual machines.
One of the most comprehensive collections of arguments is this article
https://kris.io/2015/10/01/kvm-network-performance-tso-and-gso-turn-it-off/
while I also found documentation for CentOS 6
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/ch10s04.html
On Google Code this one is discussed
https://code.google.com/p/ganeti/wiki/PerformanceTuning
Of course the same is found for Xen machines
http://cloudnull.io/2012/07/xenserver-network-tuning/
As you can see, there are several links on the internet, and my first question is: Why can't I find this discussion in the QEMU wiki?
I think the bug
https://bugs.launchpad.net/bugs/1202289
is related.
>> I haven't found a complete analysis of the background of these problems, but there seem to be some effects on MTU-based fragmentation and UDP checksums.
>>
>> There is a TSO-related bug on Launchpad, but the context of that bug is too narrow for the generality of the problem.
>>
>> It also seems that there is a problem in LXC contexts (I found such a reference, without a detailed description, in a post about a Xen setup).
>>
>> My question now is: Is there a bug in the driver code, and shouldn't this be documented somewhere on wiki.qemu.org? Were there developments on this topic in the past, or is there any planned/ongoing work to do on the QEMU drivers?
>>
>> Most of the problem reports I found relate to deprecated CentOS 6 qemu-kvm packages.
>>
>> In our company we have similar or even worse problems with CentOS 7 hosts and guest machines.
>
> You haven't explained what problem you are experiencing. If you want
> help with your setup please include your QEMU command-line (ps aux |
> grep qemu), the traffic pattern (ideally how to reproduce it with a
> benchmarking tool), and what observation you are making (e.g. netstat
> counters showing dropped packets).
I was quite astonished by the many hints about virtio drivers, as we had this problem with the e1000 driver in a CentOS 7 guest on a CentOS 6 host.
e1000 0000:00:03.0 ens3: Detected Tx Unit Hang#012 Tx Queue <0>#012 TDH <42>#012 TDT <42>#012 next_to_use <2e>#012 next_to_clean <42>#012buffer_info[next_to_clean]#012 time_stamp <104aff1b8>#012 next_to_watch <44>#012 jiffies <104b00ee9>#012 next_to_watch.status <0>
Apr 25 21:08:48 db03 kernel: ------------[ cut here ]------------
Apr 25 21:08:48 db03 kernel: WARNING: at net/sched/sch_generic.c:297 dev_watchdog+0x270/0x280()
Apr 25 21:08:48 db03 kernel: NETDEV WATCHDOG: ens3 (e1000): transmit queue 0 timed out
Apr 25 21:08:48 db03 kernel: Modules linked in: binfmt_misc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btrfs zlib_deflate raid6_pq xor ext4 mbcache jbd2 crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper i2c_piix4 ppdev cryptd pcspkr virtio_balloon parport_pc parport sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crct10dif_pclmul crct10dif_common ata_piix crc32c_intel virtio_pci e1000 i2c_core virtio_ring libata serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod
Apr 25 21:08:48 db03 kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.0-327.13.1.el7.x86_64 #1
Apr 25 21:08:48 db03 kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
Apr 25 21:08:48 db03 kernel: ffff88126f483d88 685d892e8a452abb ffff88126f483d40 ffffffff8163571c
Apr 25 21:08:48 db03 kernel: ffff88126f483d78 ffffffff8107b200 0000000000000000 ffff881203b9a000
Apr 25 21:08:48 db03 kernel: ffff881201c3e080 0000000000000001 0000000000000002 ffff88126f483de0
Apr 25 21:08:48 db03 kernel: Call Trace:
Apr 25 21:08:48 db03 kernel: <IRQ> [<ffffffff8163571c>] dump_stack+0x19/0x1b
Apr 25 21:08:48 db03 kernel: [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
Apr 25 21:08:48 db03 kernel: [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
Apr 25 21:08:48 db03 kernel: [<ffffffff8154cd40>] dev_watchdog+0x270/0x280
Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ? dev_graft_qdisc+0x80/0x80
Apr 25 21:08:48 db03 kernel: [<ffffffff8108b0a6>] call_timer_fn+0x36/0x110
Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ? dev_graft_qdisc+0x80/0x80
Apr 25 21:08:48 db03 kernel: [<ffffffff8108dd97>] run_timer_softirq+0x237/0x340
Apr 25 21:08:48 db03 kernel: [<ffffffff81084b0f>] __do_softirq+0xef/0x280
Apr 25 21:08:48 db03 kernel: [<ffffffff816477dc>] call_softirq+0x1c/0x30
Apr 25 21:08:48 db03 kernel: [<ffffffff81016fc5>] do_softirq+0x65/0xa0
Apr 25 21:08:48 db03 kernel: [<ffffffff81084ea5>] irq_exit+0x115/0x120
Apr 25 21:08:48 db03 kernel: [<ffffffff81648455>] smp_apic_timer_interrupt+0x45/0x60
Apr 25 21:08:48 db03 kernel: [<ffffffff81646b1d>] apic_timer_interrupt+0x6d/0x80
Apr 25 21:08:48 db03 kernel: <EOI> [<ffffffff81058e96>] ? native_safe_halt+0x6/0x10
Apr 25 21:08:48 db03 kernel: [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
Apr 25 21:08:48 db03 kernel: [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
Apr 25 21:08:48 db03 kernel: [<ffffffff810d6325>] cpu_startup_entry+0x245/0x290
Apr 25 21:08:48 db03 kernel: [<ffffffff810475fa>] start_secondary+0x1ba/0x230
Apr 25 21:08:48 db03 kernel: ---[ end trace 71ac4360272e207e ]---
Apr 25 21:08:48 db03 kernel: e1000 0000:00:03.0 ens3: Reset adapter
I'm still not sure why this happens on this host "db03", while db02 and db01 are not affected. All guests are running on different hosts and the network is controlled by Open vSwitch.
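To compare db03 with the unaffected hosts I plan to collect roughly the following on each of them; ens3 is the guest interface from the log above, the other commands run on the hosts:

  ethtool -k ens3                   # current offload settings of the guest NIC
  rpm -qa | grep -E 'qemu|kernel'   # software versions on the host
  ovs-vsctl show                    # openvswitch topology on the host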
* Re: [Qemu-devel] TCP Segmentation Offloading
2016-05-06 4:34 ` Ingo Krabbe
@ 2016-05-06 16:28 ` Stefan Hajnoczi
2016-05-09 12:12 ` Michael S. Tsirkin
0 siblings, 1 reply; 5+ messages in thread
From: Stefan Hajnoczi @ 2016-05-06 16:28 UTC (permalink / raw)
To: Ingo Krabbe; +Cc: qemu-devel, jasowang, mst
On Fri, May 06, 2016 at 06:34:33AM +0200, Ingo Krabbe wrote:
> > On Sun, May 01, 2016 at 02:31:57PM +0200, Ingo Krabbe wrote:
> >> Good Mayday Qemu Developers,
> >>
> >> today I tried to find a reference to a networking problem that seems to be quite general in nature: TCP Segmentation Offloading (TSO) in virtual environments.
> >>
> >> When I setup TAP network adapter for a virtual machine and put it into a host bridge, the known best practice is to manually set "tso off gso off" with ethtool, for the guest driver if I use a hardware emulation, such as e1000 and/or "tso off gso off" for the host driver and/or for the bridge adapter, if I use the virtio driver, as otherwise you experience (sometimes?) performance problems or even lost packages.
> >
> > I can't parse this sentence. In what cases do you think it's a "known
> > best practice" to disable tso and gso? Maybe a table would be a clearer
> > way to communicate this.
> >
> > Can you provide a link to the source claiming tso and gso should be
> > disabled?
>
> Sorry for that long sentence. The conclusion seems to be that it is most stable to turn off TSO and GSO for host bridges and for adapters in virtual machines.
>
> One of the most comprehensive collections of arguments is this article
>
> https://kris.io/2015/10/01/kvm-network-performance-tso-and-gso-turn-it-off/
>
> while I also found documentation for CentOS 6
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/ch10s04.html
This documentation is about (ancient) RHEL 3.9 guests. I would not
apply anything on that page to modern Linux distro releases without
re-checking.
>
> On Google Code this one is discussed
>
> https://code.google.com/p/ganeti/wiki/PerformanceTuning
>
> Of course the same is found for Xen machines
>
> http://cloudnull.io/2012/07/xenserver-network-tuning/
>
> As you can see, there are several links on the internet, and my first question is: Why can't I find this discussion in the QEMU wiki?
>
> I think the bug
>
> https://bugs.launchpad.net/bugs/1202289
>
> is related.
Thanks for posting all the links!
I hope Michael and/or Jason explain the current status for RHEL 6/7 and
other modern distros. Maybe they can also follow up with the kris.io
blog author if an update to the post is necessary.
TSO/GSO is enabled by default on my Fedora and RHEL host/guests. If it
was a best practice for those distros I'd expect the default settings to
reflect that. Also, I would be surprised if the offload features were
bad since work was put into supporting and extending them in virtio-net
over the years.
> >> I haven't found a complete analysis of the background of these problems, but there seem to be some effects on MTU-based fragmentation and UDP checksums.
> >>
> >> There is a TSO-related bug on Launchpad, but the context of that bug is too narrow for the generality of the problem.
> >>
> >> It also seems that there is a problem in LXC contexts (I found such a reference, without a detailed description, in a post about a Xen setup).
> >>
> >> My question now is: Is there a bug in the driver code, and shouldn't this be documented somewhere on wiki.qemu.org? Were there developments on this topic in the past, or is there any planned/ongoing work to do on the QEMU drivers?
> >>
> >> Most of the problem reports I found relate to deprecated CentOS 6 qemu-kvm packages.
> >>
> >> In our company we have similar or even worse problems with CentOS 7 hosts and guest machines.
> >
> > You haven't explained what problem you are experiencing. If you want
> > help with your setup please include your QEMU command-line (ps aux |
> > grep qemu), the traffic pattern (ideally how to reproduce it with a
> > benchmarking tool), and what observation you are making (e.g. netstat
> > counters showing dropped packets).
>
> I was quite astonished by the many hints about virtio drivers, as we had this problem with the e1000 driver in a CentOS 7 guest on a CentOS 6 host.
>
> e1000 0000:00:03.0 ens3: Detected Tx Unit Hang#012 Tx Queue <0>#012 TDH <42>#012 TDT <42>#012 next_to_use <2e>#012 next_to_clean <42>#012buffer_info[next_to_clean]#012 time_stamp <104aff1b8>#012 next_to_watch <44>#012 jiffies <104b00ee9>#012 next_to_watch.status <0>
> Apr 25 21:08:48 db03 kernel: ------------[ cut here ]------------
> Apr 25 21:08:48 db03 kernel: WARNING: at net/sched/sch_generic.c:297 dev_watchdog+0x270/0x280()
> Apr 25 21:08:48 db03 kernel: NETDEV WATCHDOG: ens3 (e1000): transmit queue 0 timed out
> Apr 25 21:08:48 db03 kernel: Modules linked in: binfmt_misc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btrfs zlib_deflate raid6_pq xor ext4 mbcache jbd2 crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper i2c_piix4 ppdev cryptd pcspkr virtio_balloon parport_pc parport sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crct10dif_pclmul crct10dif_common ata_piix crc32c_intel virtio_pci e1000 i2c_core virtio_ring libata serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod
> Apr 25 21:08:48 db03 kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.0-327.13.1.el7.x86_64 #1
> Apr 25 21:08:48 db03 kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> Apr 25 21:08:48 db03 kernel: ffff88126f483d88 685d892e8a452abb ffff88126f483d40 ffffffff8163571c
> Apr 25 21:08:48 db03 kernel: ffff88126f483d78 ffffffff8107b200 0000000000000000 ffff881203b9a000
> Apr 25 21:08:48 db03 kernel: ffff881201c3e080 0000000000000001 0000000000000002 ffff88126f483de0
> Apr 25 21:08:48 db03 kernel: Call Trace:
> Apr 25 21:08:48 db03 kernel: <IRQ> [<ffffffff8163571c>] dump_stack+0x19/0x1b
> Apr 25 21:08:48 db03 kernel: [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
> Apr 25 21:08:48 db03 kernel: [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
> Apr 25 21:08:48 db03 kernel: [<ffffffff8154cd40>] dev_watchdog+0x270/0x280
> Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ? dev_graft_qdisc+0x80/0x80
> Apr 25 21:08:48 db03 kernel: [<ffffffff8108b0a6>] call_timer_fn+0x36/0x110
> Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ? dev_graft_qdisc+0x80/0x80
> Apr 25 21:08:48 db03 kernel: [<ffffffff8108dd97>] run_timer_softirq+0x237/0x340
> Apr 25 21:08:48 db03 kernel: [<ffffffff81084b0f>] __do_softirq+0xef/0x280
> Apr 25 21:08:48 db03 kernel: [<ffffffff816477dc>] call_softirq+0x1c/0x30
> Apr 25 21:08:48 db03 kernel: [<ffffffff81016fc5>] do_softirq+0x65/0xa0
> Apr 25 21:08:48 db03 kernel: [<ffffffff81084ea5>] irq_exit+0x115/0x120
> Apr 25 21:08:48 db03 kernel: [<ffffffff81648455>] smp_apic_timer_interrupt+0x45/0x60
> Apr 25 21:08:48 db03 kernel: [<ffffffff81646b1d>] apic_timer_interrupt+0x6d/0x80
> Apr 25 21:08:48 db03 kernel: <EOI> [<ffffffff81058e96>] ? native_safe_halt+0x6/0x10
> Apr 25 21:08:48 db03 kernel: [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
> Apr 25 21:08:48 db03 kernel: [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
> Apr 25 21:08:48 db03 kernel: [<ffffffff810d6325>] cpu_startup_entry+0x245/0x290
> Apr 25 21:08:48 db03 kernel: [<ffffffff810475fa>] start_secondary+0x1ba/0x230
> Apr 25 21:08:48 db03 kernel: ---[ end trace 71ac4360272e207e ]---
> Apr 25 21:08:48 db03 kernel: e1000 0000:00:03.0 ens3: Reset adapter
>
>
> I'm still not sure why this happens on this host "db03", while db02 and db01 are not affected. All guests are running on different hosts and the network is controlled by Open vSwitch.
This looks interesting. It could be a bug in QEMU's e1000 NIC
emulation. Maybe it has already been fixed in qemu.git but I didn't see
any relevant commits.
Please post the RPM version numbers you are using (rpm -qa | grep qemu
in host, rpm -qa | grep kernel in host).
The e1000 driver can print additional information (to dump the contents
of the tx ring). Please increase your kernel's log level to collect
that information:
# echo 8 >/proc/sys/kernel/printk
The tx ring dump may allow someone to figure out why the packet caused
tx to stall.
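For example, something like this should capture the extended driver output the next time the hang occurs (assuming a dmesg that supports --follow; the log file path is arbitrary):

  sysctl -w kernel.printk=8                  # same effect as the echo above
  dmesg --follow >/var/log/e1000-hang.log &  # record the tx ring dump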
Stefan
* Re: [Qemu-devel] TCP Segmentation Offloading
2016-05-06 16:28 ` Stefan Hajnoczi
@ 2016-05-09 12:12 ` Michael S. Tsirkin
0 siblings, 0 replies; 5+ messages in thread
From: Michael S. Tsirkin @ 2016-05-09 12:12 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Ingo Krabbe, qemu-devel, jasowang
On Fri, May 06, 2016 at 05:28:55PM +0100, Stefan Hajnoczi wrote:
> On Fri, May 06, 2016 at 06:34:33AM +0200, Ingo Krabbe wrote:
> > > On Sun, May 01, 2016 at 02:31:57PM +0200, Ingo Krabbe wrote:
> > >> Good Mayday Qemu Developers,
> > >>
> > >> today I tried to find a reference to a networking problem that seems to be quite general in nature: TCP Segmentation Offloading (TSO) in virtual environments.
> > >>
> > >> When I setup TAP network adapter for a virtual machine and put it into a host bridge, the known best practice is to manually set "tso off gso off" with ethtool, for the guest driver if I use a hardware emulation, such as e1000 and/or "tso off gso off" for the host driver and/or for the bridge adapter, if I use the virtio driver, as otherwise you experience (sometimes?) performance problems or even lost packages.
> > >
> > > I can't parse this sentence. In what cases do you think it's a "known
> > > best practice" to disable tso and gso? Maybe a table would be a clearer
> > > way to communicate this.
> > >
> > > Can you provide a link to the source claiming tso and gso should be
> > > disabled?
> >
> > Sorry for that long sentence. The conclusion seems to be that it is most stable to turn off TSO and GSO for host bridges and for adapters in virtual machines.
> >
> > One of the most comprehensive collections of arguments is this article
> >
> > https://kris.io/2015/10/01/kvm-network-performance-tso-and-gso-turn-it-off/
> >
> > while I also found documentation for CentOS 6
> >
> > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Virtualization_Host_Configuration_and_Guest_Installation_Guide/ch10s04.html
>
> This documentation is about (ancient) RHEL 3.9 guests. I would not
> apply anything on that page to modern Linux distro releases without
> re-checking.
I think this refers to
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/6.5_Technical_Notes/kernel.html
which lists a couple of TSO bugs.
These should have been addressed by now, and I don't see anything
like this in RHEL7 docs.
> >
> > On Google Code this one is discussed
> >
> > https://code.google.com/p/ganeti/wiki/PerformanceTuning
> >
> > Of course the same is found for Xen machines
> >
> > http://cloudnull.io/2012/07/xenserver-network-tuning/
> >
> > As you can see, there are several links on the internet, and my first question is: Why can't I find this discussion in the QEMU wiki?
> >
> > I think the bug
> >
> > https://bugs.launchpad.net/bugs/1202289
> >
> > is related.
>
> Thanks for posting all the links!
>
> I hope Michael and/or Jason explain the current status for RHEL 6/7 and
> other modern distros. Maybe they can also follow up with the kris.io
> blog author if an update to the post is necessary.
>
> TSO/GSO is enabled by default on my Fedora and RHEL host/guests. If it
> was a best practice for those distros I'd expect the default settings to
> reflect that. Also, I would be surprised if the offload features were
> bad since work was put into supporting and extending them in virtio-net
> over the years.
The unfortunate side-effect of documenting work-arounds is that people
get used to using them. TSO, s/g and checksum offloads are advanced
features; as such, there's always a chance that using them makes you hit
a bug. Enabling them gives better performance for most users, so I think
that our defaults are good.
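For what it's worth, the current state of these offloads on a given interface can be checked with ethtool, e.g. (the interface name is only an example):

  ethtool -k eth0 | grep -E 'segmentation-offload|scatter-gather|checksumming'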
> > >> I haven't found a complete analysis of the background of these problems, but there seem to be some effects on MTU-based fragmentation and UDP checksums.
> > >>
> > >> There is a TSO-related bug on Launchpad, but the context of that bug is too narrow for the generality of the problem.
> > >>
> > >> It also seems that there is a problem in LXC contexts (I found such a reference, without a detailed description, in a post about a Xen setup).
> > >>
> > >> My question now is: Is there a bug in the driver code, and shouldn't this be documented somewhere on wiki.qemu.org? Were there developments on this topic in the past, or is there any planned/ongoing work to do on the QEMU drivers?
> > >>
> > >> Most of the problem reports I found relate to deprecated CentOS 6 qemu-kvm packages.
> > >>
> > >> In our company we have similar or even worse problems with CentOS 7 hosts and guest machines.
> > >
> > > You haven't explained what problem you are experiencing. If you want
> > > help with your setup please include your QEMU command-line (ps aux |
> > > grep qemu), the traffic pattern (ideally how to reproduce it with a
> > > benchmarking tool), and what observation you are making (e.g. netstat
> > > counters showing dropped packets).
> >
> > I was quite astonished by the many hints about virtio drivers, as we had this problem with the e1000 driver in a CentOS 7 guest on a CentOS 6 host.
> >
> > e1000 0000:00:03.0 ens3: Detected Tx Unit Hang#012 Tx Queue <0>#012 TDH <42>#012 TDT <42>#012 next_to_use <2e>#012 next_to_clean <42>#012buffer_info[next_to_clean]#012 time_stamp <104aff1b8>#012 next_to_watch <44>#012 jiffies <104b00ee9>#012 next_to_watch.status <0>
> > Apr 25 21:08:48 db03 kernel: ------------[ cut here ]------------
> > Apr 25 21:08:48 db03 kernel: WARNING: at net/sched/sch_generic.c:297 dev_watchdog+0x270/0x280()
> > Apr 25 21:08:48 db03 kernel: NETDEV WATCHDOG: ens3 (e1000): transmit queue 0 timed out
> > Apr 25 21:08:48 db03 kernel: Modules linked in: binfmt_misc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btrfs zlib_deflate raid6_pq xor ext4 mbcache jbd2 crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper i2c_piix4 ppdev cryptd pcspkr virtio_balloon parport_pc parport sg nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi virtio_scsi cirrus syscopyarea sysfillrect sysimgblt drm_kms_helper ttm drm crct10dif_pclmul crct10dif_common ata_piix crc32c_intel virtio_pci e1000 i2c_core virtio_ring libata serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod
> > Apr 25 21:08:48 db03 kernel: CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.10.0-327.13.1.el7.x86_64 #1
> > Apr 25 21:08:48 db03 kernel: Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
> > Apr 25 21:08:48 db03 kernel: ffff88126f483d88 685d892e8a452abb ffff88126f483d40 ffffffff8163571c
> > Apr 25 21:08:48 db03 kernel: ffff88126f483d78 ffffffff8107b200 0000000000000000 ffff881203b9a000
> > Apr 25 21:08:48 db03 kernel: ffff881201c3e080 0000000000000001 0000000000000002 ffff88126f483de0
> > Apr 25 21:08:48 db03 kernel: Call Trace:
> > Apr 25 21:08:48 db03 kernel: <IRQ> [<ffffffff8163571c>] dump_stack+0x19/0x1b
> > Apr 25 21:08:48 db03 kernel: [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
> > Apr 25 21:08:48 db03 kernel: [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
> > Apr 25 21:08:48 db03 kernel: [<ffffffff8154cd40>] dev_watchdog+0x270/0x280
> > Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ? dev_graft_qdisc+0x80/0x80
> > Apr 25 21:08:48 db03 kernel: [<ffffffff8108b0a6>] call_timer_fn+0x36/0x110
> > Apr 25 21:08:48 db03 kernel: [<ffffffff8154cad0>] ? dev_graft_qdisc+0x80/0x80
> > Apr 25 21:08:48 db03 kernel: [<ffffffff8108dd97>] run_timer_softirq+0x237/0x340
> > Apr 25 21:08:48 db03 kernel: [<ffffffff81084b0f>] __do_softirq+0xef/0x280
> > Apr 25 21:08:48 db03 kernel: [<ffffffff816477dc>] call_softirq+0x1c/0x30
> > Apr 25 21:08:48 db03 kernel: [<ffffffff81016fc5>] do_softirq+0x65/0xa0
> > Apr 25 21:08:48 db03 kernel: [<ffffffff81084ea5>] irq_exit+0x115/0x120
> > Apr 25 21:08:48 db03 kernel: [<ffffffff81648455>] smp_apic_timer_interrupt+0x45/0x60
> > Apr 25 21:08:48 db03 kernel: [<ffffffff81646b1d>] apic_timer_interrupt+0x6d/0x80
> > Apr 25 21:08:48 db03 kernel: <EOI> [<ffffffff81058e96>] ? native_safe_halt+0x6/0x10
> > Apr 25 21:08:48 db03 kernel: [<ffffffff8101dbcf>] default_idle+0x1f/0xc0
> > Apr 25 21:08:48 db03 kernel: [<ffffffff8101e4d6>] arch_cpu_idle+0x26/0x30
> > Apr 25 21:08:48 db03 kernel: [<ffffffff810d6325>] cpu_startup_entry+0x245/0x290
> > Apr 25 21:08:48 db03 kernel: [<ffffffff810475fa>] start_secondary+0x1ba/0x230
> > Apr 25 21:08:48 db03 kernel: ---[ end trace 71ac4360272e207e ]---
> > Apr 25 21:08:48 db03 kernel: e1000 0000:00:03.0 ens3: Reset adapter
> >
> >
> > I'm still not sure why this happens on this host "db03", while db02 and db01 are not affected. All guests are running on different hosts and the network is controlled by Open vSwitch.
>
> This looks interesting. It could be a bug in QEMU's e1000 NIC
> emulation. Maybe it has already been fixed in qemu.git but I didn't see
> any relevant commits.
>
> Please post the RPM version numbers you are using (rpm -qa | grep qemu
> in host, rpm -qa | grep kernel in host).
>
> The e1000 driver can print additional information (to dump the contents
> of the tx ring). Please increase your kernel's log level to collect
> that information:
> # echo 8 >/proc/sys/kernel/printk
>
> The tx ring dump may allow someone to figure out why the packet caused
> tx to stall.
>
> Stefan
Thread overview: 5+ messages
2016-05-01 12:31 [Qemu-devel] TCP Segmentation Offloading Ingo Krabbe
2016-05-05 17:42 ` Stefan Hajnoczi
2016-05-06 4:34 ` Ingo Krabbe
2016-05-06 16:28 ` Stefan Hajnoczi
2016-05-09 12:12 ` Michael S. Tsirkin