qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Broken live Migration in Qemu 2.5.1.1?
From: Peter Lieven @ 2016-06-06 13:32 UTC
  To: qemu-devel@nongnu.org, qemu-stable

Hi,

during internal testing of QEMU 2.5.1.1 I found that a slave SQL server on a vServer running Ubuntu 12.04 (kernel 3.13)
stops replicating from the master after a live migration. This seems to be reproducible. Replication can be resumed by issuing a slave stop / slave start.
There is no error visible on the vServer.

Does anyone have a fix in mind that could be related to such an issue?

The host kernel is Linux 4.4, the guest kernel is 3.13. The guest block driver is virtio-blk via iSCSI. The emulated vCPU is Westmere.

I already have this patch applied here:

target-i386: do not read/write MSR_TSC_AUX from KVM if CPUID bit is not set

Without it, migration for Westmere and earlier CPUs is completely broken.

I also found the following commits in git which are not applied (yet):

migration (ordinary): move bdrv_invalidate_cache_all of of coroutine context
migration: fix incorrect memory_global_dirty_log_start outside BQL

I will try with these and also with current master.

Thanks,
Peter


* Re: [Qemu-devel] Broken live Migration in Qemu 2.5.1.1?
From: Peter Lieven @ 2016-06-06 15:51 UTC
  To: qemu-devel@nongnu.org, qemu-stable

On 06.06.2016 at 15:32, Peter Lieven wrote:
> Hi,
>
> during internal testing of QEMU 2.5.1.1 I found that a slave SQL server on a vServer running Ubuntu 12.04 (kernel 3.13)
> stops replicating from the master after a live migration. This seems to be reproducible. Replication can be resumed by issuing a slave stop / slave start.
> There is no error visible on the vServer.
>
> Does anyone have a fix in mind that could be related to such an issue?
>
> The host kernel is Linux 4.4, the guest kernel is 3.13. The guest block driver is virtio-blk via iSCSI. The emulated vCPU is Westmere.

After a lot of testing I found that this is not a block driver problem, but a regression in virtio-net or the network stack.

qemu_announce_self() is generating packets for all NICs, but it seems they are no longer emitted. This worked at least in qemu-2.2.0 with
the same guest kernel and host kernel.
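
For readers unfamiliar with the announce packets: they are RARP frames (see the bisect result later in this thread), broadcast so that switches learn where the guest's MAC address now lives. Below is a minimal Python sketch of such a frame, following the RARP layout from RFC 903. It is an illustration only, not QEMU's code; the function name build_rarp_announce, the sample MAC address, and the padding to 60 bytes (minimum Ethernet frame size without FCS) are assumptions made for this example.

```python
import struct

ETH_P_RARP = 0x8035        # EtherType assigned to RARP (RFC 903)
ARP_HTYPE_ETH = 1          # hardware type: Ethernet
ARP_PTYPE_IP = 0x0800      # protocol type: IPv4
RARP_REQUEST_REVERSE = 3   # opcode: "request reverse"


def build_rarp_announce(mac: bytes) -> bytes:
    """Build a broadcast RARP frame announcing `mac` (hypothetical helper)."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")

    # Ethernet header: broadcast destination, our MAC as source, RARP EtherType.
    eth = b"\xff" * 6 + mac + struct.pack("!H", ETH_P_RARP)

    # RARP header: hardware type, protocol type, address lengths, opcode.
    rarp = struct.pack("!HHBBH", ARP_HTYPE_ETH, ARP_PTYPE_IP, 6, 4,
                       RARP_REQUEST_REVERSE)
    # Sender and target hardware address are both our MAC; IP fields are zero.
    rarp += mac + b"\x00" * 4 + mac + b"\x00" * 4

    # Pad with zeros to 60 bytes, the minimum Ethernet frame size without FCS.
    return (eth + rarp).ljust(60, b"\x00")


frame = build_rarp_announce(bytes.fromhex("525400123456"))
print(len(frame), frame[12:14].hex())
```

If frames like this never leave the host after migration, the switch keeps forwarding traffic for the guest's MAC to the old port until the guest itself sends something, which would fit a long-lived, mostly idle connection stalling.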

I will continue debugging tomorrow to find out why this happens.

Peter


* Re: [Qemu-devel] [Qemu-stable] Broken live Migration in Qemu 2.5.1.1?
From: Stefan Priebe - Profihost AG @ 2016-06-06 16:13 UTC
  To: Peter Lieven; +Cc: qemu-devel@nongnu.org, qemu-stable

We are most probably seeing the same issue when migrating a machine running balanceng, but we had not considered that this might be a QEMU bug. Instead, we have been investigating it with the balanceng people.

Waiting for your further results.

Greets,
Stefan

Excuse my typo sent from my mobile phone.



* Re: [Qemu-devel] [Qemu-stable] Broken live Migration in Qemu 2.5.1.1?
From: Peter Lieven @ 2016-06-07  7:11 UTC
  To: Stefan Priebe - Profihost AG
  Cc: qemu-devel@nongnu.org, qemu-stable, yanghy, jasowang

On 06.06.2016 at 18:13, Stefan Priebe - Profihost AG wrote:
> We are most probably seeing the same issue when migrating a machine running balanceng, but we had not considered that this might be a QEMU bug. Instead, we have been investigating it with the balanceng people.
>
> Waiting for your further results.

This is clearly a regression, introduced between v2.4.0 and v2.5.0. RARPs are no longer emitted starting with this commit:

fefe2a78abde932e0f340b21bded2c86def1d242 is the first bad commit
commit fefe2a78abde932e0f340b21bded2c86def1d242
Author: Yang Hongyang <yanghy@cn.fujitsu.com>
Date:   Wed Oct 7 11:52:16 2015 +0800

     net: merge qemu_deliver_packet and qemu_deliver_packet_iov

     qemu_deliver_packet_iov already have the compat delivery, we
     can drop qemu_deliver_packet.

     Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
     Signed-off-by: Jason Wang <jasowang@redhat.com>
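
The "is the first bad commit" line above is the output format of git bisect, which binary-searches the history between a known-good and a known-bad revision. As a rough sketch of that search, assuming a linear history and a test that is good for every commit before the regression and bad from then on, it could be written like this in Python. The names first_bad and is_bad and the commit labels are hypothetical and only serve the illustration.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in `commits` (ordered oldest to newest).

    Assumes commits[0] is known good and commits[-1] is known bad, as with
    v2.4.0 and v2.5.0 in this thread.
    """
    lo, hi = 0, len(commits) - 1   # lo: last known good, hi: first known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid               # regression is at mid or earlier
        else:
            lo = mid               # regression is after mid
    return commits[hi]


# Hypothetical history where the regression appears at "c6".
history = [f"c{i}" for i in range(10)]
print(first_bad(history, lambda c: int(c[1:]) >= 6))
```

In this thread, the is_bad check would correspond to building that revision, migrating a guest, and checking on the wire whether RARP frames are still sent.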


I will analyze further.

Peter


* Re: [Qemu-devel] [Qemu-stable] Broken live Migration in Qemu 2.5.1.1?
From: Peter Lieven @ 2016-06-07  7:38 UTC
  To: Stefan Priebe - Profihost AG; +Cc: qemu-devel@nongnu.org, qemu-stable

On 06.06.2016 at 18:13, Stefan Priebe - Profihost AG wrote:
> We are most probably seeing the same issue when migrating a machine running balanceng, but we had not considered that this might be a QEMU bug. Instead, we have been investigating it with the balanceng people.
>
> Waiting for your further results.

Can you try the patch

net: fix qemu_announce_self not emitting packets

that I just sent to the mailing list?

Thanks,
Peter


* Re: [Qemu-devel] [Qemu-stable] Broken live Migration in Qemu 2.5.1.1?
From: Stefan Priebe - Profihost AG @ 2016-06-07  8:33 UTC
  To: Peter Lieven; +Cc: qemu-devel@nongnu.org, qemu-stable


On 07.06.2016 at 09:38, Peter Lieven wrote:
> On 06.06.2016 at 18:13, Stefan Priebe - Profihost AG wrote:
>> We are most probably seeing the same issue when migrating a machine running
>> balanceng, but we had not considered that this might be a QEMU bug. Instead,
>> we have been investigating it with the balanceng people.
>>
>> Waiting for your further results.
> 
> Can you try the patch
> 
> net: fix qemu_announce_self not emitting packets

Thanks. I will wait for a new patch based on Paolo's comments, if that's OK.

Stefan

> I just sent to the mailing list.
> 
> Thanks,
> Peter
> 
