Date: Fri, 29 Jun 2018 14:41:26 +0530
From: Balamuruhan S
To: Juan Quintela
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PULL 16/16] migration: fix crash in when incoming client channel setup fails
Message-Id: <20180629091126.GA17425@localhost.localdomain>
In-Reply-To: <87h8lnw4pa.fsf@secure.laptop>
References: <20180627125604.15275-1-quintela@redhat.com> <20180627125604.15275-17-quintela@redhat.com> <20180628095547.GA24412@localhost.localdomain> <87h8lnw4pa.fsf@secure.laptop>

On Thu, Jun 28, 2018 at 01:06:25PM +0200, Juan Quintela wrote:
> Balamuruhan S wrote:
> > On Wed, Jun 27, 2018 at 02:56:04PM +0200, Juan Quintela wrote:
> >> From: Daniel P. Berrangé
>
> ....
>
> > Hi Juan,
> >
> > I tried to perform multifd enabled migration and from qemu monitor
> > enabled multifd capability on source and target,
> > (qemu) migrate_set_capability x-multifd on
> > (qemu) migrate -d tcp:127.0.0.1:4444
> >
> > The migration succeeds and it's cool to have the feature :)
>
> Thanks.
>
> > (qemu) info migrate
> > globals:
> > store-global-state: on
> > only-migratable: off
> > send-configuration: on
> > send-section-footer: on
> > decompress-error-check: on
> > capabilities: xbzrle: off rdma-pin-all: off auto-converge: off
> > zero-blocks: off compress: off events: off postcopy-ram: off x-colo:
> > off release-ram: off block: off return-path: off
> > pause-before-switchover: off x-multifd: on dirty-bitmaps: off
> > postcopy-blocktime: off late-block-activate: off
> > Migration status: completed
> > total time: 1051 milliseconds
> > downtime: 260 milliseconds
> > setup: 17 milliseconds
> > transferred ram: 8270 kbytes
>
> What is your setup?  This value looks really small.
> I can see that you

I have applied this patchset to upstream qemu to test multifd migration,

qemu commandline is as below,

/home/bala/qemu/ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic \
-vga none -machine pseries -m 4G,slots=32,maxmem=32G -smp 16,maxcpus=32 \
-device virtio-blk-pci,drive=rootdisk -drive file=/home/bala/hostos-ppc64le.qcow2,\
if=none,cache=none,format=qcow2,id=rootdisk -monitor telnet:127.0.0.1:1234,\
server,nowait -net nic,model=virtio -net user -redir tcp:2000::22

> have 4GB of RAM, it should be a bit higher.  And setup time is also
> quite low from my experience.

Sure, I will try with 32G mem. I am not aware of the setup time value.

>
> > throughput: 143.91 mbps
>
> I don't know what networking you are using, but my experience is that
> increasing packet_count to 64 or so helps a lot to increase bandwidth.

How do I configure packet_count to 64?

>
> What is your networking, page_count and number of channels?

I tried local host migration but need to work on multihost migration.
page_count and number of channels are default values,

x-multifd-channels: 2
x-multifd-page-count: 16

>
> > remaining ram: 0 kbytes
> > total ram: 4194560 kbytes
> > duplicate: 940989 pages
> > skipped: 0 pages
> > normal: 109635 pages
> > normal bytes: 438540 kbytes
> > dirty sync count: 3
> > page size: 4 kbytes
> >
> >
> > But when I just enable the multifd in source but not in target
> >
> > source:
> > x-multifd: on
> >
> > target:
> > x-multifd: off
> >
> > when migration is triggered with,
> > migrate -d tcp:127.0.0.1:4444 (port I used)
> >
> > The VM is lost in source with Segmentation fault.
> >
> > I think the correct way is to enable multifd on both source and target
> > similar to postcopy, but in this negative scenario we should consider
> > the right way of handling not to lose the VM instead error out
> > appropriately.
>
> It is necessary to enable both sides.
> And it "used" to be that it
> detected correctly when it was not enabled on one of the sides.  The
> check must have been lost in some rebase, or some other change.
>
> Will take a look.

Thank you.

-- Bala

>
> > Please correct me if I miss something.
>
> Sure, thanks for the report.
>
> Later, Juan.
>
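
For reference, the figures quoted from `info migrate` in this thread can be cross-checked with a few lines of Python. This is only a sketch: the numbers are copied from the output above, the dictionary keys and helper names are made up for illustration, and it assumes that "normal bytes" is simply the normal page count multiplied by the page size (the numbers are consistent with that, but it has not been checked against the QEMU source).

```python
# Figures copied from the `info migrate` output quoted in this thread.
# The key names are illustrative only, not QEMU field names.
stats = {
    "transferred_ram_kbytes": 8270,
    "total_ram_kbytes": 4194560,
    "normal_pages": 109635,
    "normal_bytes_kbytes": 438540,
    "page_size_kbytes": 4,
}


def normal_kbytes(s):
    """Size of the 'normal' pages, assuming it is page count times page size."""
    return s["normal_pages"] * s["page_size_kbytes"]


def guest_pages(s):
    """Number of guest pages implied by total ram and page size."""
    return s["total_ram_kbytes"] // s["page_size_kbytes"]


if __name__ == "__main__":
    derived = normal_kbytes(stats)
    # Does the derived size agree with the reported 'normal bytes' value?
    print("normal bytes consistent:", derived == stats["normal_bytes_kbytes"])
    # Is 'transferred ram' smaller than the normal pages alone?
    print("transferred < normal bytes:", stats["transferred_ram_kbytes"] < derived)
    print("guest pages:", guest_pages(stats))
```

With these inputs the reported "transferred ram" is far smaller than the size of the normal pages alone, which fits the remark that the value looks really small for a 4GB guest. The sketch does not attempt to explain why.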