Message-ID: <4B1420B3.3030404@web.de>
Date: Mon, 30 Nov 2009 20:44:51 +0100
From: Jan Kiszka
Sender: jan.kiszka@web.de
Subject: Re: [Qemu-devel] [PATCH 00/23] block migration: Fixes, cleanups and speedups
References: <20091130172119.22889.28114.stgit@mchn012c.ww002.siemens.net> <4B141038.2030909@codemonkey.ws> <5D82F732-EB1C-46D7-B179-33CD69732F12@irisa.fr> <4B141C2E.6020502@web.de> <3E8317F1-756A-4986-BA69-FD836B5DAA52@irisa.fr>
In-Reply-To: <3E8317F1-756A-4986-BA69-FD836B5DAA52@irisa.fr>
List-Id: qemu-devel.nongnu.org
To: Pierre Riteau
Cc: Liran Schour, qemu-devel@nongnu.org

Pierre Riteau wrote:
> On 30 nov. 2009, at 20:25, Jan Kiszka wrote:
>
>> Pierre Riteau wrote:
>>> On 30 nov. 2009, at 19:34, Anthony Liguori wrote:
>>>
>>>> Jan Kiszka wrote:
>>>>> This series is a larger rework of the block migration support qemu
>>>>> recently gained. Besides lots of code refactorings the major changes
>>>>> are:
>>>>>  - Faster restore due to larger block sizes (even if the target disk is
>>>>>    unallocated)
>>>>>  - Off-by-one fixes in the block dirty tracking code
>>>>>  - Allow for multiple migrations (after cancellation or if migrating
>>>>>    into a backup image)
>>>>>  - Proper error handling
>>>>>  - Progress reporting fixes: report to monitor instead of stdout, report
>>>>>    sum of multiple disks
>>>>>  - Report disk migration progress via 'info migrate'
>>>>>  - Progress report during restore
>>>>>
>>>>> One patch is directly taken from Pierre Riteau's queue [1], who happened to
>>>>> work on the same topic the last days; two more are derived from his
>>>>> commits.
>>>>>
>>>>> These patches make block migration usable for us. Still, there are two
>>>>> more major improvements on my wish/todo list:
>>>>>  - Respect specified maximum migration downtime (will require tracking
>>>>>    of the number of dirty blocks + some coordination with ram migration)
>>>>>  - Do not transfer unallocated disk space (also for raw images, i.e. add
>>>>>    bdrv_is_allocated support for the latter)
>>>>>
>>>>> In an off-list chat, Liran additionally brought up the topic that RAM
>>>>> migration should not start too early so that we avoid re-transmitting
>>>>> dirty pages over and over again while the disk image is slowly beamed
>>>>> over.
>>>>>
>>>>> I hope we can join our efforts to resolve the open topics quickly, the
>>>>> critical ones ideally before the merge window closes.
>>>>>
>>>> That really needs to happen no later than the end of this week.
>>>>
>>>> So Pierre/Liran, what do you think about Jan's series?
>>>>
>>>> Regards,
>>>>
>>>> Anthony Liguori
>>>
>>> I'm currently testing these patches. Here are a few issues I noticed,
>>> before I forget about them.
>>>
>>> - "migrate -d -b tcp:dest:port" works, but "migrate -b -d tcp:dest:port"
>>> doesn't, although "help migrate" doesn't really specify ordering as
>>> important. But anyway I think Liran is working on a new version of the
>>> command.
>> Saw that too. I think the monitor commands simply do very primitive
>> option parsing so far. Should be addressed if the final format comes
>> with this issue as well.
>>
>>> - We use bdrv_aio_readv() to read blocks from the disk. This function
>>> increments rd_bytes and rd_ops, which are reported by "info blockstats".
>>> I don't think these read operations should appear in VM activity,
>>> especially if this interface is used by libvirt to report VM stats (and
>>> draw graphs in virt-manager, etc.). Same for write stats.
>> Ack.
>>
>>> - We may need to call bdrv_reset_dirty() _before_ sending the data, to
>>> be sure the block is not rewritten in the meantime (maybe it's an issue
>>> only with kvm?)
>> Can you elaborate? Even in case of multi-threaded qemu, the iomutex
>> should protect us here.
>
> I only said that because I remember seeing this kind of behavior, but
> with ram migration on kvm.
> As I'm not familiar with the I/O emulation in qemu, if you say that it's
> OK, no problem.

RAM is different as RAM access need not be synchronized across the vcpus
and the iothread.

>
> By multi-threaded, are you talking about the IO thread feature?

Yes (which also includes per vcpu threads).

>
>>> - I seem to remember that disk images with 0 size are now possible. I'm
>>> afraid we will hit a divide by zero in this case: "progress =
>>> completed_sector_sum * 100 / block_mig_state.total_sector_sum;"
>> Although I don't see their use, it should be handled gracefully, likely
>> by skipping such disks.
>
> From a patch by Stefan Weil a few weeks ago:
>
>> Images with disk size 0 may be used for
>> VM snapshots, but not to save normal block data.
>>
>> It is possible to create such images using
>> qemu-img, but opening them later fails.
>>
>> So even "qemu-img info image.qcow2" is not
>> possible for an image created with
>> "qemu-img create -f qcow2 image.qcow2 0".
>
> I'm not sure if that concerns us...
>

Good point. Then my add-on patch is definitely required.

Jan
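
For illustration only, here is a minimal sketch of how the progress expression quoted in the thread could be guarded against a zero total. It is not the add-on patch mentioned above. The struct `block_mig_state_sketch` and the function `blk_mig_progress_sketch` are hypothetical names invented for this example; only `total_sector_sum` and `completed_sector_sum` appear in the discussion. Reporting 100 percent when there is nothing to transfer is an assumption; skipping zero-size disks before they are added to the total, as suggested in the thread, would avoid the case altogether.

```c
#include <stdint.h>

/* Hypothetical stand-in for the migration state; only the field name
 * total_sector_sum is taken from the thread. */
struct block_mig_state_sketch {
    int64_t total_sector_sum;
};

/* Return migration progress in percent.
 * Assumption: when no sectors are scheduled for transfer (for example,
 * only zero-size images), report 100 instead of dividing by zero. */
static int64_t blk_mig_progress_sketch(const struct block_mig_state_sketch *s,
                                       int64_t completed_sector_sum)
{
    if (s->total_sector_sum <= 0) {
        return 100;
    }
    return completed_sector_sum * 100 / s->total_sector_sum;
}
```

With a total of 200 sectors and 50 completed, the function yields 25; with a total of 0 it yields 100 rather than trapping.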