From: Anthony Liguori
Date: Mon, 15 Jun 2009 15:33:41 -0500
To: "qemu-devel@nongnu.org", kvm-devel
Subject: [Qemu-devel] Live migration broken when under heavy IO
Message-ID: <4A36B025.2080602@us.ibm.com>

The basic issue is that:

migrate_fd_put_ready():
    bdrv_flush_all();

does:

block.c:
    foreach block driver:
        drv->flush(bs);

which, in the case of raw, is just fsync(s->fd).  Any request that has
already been submitted is neither queued nor flushed, which leads to the
request being dropped after the live migration.

Is anyone working on fixing this?  Does anyone have a clever idea for how
to fix this without just waiting for all IO requests to complete?

--
Regards,

Anthony Liguori
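
For illustration, a minimal sketch of the straightforward fix the last
question hopes to avoid: drain all outstanding AIO before the final flush.
It assumes the QEMU helpers of this era, in particular that qemu_aio_flush()
means "block until every submitted AIO request has completed"; the header
names and the wrapper function are illustrative, not taken from the mail
above.

/* Sketch only: wait for every in-flight AIO request before the final
 * flush, so nothing submitted by the device models is lost across the
 * migration handover.  Declarations are assumed to come from the QEMU
 * tree (block.h / qemu-aio.h in trees of this period). */
#include "qemu-common.h"
#include "block.h"

static void migrate_flush_block_state(void)
{
    /* Block until all previously submitted AIO requests have completed. */
    qemu_aio_flush();

    /* Only now does the per-driver flush (fsync() for raw) cover all
     * data the guest believes it has written. */
    bdrv_flush_all();
}

This is exactly the "just wait for all IO requests to complete" approach;
anything cleverer would presumably have to track the in-flight requests and
re-issue or hand them over on the destination, which is a much more
invasive change to the block layer.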