From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43) id 1G6YPu-0002bP-Mz for qemu-devel@nongnu.org; Fri, 28 Jul 2006 15:54:38 -0400
Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43) id 1G6YPu-0002ad-3p for qemu-devel@nongnu.org; Fri, 28 Jul 2006 15:54:38 -0400
Received: from [199.232.76.173] (helo=monty-python.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1G6YPu-0002aa-1F for qemu-devel@nongnu.org; Fri, 28 Jul 2006 15:54:38 -0400
Received: from [66.187.233.31] (helo=mx1.redhat.com) by monty-python.gnu.org with esmtp (Exim 4.52) id 1G6YRx-0006TX-9B for qemu-devel@nongnu.org; Fri, 28 Jul 2006 15:56:45 -0400
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254]) by mx1.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id k6SJsbMn014191 for ; Fri, 28 Jul 2006 15:54:37 -0400
Received: from mail.boston.redhat.com (mail.boston.redhat.com [172.16.76.12]) by int-mx1.corp.redhat.com (8.12.11.20060308/8.12.11) with ESMTP id k6SJsVMf017293 for ; Fri, 28 Jul 2006 15:54:32 -0400
Received: from [10.0.13.9] (vpn-248-25.boston.redhat.com [10.13.248.25]) by mail.boston.redhat.com (8.12.8/8.12.8) with ESMTP id k6SJsUAB025431 for ; Fri, 28 Jul 2006 15:54:30 -0400
Message-ID: <44CA6B76.7000004@redhat.com>
Date: Fri, 28 Jul 2006 15:54:30 -0400
From: Rik van Riel
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------090109060208020607050809"
Subject: [Qemu-devel] [RFC][PATCH] make sure disk writes actually hit disk
Reply-To: qemu-devel@nongnu.org
List-Id: qemu-devel.nongnu.org
To: qemu-devel@nongnu.org

This is a multi-part message in MIME format.
--------------090109060208020607050809
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

This is the simple approach to making sure that disk writes
actually hit disk before we tell the guest OS that IO has
completed. Thanks to DMA_MULTI_THREAD the performance still
seems to be adequate.

A fancier solution would be to make the sync/non-sync behaviour
of the qemu disk backing store tunable from the guest OS, by
tuning the IDE disk write cache on/off with hdparm, and having
hw/ide.c call ->fsync functions in the block backends.

I'm willing to code up the fancy solution if people prefer that.

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan

--------------090109060208020607050809
Content-Type: text/x-patch; name="xen-hvm-osync.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="xen-hvm-osync.patch"

Make sure disk writes really made it to disk before we report I/O
completion to the guest domain. The DMA_MULTI_THREAD functionality
from the qemu-dm IDE emulation should make the performance overhead
of synchronous writes bearable, or at least comparable to native
hardware.

Signed-off-by: Rik van Riel

--- xen-unstable-10712/tools/ioemu/block-bochs.c.osync 2006-07-28 02:15:56.000000000 -0400
+++ xen-unstable-10712/tools/ioemu/block-bochs.c 2006-07-28 02:21:08.000000000 -0400
@@ -91,7 +91,7 @@
     int fd, i;
     struct bochs_header bochs;
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)
--- xen-unstable-10712/tools/ioemu/block.c.osync 2006-07-28 02:15:56.000000000 -0400
+++ xen-unstable-10712/tools/ioemu/block.c 2006-07-28 02:19:27.000000000 -0400
@@ -677,7 +677,7 @@
     int rv;
 #endif
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)
--- xen-unstable-10712/tools/ioemu/block-cloop.c.osync 2006-07-28 02:15:56.000000000 -0400
+++ xen-unstable-10712/tools/ioemu/block-cloop.c 2006-07-28 02:17:13.000000000 -0400
@@ -55,7 +55,7 @@
     BDRVCloopState *s = bs->opaque;
     uint32_t offsets_size,max_compressed_block_size=1,i;
 
-    s->fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
+    s->fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE | O_SYNC);
     if (s->fd < 0)
         return -1;
     bs->read_only = 1;
--- xen-unstable-10712/tools/ioemu/block-cow.c.osync 2006-07-28 02:15:56.000000000 -0400
+++ xen-unstable-10712/tools/ioemu/block-cow.c 2006-07-28 02:21:34.000000000 -0400
@@ -69,7 +69,7 @@
     struct cow_header_v2 cow_header;
     int64_t size;
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)
--- xen-unstable-10712/tools/ioemu/block-qcow.c.osync 2006-07-28 02:15:56.000000000 -0400
+++ xen-unstable-10712/tools/ioemu/block-qcow.c 2006-07-28 02:20:05.000000000 -0400
@@ -95,7 +95,7 @@
     int fd, len, i, shift;
     QCowHeader header;
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)
--- xen-unstable-10712/tools/ioemu/block-vmdk.c.osync 2006-07-28 02:15:56.000000000 -0400
+++ xen-unstable-10712/tools/ioemu/block-vmdk.c 2006-07-28 02:20:20.000000000 -0400
@@ -96,7 +96,7 @@
     uint32_t magic;
     int l1_size;
 
-    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE);
+    fd = open(filename, O_RDWR | O_BINARY | O_LARGEFILE | O_SYNC);
     if (fd < 0) {
         fd = open(filename, O_RDONLY | O_BINARY | O_LARGEFILE);
         if (fd < 0)

--------------090109060208020607050809--
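
The message contrasts two ways of making writes durable: opening the image with O_SYNC, as the patch does, and calling fsync from the block backend. A minimal sketch of that difference follows. The function names and file paths are hypothetical and do not come from the qemu or Xen trees, and O_BINARY and O_LARGEFILE are left out so the sketch builds on a plain POSIX system. Whether the data truly reaches the platter still depends on the host kernel and on the drive's own write cache.

```c
/* Sketch only: helper names are hypothetical, not taken from the patch. */
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted before any data moved */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Approach taken by the patch: O_SYNC at open time, so each write()
 * returns only after the kernel reports the data as written to storage. */
int write_with_osync(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_SYNC, 0600);
    if (fd < 0)
        return -1;
    int rc = write_all(fd, buf, len);
    if (close(fd) < 0)
        rc = -1;
    return rc;
}

/* The "fancier" alternative: ordinary buffered writes, followed by an
 * explicit fsync() at the point where completion must be durable. */
int write_then_fsync(const char *path, const char *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    int rc = write_all(fd, buf, len);
    if (rc == 0 && fsync(fd) < 0)
        rc = -1;
    if (close(fd) < 0)
        rc = -1;
    return rc;
}
```

With O_SYNC every write pays the synchronous cost, while the fsync variant lets the caller decide when to pay it, which is what would make the behaviour tunable from the guest.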