From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:47713)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XJsjo-0004FK-Eq for qemu-devel@nongnu.org;
	Tue, 19 Aug 2014 19:22:52 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1XJsjk-0006ap-3y for qemu-devel@nongnu.org;
	Tue, 19 Aug 2014 19:22:48 -0400
Received: from xes-mad.com ([216.165.139.218]:9980)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XJsjj-0006Zs-U7 for qemu-devel@nongnu.org;
	Tue, 19 Aug 2014 19:22:44 -0400
Date: Tue, 19 Aug 2014 18:20:38 -0500 (CDT)
From: Andrew Martin
Message-ID: <838926932.102908.1408490438455.JavaMail.zimbra@xes-inc.com>
In-Reply-To: <20140819145925.GB13680@stefanha-thinkpad.redhat.com>
References: <1009168463.49610.1408133034828.JavaMail.zimbra@xes-inc.com>
	<985931631.51123.1408133895894.JavaMail.zimbra@xes-inc.com>
	<20140819145925.GB13680@stefanha-thinkpad.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Using cache=writeback safely on qemu 1.4.0 and later
To: Stefan Hajnoczi
Cc: qemu-devel@nongnu.org

----- Original Message -----
> From: "Stefan Hajnoczi"
> To: "Andrew Martin"
> Cc: qemu-devel@nongnu.org
> Sent: Tuesday, August 19, 2014 9:59:25 AM
> Subject: Re: [Qemu-devel] Using cache=writeback safely on qemu 1.4.0 and later
>
> If you strace -f the QEMU process on the host, you will see fdatasync(2)
> system calls when the guest flushes the disk.
>
> You can find the file descriptor number by checking ls -l
> /proc/$PID_OF_QEMU/fd and looking for the disk image file.

When the disk is set to cache=writethrough on one of the same VMs, I see
frequent fdatasync(2) calls (every few seconds).
However, when I change the disk over to cache=writeback, since boot I have not
yet seen a single fdatasync(2) call, even after writing data 2x the amount of
RAM:

# time strace -ft -p4113 2>&1 | grep fdatasync
^C
real	15m39.245s
user	0m7.940s
sys	0m18.280s

Note that the disk is defined as follows:

> > I recently experienced UPS failure on several hosts which caused a hard
> > shutdown. After restarting, 3 of the guests had corruption on their disks and
> > required a fairly long fsck to fix. Afterwards, data that had been written to
> > the disks several hours before the crash was corrupted, which makes me think
> > that it was never fsync()-ed to the non-volatile storage.
>
> What exactly was the "corruption" you encountered? Which application,
> error message, etc.

Two of the servers are web servers with apache2. In one case, a python daemon
copies JPGs onto the server - the last 100 copied onto the server were
corrupted. In another case, some files had been uploaded several days prior to
the www-root, but after the hard reset said files were no longer present in the
filesystem.

> > Is it safe in this setup to use cache=writeback? Or, should I use
> > cache=writethrough instead?
>
> Ubuntu 12.04 is recent and sends write cache flushes.
>
> Are you sure the file system and/or application workload are flushing
> the disk cache? Please check the mount options and application-specific
> configuration.

The mount options for the ext4 filesystem in the VM in both cases are:
rw,relatime,errors=remount-ro,data=ordered

Similarly, the host's ext4 filesystem holding the images is mounted with:
rw,relatime,data=ordered

I did not see any errors in the kernel log in the guest, probably because the
root filesystem was read-only until the fsck had completed.

Thanks,

Andrew
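
For reference, the strace check discussed above can be scripted so that flush
calls are tallied per file descriptor, which makes it easier to match them
against the disk image's descriptor in /proc/$PID_OF_QEMU/fd. This is only a
minimal sketch: the function name count_flushes is made up for illustration,
and it assumes the usual strace line layout of "name(fd, ...) = result". It
counts fsync as well as fdatasync, on the assumption that a flush could show up
as either call.

```python
import re
from collections import Counter

# Matches the start of a flush call such as "fdatasync(12" or "fsync(9".
# "<... fdatasync resumed>" lines have no "(" after the name, so a call that
# strace splits into unfinished/resumed halves is counted only once.
FLUSH_CALL = re.compile(r"\b(fdatasync|fsync)\((\d+)")


def count_flushes(lines):
    """Tally flush syscalls per (syscall name, fd) from strace output lines."""
    counts = Counter()
    for line in lines:
        match = FLUSH_CALL.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts
```

It could be fed from a saved trace, for example count_flushes(open("strace.log"))
where strace.log is a hypothetical file holding the output of
strace -f -t -p $PID_OF_QEMU. An empty result over a long write-heavy interval
would match what the grep above showed.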