From: "Thomas Steffen"
Date: Mon, 7 Aug 2006 20:13:57 +0200
Subject: Re: [Qemu-devel] Ensuring data is written to disk
To: qemu-devel@nongnu.org
In-Reply-To: <44D73C0C.5010205@armiento.net>

On 8/7/06, R. Armiento wrote:
> Let's assume this typical website setup: HARDWARE: commodity SATA/PATA;
> drive cache is not battery backed up. HOST OS: late Linux 2.6 kernel
> (e.g. 2.6.15), directly, on top of host, a recent version of database
> software (e.g. MySQL 5.1). Running in ~ 'production'.
>
> Now, if I understand the foregoing discussion: the *only* way of running
> this setup with 'full' transactional guarantees on power loss, without
> having to change/patch the Linux kernel, is to turn off write-caching?
> And that severely decreases performance.

And some IDE disks do not let you switch off write-caching. So as far
as I know, you need SCSI for transactional guarantees. SATA might
work, but since so many things "should work" and then don't in SATA, I
would be very careful.

> To reiterate the foregoing discussion: fsync in ext3 only goes to the
> drive cache. ReiserFS v3, which is included in the kernel, does not
> guarantee data integrity on power loss.

I have heard this before. Basically, the OS can interpret the fsync
call as a request to flush all caches, or it can interpret it as a
write barrier. The latter gives much higher performance and guarantees
the consistency of the disk content, but it does not guarantee
consistency with the rest of the world. My impression was that Linux
only does the latter, but I did not find a lot of information on this.

> This is somewhat surprising to me, given claims of data integrity made
> by both ext3 and MySQL documentation.

I don't have any problems with that. Both MySQL and ext3 are
transaction safe if used on a correct disk (SCSI). But if your disk
does not handle sync correctly, then the resulting system cannot be
transaction safe.

> And then, on top of this, if one instead runs the database in a QEMU
> with a late Linux 2.6 kernel, one is just making data-loss more likely,
> right? So QEMU is in no way to blame for any of this.

If qemu works correctly: yes. It would be interesting to test that.

> However, this severely decreases performance. Also note: in MySQL the
> MyISAM table type still does not guarantee no data loss; you need InnoDB
> for that.

Correct, and MyISAM is much more popular, because it is faster.

Thomas
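
For reference, a minimal sketch of the application-side half of the
fsync discussion above, written in Python. The helper name
durable_append and the file name commit.log are made up for
illustration. os.fsync() only asks the kernel to push the file's data
to the device; as discussed in this thread, whether the data then
leaves the drive's write cache depends on the kernel, the filesystem
and the disk, so this alone does not prove durability on power loss.

```python
import os
import tempfile


def durable_append(path, payload):
    """Append payload to path, then ask the OS to flush it to the device.

    Hypothetical helper for illustration only. Returns the number of
    bytes appended.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        view = memoryview(payload)
        # os.write may write fewer bytes than requested, so loop until done.
        while view:
            written = os.write(fd, view)
            view = view[written:]
        # Request a flush of this file's data and metadata. Assumption:
        # this reaches the drive, but not necessarily past its write cache.
        os.fsync(fd)
    finally:
        os.close(fd)
    return len(payload)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "commit.log")
        durable_append(log, b"BEGIN;UPDATE;COMMIT\n")
        with open(log, "rb") as fh:
            print(fh.read())
```

A database would only report "committed" to the client after a call
like this returns, which is exactly where a drive that acknowledges
writes from its volatile cache breaks the guarantee.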