Date: Thu, 27 Aug 2009 10:00:51 +0100
From: Jamie Lokier
To: Christoph Hellwig
Cc: rusty@rustcorp.com.au, qemu-devel@nongnu.org, kvm@vger.kernel.org, Javier Guerra
Subject: Re: [Qemu-devel] Re: Notes on block I/O data integrity
Message-ID: <20090827090051.GD22631@shareable.org>
In-Reply-To: <20090826221722.GA1962@lst.de>
References: <20090825181120.GA4863@lst.de> <90eb1dc70908251233m4b90ddfuabb4d26bccd62c63@mail.gmail.com> <20090825193621.GA19778@lst.de> <20090826185755.GF25726@shareable.org> <20090826221722.GA1962@lst.de>

Christoph Hellwig wrote:
> On Wed, Aug 26, 2009 at 07:57:55PM +0100, Jamie Lokier wrote:
> > Christoph Hellwig wrote:
> > > > what about LVM? iv'e read somewhere that it used to just eat barriers
> > > > used by XFS, making it less safe than simple partitions.
> > >
> > > Oh, any additional layers open another by cans of worms. On Linux until
> > > very recently using LVM or software raid means only disabled
> > > write caches are safe.
> >
> > I believe that's still true except if there's more than one backing
> > drive, so software RAID still isn't safe.  Did that change?
>
> Yes, it did change.
> I will recommend to keep doing what people caring for their data
> have been doing since these volatile write caches came up: turn them
> off.

Unfortunately I tried that on a batch of 1000 or so embedded thingies
with ext3, and the write performance plummeted.  They are the same
thingies where I observed a lack of barriers resulting in filesystem
corruption after power failure.

We really need barriers with ATA disks to get decent write
performance.

It's a good recommendation, though.

> That being said with the amount of bugs in filesystems related to
> write barriers my expectation for the RAID and device mapper code is
> not too high.

Turning off the volatile write cache does not provide commit integrity
with RAID.  RAID needs barriers to plug, drain and unplug the queues
across all backing devices in a coordinated manner, quite apart from
the volatile write cache.  And then there's still that pesky problem
of writes which reach one disk but not its parity disk.

Unfortunately, turning off the volatile write caches could actually
make the timing window for failure worse in the case of a system
crash without power failure.

--
Jamie
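
For context, the ordering at stake in this thread is the classic
journal-commit pattern: write the data blocks, make them durable, then
write the commit record and make that durable too.  Below is a minimal
userspace sketch of that pattern, assuming a Linux/POSIX host; the file
name and payloads are hypothetical.  Without working barrier/flush
support through every layer (LVM, software RAID, the drive's volatile
write cache), the fdatasync() calls can return before the blocks have
actually reached stable storage, which is exactly the failure window
discussed above.

```c
/* Sketch of the write-then-commit ordering that barriers/flushes are
 * meant to preserve.  Assumes Linux/POSIX; "journal.img" and the
 * payloads are hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    const char data[]   = "payload";
    const char commit[] = "COMMIT";

    int fd = open("journal.img", O_WRONLY | O_CREAT, 0600);
    if (fd < 0) { perror("open"); return EXIT_FAILURE; }

    /* Step 1: write the data blocks. */
    if (write(fd, data, sizeof data) != (ssize_t)sizeof data) {
        perror("write data"); return EXIT_FAILURE;
    }

    /* Step 2: make the data durable before the commit record goes out.
     * If the stack below silently drops barriers/flushes, this ordering
     * guarantee is what gets lost. */
    if (fdatasync(fd) != 0) { perror("fdatasync"); return EXIT_FAILURE; }

    /* Step 3: write the commit record, then make it durable as well. */
    if (write(fd, commit, sizeof commit) != (ssize_t)sizeof commit) {
        perror("write commit"); return EXIT_FAILURE;
    }
    if (fdatasync(fd) != 0) { perror("fdatasync"); return EXIT_FAILURE; }

    close(fd);
    return EXIT_SUCCESS;
}
```

Disabling the drive's volatile write cache, as Christoph recommends,
makes the flush step unnecessary at the cost of the write-performance
hit described in the reply; with the cache left on, the pattern is only
safe if every layer in the stack passes barriers or cache flushes
through to the disk.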