From: Jamie Lokier
Subject: Re: ordered I/O with multipath
Date: Wed, 8 Apr 2009 15:30:54 +0100
Message-ID: <20090408143054.GB3841@shareable.org>
To: 谢纲
Cc: linux-fsdevel@vger.kernel.org

谢纲 wrote:
> Some journal filesystem use barrier i/o to ensure the order of the
> committing data. But if the filesystem is on the top of volume manager
> which support the raid and multipath. The barrier i/o might not be
> handled correctly. How does journal filesystem deal with this?

For software RAID and multipath, I think it isn't handled at all.

Even if you disable write-caching in the underlying storage, ordered
requests may not retain their order, so the common database advice to
disable write-cache and use SCSI or SATA-NCQ may not work either.

Even if the RAID code were changed to handle barriers, RAID-5 would
still be open to "scattershot" corruption, because an interrupted
write of a single sector on the logical device affects more than one
visible sector. In other words, the "radius of corruption" is bigger
than one sector for RAID-5, and it's not contiguous either.

In principle, journalling filesystems need to know the "radius of
corruption" to provide robust journalling. If individual sector
writes are atomic, this isn't an issue. Some people think sector
writes are atomic on modern hard drives (but I wouldn't count on it).
But a write is definitely not atomic on RAID or multipath if it
affects more than one device.
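The RAID-5 point can be illustrated with a toy model. This is only a
sketch under stated assumptions, not how any real RAID implementation
is written: the stripe has three hypothetical data disks plus XOR
parity, each "sector" is four bytes, and the damage is made visible by
assuming a second event, namely that another disk in the stripe later
fails and is rebuilt from the stale parity. The names xor_blocks and
reconstruct are made up for the example.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of equal-length blocks (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(stripe, parity, lost):
    """Rebuild the block at index `lost` from the surviving blocks and parity."""
    survivors = [block for i, block in enumerate(stripe) if i != lost]
    return xor_blocks(survivors + [parity])

# Hypothetical stripe: three data disks, one tiny "sector" on each.
stripe = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(stripe)

# Consistent state: any single lost block can be rebuilt correctly.
assert reconstruct(stripe, parity, 1) == b"BBBB"

# Interrupted write (assumed scenario): the new data for disk 0 reaches
# the disk, but the matching parity update never happens.
stripe[0] = b"aaaa"
# `parity` is deliberately left stale here.

# Later, disk 1 is lost. Its sector was never written by the filesystem,
# yet rebuilding it from the stale parity yields different contents.
rebuilt = reconstruct(stripe, parity, 1)
print(rebuilt == b"BBBB")
```

The sector on disk 1 is not adjacent to the one being written on the
logical device, which is why the damage is described as neither
single-sector nor contiguous.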
-- Jamie