From mboxrd@z Thu Jan 1 00:00:00 1970
From: Christoph Hellwig
Subject: Re: [RFC][PATCH 0/5] Fiemap, an extent mapping ioctl
Date: Thu, 29 May 2008 09:01:34 -0400
Message-ID: <20080529130134.GA21299@infradead.org>
References: <20080525000148.GJ8325@wotan.suse.de> <200805270948.51898.chris.mason@oracle.com> <483C3C4F.1090903@hp.com> <200805271319.29986.chris.mason@oracle.com> <20080528160931.GG7263@webber.adilger.int>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Chris Mason , jim owens , linux-fsdevel@vger.kernel.org, Christoph Hellwig , Mark Fasheh , Andreas Dilger , Kalpak Shah , Eric Sandeen , Josef Bacik
To: Andreas Dilger
Return-path:
Received: from bombadil.infradead.org ([18.85.46.34]:57870 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755822AbYE2NBm (ORCPT ); Thu, 29 May 2008 09:01:42 -0400
Content-Disposition: inline
In-Reply-To: <20080528160931.GG7263@webber.adilger.int>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

On Wed, May 28, 2008 at 10:09:31AM -0600, Andreas Dilger wrote:
> ... but I don't think it should necessarily be _required_ to return a
> real "dev_t" (major, minor) device. For network filesystems this is
> meaningless. If it is possible for FIEMAP_EXTENT_NET to signal that the
> device is not a local/physical device (where a dev_t has no meaning),
> and simply allow an enumeration [0, 1, 2, ...] of the logical devices
> then I think this is reasonable. The mapping of logical devices to
> servers is available separately with a Lustre-specific ioctl.
>
> This passes more information for filesystems that have local devices
> while not breaking the functionality for network filesystems and could
> be used as an efficient replacement for lilo's use of FIBMAP.

A dev_t actually means something for the only in-tree users of this
interface, so there's no point making this interface worse for some
long-term out of tree code.
And it's not like you can't simply allow multiple anonymous block
devices for your networked filesystems, similar to the one already
used for st_dev.

> For RAID1/10 you can return multiple logical->physical extent mappings
> for the same logical range of the file with different "device" IDs. You
> could do the same for RAID5 returning each of the data and parity chunks
> with "NO_DIRECT" if desired (maybe only on the parity extent, or don't
> return the parity extent at all). The spec does not require that the
> returned extents be non-overlapping.

Umm, no. That would just make the interface too complicated. I bet you
that userspace programmers will generally only test their code with
simple filesystems, and all hell will break loose when they get these
multiple ranges. Especially as that's a very unnatural interface.
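
To make the concern about overlapping ranges concrete, here is a rough
sketch of the check a userspace consumer would need before it could
assume "one physical location per logical offset". Everything in it is
an assumption for illustration: the Extent tuple, its per-extent
"device" field, and find_overlaps() are hypothetical names and do not
reflect any actual FIEMAP structure layout.

```python
from typing import List, NamedTuple, Tuple


class Extent(NamedTuple):
    """Hypothetical userspace view of one returned extent (illustration only)."""
    logical: int   # byte offset within the file
    physical: int  # byte offset on the backing device
    length: int    # extent length in bytes, assumed > 0
    device: int    # hypothetical per-extent device ID, as proposed in the thread


def find_overlaps(extents: List[Extent]) -> List[Tuple[Extent, Extent]]:
    """Return every pair of extents whose logical ranges overlap."""
    ordered = sorted(extents, key=lambda e: (e.logical, e.device))
    overlaps: List[Tuple[Extent, Extent]] = []
    active: List[Extent] = []  # earlier extents that may still overlap later ones
    for ext in ordered:
        # Drop extents that end at or before the start of the current one.
        active = [a for a in active if a.logical + a.length > ext.logical]
        # Whatever remains starts no later than ext and ends after ext starts.
        overlaps.extend((a, ext) for a in active)
        active.append(ext)
    return overlaps


if __name__ == "__main__":
    # A "simple filesystem" layout: contiguous, non-overlapping extents.
    simple = [Extent(0, 4096, 8192, 0), Extent(8192, 65536, 4096, 0)]
    # A mirrored layout: same logical range reported once per device.
    mirrored = [Extent(0, 4096, 8192, 0), Extent(0, 4096, 8192, 1)]
    print(len(find_overlaps(simple)), len(find_overlaps(mirrored)))
```

Code that was only ever tested against the first kind of layout would
never exercise the overlap path, which is the failure mode described
above.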