From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andreas Dilger
Subject: Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
Date: Wed, 09 Sep 2009 18:13:03 +0200
Message-ID: <20090909161303.GC32450@webber.adilger.int>
References: <87y6osry4k.fsf@mid.deneb.enyo.de> <20090907105014.GV4197@webber.adilger.int> <87y6oqd8lt.fsf@mid.deneb.enyo.de> <20090908072349.GC4197@webber.adilger.int> <20090908084720.GB31909@shareable.org>
In-reply-to: <20090908084720.GB31909@shareable.org>
To: Jamie Lokier
Cc: Florian Weimer, linux-fsdevel@vger.kernel.org

On Sep 08, 2009 09:47 +0100, Jamie Lokier wrote:
> Andreas Dilger wrote:
> > On Sep 07, 2009 16:26 +0000, Florian Weimer wrote:
> > > * Andreas Dilger:
> > > > If you are implementing a tool to use this, I would code it to try
> > > > FIEMAP first, then FIBMAP (if it is running as root, or it gets
> > > > fixed in some future kernel), then just do without (as it most likely
> > > > does already today).
> > >
> > > If FIBMAP is unsafe, it's likely exposed by concurrent changes to the
> > > file, so using it would still be unsafe for backup purposes.  And I
> > > really only need the number of the first block.  (I want to optimize
> > > reading of Maildir-style folders, mainly for backup purposes.)
> >
> > Given that the worst that can happen for your particular application
> > if FIBMAP gets the wrong block number is a sub-optimal ordering for
> > the file copy, there is no risk in doing this.
> >
> > For the FIEMAP code, since you only need the first block number, just
> > pass it a single fiemap_extent so that it doesn't spend time generating
> > a full list of extents that you don't need to use.
>
> With FIEMAP, filesystems which don't use extents will still scan a
> potentially large region of the disk doing block lookups won't they,
> just to maximise the size of the first extent in a large file?

Partly true - if the caller only passes in space for a single extent to
be returned, then the traversal stops as soon as a discontinuity is
found.  For ext2/ext3 this happens 12 blocks into the file, because the
first indirect block is allocated after the 12th data block, so the
amount of scanning is very small.

> Assuming yes, FIBMAP would be faster than FIEMAP in those cases, which
> may be a reason to give it the same permissions, so it can be used.

By all means, I don't object to FIBMAP being fixed to allow non-root
access, but until that happens (and users then start running those
kernels) it won't work for non-root users.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
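
For illustration, the "FIEMAP first, then FIBMAP, then do without" order
discussed in this thread could be sketched roughly as below.  This is a
minimal sketch and not code from the thread.  The ioctl request numbers
are hard-coded under the assumption of the x86/ARM encoding (other
architectures may encode FS_IOC_FIEMAP differently), and the helper
names fiemap_first_physical, fibmap_first_physical, first_block_hint and
backup_order are hypothetical.  Only one fiemap_extent is supplied, so
the kernel has no reason to map more than the first extent.

```python
import fcntl
import os
import struct

# Request numbers as encoded on x86/ARM Linux (an assumption of this sketch).
FS_IOC_FIEMAP = 0xC020660B   # _IOWR('f', 11, struct fiemap)
FIBMAP = 1                   # _IO(0x00, 1), needs CAP_SYS_RAWIO
FIGETBSZ = 2                 # _IO(0x00, 2), filesystem block size
FIEMAP_EXTENT_UNKNOWN = 0x2  # physical location not known (e.g. delalloc)

# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved
FIEMAP_HDR = struct.Struct("=QQLLLL")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# fe_reserved64[2], fe_flags, fe_reserved[3]
FIEMAP_EXTENT = struct.Struct("=QQQQQLLLL")


def fiemap_first_physical(fd):
    """Physical byte offset of the first extent, or None if unknown."""
    buf = bytearray(FIEMAP_HDR.size + FIEMAP_EXTENT.size)
    # Map the whole file, but leave room for exactly one extent.
    FIEMAP_HDR.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF, 0, 0, 1, 0)
    fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)
    if FIEMAP_HDR.unpack_from(buf, 0)[3] == 0:  # fm_mapped_extents
        return None
    fields = FIEMAP_EXTENT.unpack_from(buf, FIEMAP_HDR.size)
    fe_physical, fe_flags = fields[1], fields[5]
    if fe_flags & FIEMAP_EXTENT_UNKNOWN:
        return None
    return fe_physical


def fibmap_first_physical(fd):
    """Same hint via FIBMAP on logical block 0, converted to bytes."""
    blk = bytearray(struct.pack("i", 0))
    fcntl.ioctl(fd, FIBMAP, blk)  # fails with EPERM for non-root callers
    block = struct.unpack("i", blk)[0]
    if block <= 0:                # 0 means a hole or unmapped block
        return None
    bsz = bytearray(struct.pack("i", 0))
    fcntl.ioctl(fd, FIGETBSZ, bsz)
    return block * struct.unpack("i", bsz)[0]


def first_block_hint(path):
    """Try FIEMAP, then FIBMAP, then give up and return None."""
    fd = os.open(path, os.O_RDONLY)
    try:
        for probe in (fiemap_first_physical, fibmap_first_physical):
            try:
                return probe(fd)
            except OSError:
                continue          # unsupported or not permitted
        return None
    finally:
        os.close(fd)


def backup_order(paths):
    """Sort paths by on-disk position; files without a hint go last."""
    def key(path):
        hint = first_block_hint(path)
        return (hint is None, hint or 0, path)
    return sorted(paths, key=key)
```

A backup tool might call backup_order(os.listdir(...)-style path lists)
once per Maildir directory and read the files in the returned order.  A
wrong or missing hint only costs a less optimal read order, which is why
every failure is treated as "no hint" rather than as an error.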