* FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Florian Weimer @ 2009-09-06 13:39 UTC
To: linux-fsdevel

The FIBMAP ioctl requires CAP_SYS_RAWIO, but FIEMAP doesn't.  Why's
that?  Is it that there is no backwards-compatible way to introduce
locking on the bmap path?

(Sorting file access based on the first block number of the file
really improves performance, even compared to inode number sorting,
that's why I'm asking.)

* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Andreas Dilger @ 2009-09-07 10:50 UTC
To: Florian Weimer; +Cc: linux-fsdevel

On Sep 06, 2009  13:39 +0000, Florian Weimer wrote:
> The FIBMAP ioctl requires CAP_SYS_RAWIO, but FIEMAP doesn't.  Why's
> that?  Is it that there is no backwards-compatible way to introduce
> locking on the bmap path?

I'm not sure why there is a root-only requirement for FIBMAP, but the
FIEMAP data is definitely useful even for non-root users for many
reasons, such as optimized file copies/rsync/tar/etc skipping holes
in sparse files easily.

If you are implementing a tool to use this, I would code it to try
FIEMAP first, then FIBMAP (if it is running as root, or it gets
fixed in some future kernel), then just do without (as it most likely
does already today).

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

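The "try one method, then the next, then do without" order suggested above can be sketched roughly as follows. This is only an illustration, assuming Linux and Python's standard fcntl module. The names fibmap_first_block and first_block are hypothetical, and the FIBMAP request number is the value of _IO(0x00, 1) from <linux/fs.h>. Each probe is a callable that takes an open file descriptor and returns a block number or None.

```python
import fcntl
import os
import struct

FIBMAP = 1  # _IO(0x00, 1) in <linux/fs.h>


def fibmap_first_block(fd):
    """Probe: physical block number of logical block 0, or None."""
    arg = struct.pack("i", 0)  # in: logical block index, out: physical block
    try:
        out = fcntl.ioctl(fd, FIBMAP, arg)
    except OSError:
        # Typically EPERM without CAP_SYS_RAWIO, or an error from a
        # filesystem that has no bmap support.
        return None
    block = struct.unpack("i", out)[0]
    return block if block > 0 else None  # 0 means a hole / not mapped


def first_block(path, probes):
    """Try each probe in order; None means "just do without"."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None
    try:
        for probe in probes:
            block = probe(fd)
            if block is not None:
                return block
        return None
    finally:
        os.close(fd)
```

A caller would pass the probes in preference order, for example a FIEMAP-based probe first and fibmap_first_block second. FIEMAP reports byte offsets while FIBMAP reports block numbers, so the probes should agree on one unit before their results are compared.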
* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Theodore Tso @ 2009-09-07 13:28 UTC
To: Andreas Dilger; +Cc: Florian Weimer, linux-fsdevel

On Mon, Sep 07, 2009 at 12:50:14PM +0200, Andreas Dilger wrote:
> On Sep 06, 2009  13:39 +0000, Florian Weimer wrote:
> > The FIBMAP ioctl requires CAP_SYS_RAWIO, but FIEMAP doesn't.  Why's
> > that?  Is it that there is no backwards-compatible way to introduce
> > locking on the bmap path?
>
> I'm not sure why there is a root-only requirement for FIBMAP

Historical reasons, I suspect --- back then only LILO needed it, and
whoever added it decided it was safer only to allow root to have
access to it.  I don't think there's any good justification for FIBMAP
to require privileges.

- Ted

* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Florian Weimer @ 2009-09-07 16:26 UTC
To: Andreas Dilger; +Cc: linux-fsdevel

* Andreas Dilger:

> On Sep 06, 2009  13:39 +0000, Florian Weimer wrote:
>> The FIBMAP ioctl requires CAP_SYS_RAWIO, but FIEMAP doesn't.  Why's
>> that?  Is it that there is no backwards-compatible way to introduce
>> locking on the bmap path?
>
> I'm not sure why there is a root-only requirement for FIBMAP, but the
> FIEMAP data is definitely useful even for non-root users for many
> reasons, such as optimized file copies/rsync/tar/etc skipping holes
> in sparse files easily.

I'm slightly worried because the generic FIEMAP-on-FIBMAP
implementation takes the inode mutex, but the FIBMAP ioctl doesn't.

> If you are implementing a tool to use this, I would code it to try
> FIEMAP first, then FIBMAP (if it is running as root, or it gets
> fixed in some future kernel), then just do without (as it most likely
> does already today).

If FIBMAP is unsafe, it's likely exposed by concurrent changes to the
file, so using it would still be unsafe for backup purposes.  And I
really only need the number of the first block.  (I want to optimize
reading of Maildir-style folders, mainly for backup purposes.)

* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Andreas Dilger @ 2009-09-08 7:23 UTC
To: Florian Weimer; +Cc: linux-fsdevel

On Sep 07, 2009  16:26 +0000, Florian Weimer wrote:
> * Andreas Dilger:
> > If you are implementing a tool to use this, I would code it to try
> > FIEMAP first, then FIBMAP (if it is running as root, or it gets
> > fixed in some future kernel), then just do without (as it most likely
> > does already today).
>
> If FIBMAP is unsafe, it's likely exposed by concurrent changes to the
> file, so using it would still be unsafe for backup purposes.  And I
> really only need the number of the first block.  (I want to optimize
> reading of Maildir-style folders, mainly for backup purposes.)

Given that the worst that can happen for your particular application
if FIBMAP gets the wrong block number is a sub-optimal ordering for
the file copy, there is no risk in doing this.

For the FIEMAP code, since you only need the first block number, just
pass it a single fiemap_extent so that it doesn't spend time generating
a full list of extents that you don't need to use.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

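A minimal sketch of the single-extent FIEMAP request described above, assuming Linux and Python's standard fcntl module. The request number and struct layouts are transcribed from <linux/fs.h> and <linux/fiemap.h>. The names fiemap_first_extent and read_order are hypothetical. Files whose location cannot be determined are simply sorted last, which matches the point that a wrong or missing answer only costs a sub-optimal ordering.

```python
import fcntl
import os
import struct

FS_IOC_FIEMAP = 0xC020660B            # _IOWR('f', 11, struct fiemap)
FIEMAP_MAX_OFFSET = 0xFFFFFFFFFFFFFFFF
FIEMAP_EXTENT_UNKNOWN = 0x00000002    # physical location not known yet

# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved (32 bytes)
FIEMAP_HDR = struct.Struct("=QQIIII")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# fe_reserved64[2], fe_flags, fe_reserved[3] (56 bytes)
FIEMAP_EXTENT = struct.Struct("=QQQQQIIII")


def fiemap_first_extent(fd):
    """Physical byte offset of the file's first extent, or None."""
    # fm_extent_count = 1: room for exactly one extent in the reply.
    hdr = FIEMAP_HDR.pack(0, FIEMAP_MAX_OFFSET, 0, 0, 1, 0)
    buf = bytearray(hdr + bytes(FIEMAP_EXTENT.size))
    try:
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)  # kernel fills buf in place
    except OSError:
        return None  # FIEMAP not supported here
    if FIEMAP_HDR.unpack_from(buf)[3] < 1:   # fm_mapped_extents
        return None  # empty file or nothing mapped
    ext = FIEMAP_EXTENT.unpack_from(buf, FIEMAP_HDR.size)
    physical, flags = ext[1], ext[5]
    if flags & FIEMAP_EXTENT_UNKNOWN:
        return None  # e.g. delayed allocation, no location yet
    return physical


def read_order(paths):
    """Sort paths by first-extent offset; unknown locations go last."""
    def key(path):
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            return (1, 0)
        try:
            phys = fiemap_first_extent(fd)
        finally:
            os.close(fd)
        return (1, 0) if phys is None else (0, phys)
    return sorted(paths, key=key)
```

Note that FIEMAP reports byte offsets rather than block numbers; for ordering reads within one filesystem that difference does not matter.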
* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Jamie Lokier @ 2009-09-08 8:47 UTC
To: Andreas Dilger; +Cc: Florian Weimer, linux-fsdevel

Andreas Dilger wrote:
> On Sep 07, 2009  16:26 +0000, Florian Weimer wrote:
> > * Andreas Dilger:
> > > If you are implementing a tool to use this, I would code it to try
> > > FIEMAP first, then FIBMAP (if it is running as root, or it gets
> > > fixed in some future kernel), then just do without (as it most likely
> > > does already today).
> >
> > If FIBMAP is unsafe, it's likely exposed by concurrent changes to the
> > file, so using it would still be unsafe for backup purposes.  And I
> > really only need the number of the first block.  (I want to optimize
> > reading of Maildir-style folders, mainly for backup purposes.)
>
> Given that the worst that can happen for your particular application
> if FIBMAP gets the wrong block number is a sub-optimal ordering for
> the file copy, there is no risk in doing this.
>
> For the FIEMAP code, since you only need the first block number, just
> pass it a single fiemap_extent so that it doesn't spend time generating
> a full list of extents that you don't need to use.

With FIEMAP, filesystems which don't use extents will still scan a
potentially large region of the disk doing block lookups won't they,
just to maximise the size of the first extent in a large file?

Assuming yes, FIBMAP would be faster than FIEMAP in those cases, which
may be a reason to give it the same permissions, so it can be used.

-- Jamie

* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Andreas Dilger @ 2009-09-09 16:13 UTC
To: Jamie Lokier; +Cc: Florian Weimer, linux-fsdevel

On Sep 08, 2009  09:47 +0100, Jamie Lokier wrote:
> Andreas Dilger wrote:
> > On Sep 07, 2009  16:26 +0000, Florian Weimer wrote:
> > > * Andreas Dilger:
> > > > If you are implementing a tool to use this, I would code it to try
> > > > FIEMAP first, then FIBMAP (if it is running as root, or it gets
> > > > fixed in some future kernel), then just do without (as it most likely
> > > > does already today).
> > >
> > > If FIBMAP is unsafe, it's likely exposed by concurrent changes to the
> > > file, so using it would still be unsafe for backup purposes.  And I
> > > really only need the number of the first block.  (I want to optimize
> > > reading of Maildir-style folders, mainly for backup purposes.)
> >
> > Given that the worst that can happen for your particular application
> > if FIBMAP gets the wrong block number is a sub-optimal ordering for
> > the file copy, there is no risk in doing this.
> >
> > For the FIEMAP code, since you only need the first block number, just
> > pass it a single fiemap_extent so that it doesn't spend time generating
> > a full list of extents that you don't need to use.
>
> With FIEMAP, filesystems which don't use extents will still scan a
> potentially large region of the disk doing block lookups won't they,
> just to maximise the size of the first extent in a large file?

Partly true - if the caller only passes in space for a single extent
to be returned then the traversal would stop as soon as a discontinuity
is found.  For ext2/ext3 this would happen at 12 blocks into the file,
because the first indirect block is allocated after the 12th data
block, so the amount of scanning would be very small.

> Assuming yes, FIBMAP would be faster than FIEMAP in those cases, which
> may be a reason to give it the same permissions, so it can be used.

By all means, I don't object to FIBMAP being fixed to allow non-root
access, but until that happens (and then users start using those kernels)
it won't work for non-root users.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

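The "stop at the first discontinuity" behaviour can be illustrated with a toy model. This is not the kernel's code, only a sketch of building the first extent from per-block lookups. The layout in toy_bmap is a made-up ext2/ext3-like file: 12 direct data blocks at physical blocks 1000 to 1011, the indirect block at 1012, and data resuming at 1013. All names and numbers here are hypothetical.

```python
def first_extent(bmap, max_blocks):
    """Return (physical_start, length_in_blocks) for the extent that
    begins at logical block 0, using only per-block lookups."""
    start = bmap(0)
    if not start:
        return None  # block 0 is a hole
    length = 1
    # Extend the extent while the next logical block is physically
    # adjacent; the first mismatch ends the scan.
    while length < max_blocks and bmap(length) == start + length:
        length += 1
    return (start, length)


def toy_bmap(logical):
    """Made-up layout: an indirect block sits after 12 data blocks."""
    if logical < 12:
        return 1000 + logical
    return 1001 + logical  # shifted by one past the indirect block


print(first_extent(toy_bmap, 1_000_000))  # prints (1000, 12)
```

In this model the scan performs 13 lookups (logical blocks 0 through 12) no matter how large the file is, which is the "very small" amount of scanning described above.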