* FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Florian Weimer @ 2009-09-06 13:39 UTC (permalink / raw)
To: linux-fsdevel
The FIBMAP ioctl requires CAP_SYS_RAWIO, but FIEMAP doesn't. Why's
that? Is it that there is no backwards-compatible way to introduce
locking on the bmap path?
(Sorting file access based on the first block number of the file
really improves performance, even compared to inode number sorting,
that's why I'm asking.)
* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Andreas Dilger @ 2009-09-07 10:50 UTC (permalink / raw)
To: Florian Weimer; +Cc: linux-fsdevel
On Sep 06, 2009 13:39 +0000, Florian Weimer wrote:
> The FIBMAP ioctl requires CAP_SYS_RAWIO, but FIEMAP doesn't. Why's
> that? Is it that there is no backwards-compatible way to introduce
> locking on the bmap path?
I'm not sure why there is a root-only requirement for FIBMAP, but the
FIEMAP data is definitely useful even for non-root users for many
reasons, such as optimized file copies/rsync/tar/etc skipping holes
in sparse files easily.
If you are implementing a tool to use this, I would code it to try
FIEMAP first, then FIBMAP (if it is running as root, or it gets
fixed in some future kernel), then just do without (as it most likely
does already today).
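A minimal sketch of that fallback order, assuming Linux and Python's
standard fcntl module. The ioctl numbers and struct layouts come from
<linux/fs.h> and <linux/fiemap.h>; the helper names (first_block_offset,
_via_fiemap, _via_fibmap) are made up for illustration. FIEMAP reports a
byte offset while FIBMAP reports a block number, so the sketch scales the
FIBMAP result by the FIGETBSZ block size to keep the two comparable.

```python
import fcntl
import os
import struct

FS_IOC_FIEMAP = 0xC020660B  # _IOWR('f', 11, struct fiemap)
FIBMAP = 1                  # _IO(0x00, 1)
FIGETBSZ = 2                # _IO(0x00, 2)


def _via_fiemap(fd):
    # 32-byte struct fiemap header plus room for one 56-byte fiemap_extent.
    buf = bytearray(32 + 56)
    # fm_start=0, fm_length=max, fm_flags=0, fm_mapped_extents=0,
    # fm_extent_count=1, fm_reserved=0
    struct.pack_into("=QQIIII", buf, 0, 0, 2**64 - 1, 0, 0, 1, 0)
    fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)
    mapped = struct.unpack_from("=I", buf, 20)[0]    # fm_mapped_extents
    if mapped == 0:
        return None
    return struct.unpack_from("=Q", buf, 32 + 8)[0]  # fe_physical, in bytes


def _via_fibmap(fd):
    buf = bytearray(struct.pack("=i", 0))  # in: logical block 0
    fcntl.ioctl(fd, FIBMAP, buf)           # EPERM without CAP_SYS_RAWIO
    block = struct.unpack("=i", buf)[0]    # out: physical block number
    if block == 0:
        return None                        # hole or unmapped
    bsz = bytearray(4)
    fcntl.ioctl(fd, FIGETBSZ, bsz)
    return block * struct.unpack("=i", bsz)[0]


def first_block_offset(path):
    """Byte offset of the file's first block, or None if it cannot be found."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None
    try:
        for probe in (_via_fiemap, _via_fibmap):
            try:
                return probe(fd)
            except OSError:
                continue  # unsupported or not permitted: try the next probe
        return None       # neither ioctl worked: just do without
    finally:
        os.close(fd)
```

A caller that gets None back simply keeps whatever ordering it used before.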
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Theodore Tso @ 2009-09-07 13:28 UTC (permalink / raw)
To: Andreas Dilger; +Cc: Florian Weimer, linux-fsdevel
On Mon, Sep 07, 2009 at 12:50:14PM +0200, Andreas Dilger wrote:
> On Sep 06, 2009 13:39 +0000, Florian Weimer wrote:
> > The FIBMAP ioctl requires CAP_SYS_RAWIO, but FIEMAP doesn't. Why's
> > that? Is it that there is no backwards-compatible way to introduce
> > locking on the bmap path?
>
> I'm not sure why there is a root-only requirement for FIBMAP
Historical reasons, I suspect --- back then only LILO needed it, and
whoever added it decided it was safer only to allow root to have
access to it. I don't think there's any good justification for FIBMAP
to require privileges.
- Ted
* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Florian Weimer @ 2009-09-07 16:26 UTC (permalink / raw)
To: Andreas Dilger; +Cc: linux-fsdevel
* Andreas Dilger:
> On Sep 06, 2009 13:39 +0000, Florian Weimer wrote:
>> The FIBMAP ioctl requires CAP_SYS_RAWIO, but FIEMAP doesn't. Why's
>> that? Is it that there is no backwards-compatible way to introduce
>> locking on the bmap path?
>
> I'm not sure why there is a root-only requirement for FIBMAP, but the
> FIEMAP data is definitely useful even for non-root users for many
> reasons, such as optimized file copies/rsync/tar/etc skipping holes
> in sparse files easily.
I'm slightly worried because the generic FIEMAP-on-FIBMAP
implementation takes the inode mutex, but the FIBMAP ioctl doesn't.
> If you are implementing a tool to use this, I would code it to try
> FIEMAP first, then FIBMAP (if it is running as root, or it gets
> fixed in some future kernel), then just do without (as it most likely
> does already today).
If FIBMAP is unsafe, it's likely exposed by concurrent changes to the
file, so using it would still be unsafe for backup purposes. And I
really only need the number of the first block. (I want to optimize
reading of Maildir-style folders, mainly for backup purposes.)
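The ordering described here could be sketched as follows, assuming
Python. The names read_order and locate are hypothetical: locate stands
for any callable that returns a file's first-block position, or None when
it is unknown, for example one backed by FIEMAP or FIBMAP. Files without
a known position fall back to inode-number order.

```python
import os


def read_order(paths, locate=None):
    """Sort paths by first-block position, falling back to inode number."""
    def key(path):
        pos = locate(path) if locate else None
        if pos is not None:
            return (0, pos)                    # known on-disk position
        try:
            return (1, os.stat(path).st_ino)   # fallback: inode order
        except OSError:
            return (2, 0)                      # unreadable entries go last
    return sorted(paths, key=key)
```

With positions {"a": 300, "b": 100, "c": 200},
read_order(["a", "b", "c"], positions.get) returns ["b", "c", "a"]. A stale
or wrong position only makes the order less optimal; it does not affect
what is read.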
* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Andreas Dilger @ 2009-09-08 7:23 UTC (permalink / raw)
To: Florian Weimer; +Cc: linux-fsdevel
On Sep 07, 2009 16:26 +0000, Florian Weimer wrote:
> * Andreas Dilger:
> > If you are implementing a tool to use this, I would code it to try
> > FIEMAP first, then FIBMAP (if it is running as root, or it gets
> > fixed in some future kernel), then just do without (as it most likely
> > does already today).
>
> If FIBMAP is unsafe, it's likely exposed by concurrent changes to the
> file, so using it would still be unsafe for backup purposes. And I
> really only need the number of the first block. (I want to optimize
> reading of Maildir-style folders, mainly for backup purposes.)
Given that the worst that can happen for your particular application
if FIBMAP gets the wrong block number is a sub-optimal ordering for
the file copy, there is no risk in doing this.
For the FIEMAP code, since you only need the first block number, just
pass it a single fiemap_extent so that it doesn't spend time generating
a full list of extents that you don't need to use.
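As a rough illustration of the single-extent call, assuming Linux and
Python's fcntl module: the buffer holds the struct fiemap header followed
by exactly one struct fiemap_extent, and fm_extent_count is set to 1. The
constants follow <linux/fiemap.h>; the function name first_extent is made
up for this sketch.

```python
import fcntl
import os
import struct

FS_IOC_FIEMAP = 0xC020660B  # _IOWR('f', 11, struct fiemap)
# fm_start, fm_length, fm_flags, fm_mapped_extents, fm_extent_count, fm_reserved
FIEMAP_HEADER = struct.Struct("=QQIIII")
# fe_logical, fe_physical, fe_length, fe_reserved64[2], fe_flags, fe_reserved[3]
FIEMAP_EXTENT = struct.Struct("=QQQ2QI3I")
FIEMAP_EXTENT_LAST = 0x1    # set in fe_flags on the file's final extent


def first_extent(path):
    """Return (logical, physical, length, flags) of the first extent, or None.

    logical, physical and length are byte values, not block numbers.
    """
    buf = bytearray(FIEMAP_HEADER.size + FIEMAP_EXTENT.size)
    # Map the whole file, but leave room for only one extent in the reply.
    FIEMAP_HEADER.pack_into(buf, 0, 0, 2**64 - 1, 0, 0, 1, 0)
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None
    try:
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)  # kernel fills buf in place
    except OSError:
        return None                          # e.g. filesystem lacks FIEMAP
    finally:
        os.close(fd)
    mapped = FIEMAP_HEADER.unpack_from(buf)[3]
    if mapped == 0:
        return None                          # empty file or all holes
    fields = FIEMAP_EXTENT.unpack_from(buf, FIEMAP_HEADER.size)
    return fields[0], fields[1], fields[2], fields[5]
```

For sorting purposes only the second element (fe_physical) matters.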
Cheers, Andreas
* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Jamie Lokier @ 2009-09-08 8:47 UTC (permalink / raw)
To: Andreas Dilger; +Cc: Florian Weimer, linux-fsdevel
Andreas Dilger wrote:
> On Sep 07, 2009 16:26 +0000, Florian Weimer wrote:
> > * Andreas Dilger:
> > > If you are implementing a tool to use this, I would code it to try
> > > FIEMAP first, then FIBMAP (if it is running as root, or it gets
> > > fixed in some future kernel), then just do without (as it most likely
> > > does already today).
> >
> > If FIBMAP is unsafe, it's likely exposed by concurrent changes to the
> > file, so using it would still be unsafe for backup purposes. And I
> > really only need the number of the first block. (I want to optimize
> > reading of Maildir-style folders, mainly for backup purposes.)
>
> Given that the worst that can happen for your particular application
> if FIBMAP gets the wrong block number is a sub-optimal ordering for
> the file copy, there is no risk in doing this.
>
> For the FIEMAP code, since you only need the first block number, just
> pass it a single fiemap_extent so that it doesn't spend time generating
> a full list of extents that you don't need to use.
With FIEMAP, filesystems which don't use extents will still scan a
potentially large region of the disk doing block lookups won't they,
just to maximise the size of the first extent in a large file?
Assuming yes, FIBMAP would be faster than FIEMAP in those cases, which
may be a reason to give it the same permissions, so it can be used.
-- Jamie
* Re: FIBMAP/FIEMAP discrepancy for CAP_SYS_RAWIO
From: Andreas Dilger @ 2009-09-09 16:13 UTC (permalink / raw)
To: Jamie Lokier; +Cc: Florian Weimer, linux-fsdevel
On Sep 08, 2009 09:47 +0100, Jamie Lokier wrote:
> Andreas Dilger wrote:
> > On Sep 07, 2009 16:26 +0000, Florian Weimer wrote:
> > > * Andreas Dilger:
> > > > If you are implementing a tool to use this, I would code it to try
> > > > FIEMAP first, then FIBMAP (if it is running as root, or it gets
> > > > fixed in some future kernel), then just do without (as it most likely
> > > > does already today).
> > >
> > > If FIBMAP is unsafe, it's likely exposed by concurrent changes to the
> > > file, so using it would still be unsafe for backup purposes. And I
> > > really only need the number of the first block. (I want to optimize
> > > reading of Maildir-style folders, mainly for backup purposes.)
> >
> > Given that the worst that can happen for your particular application
> > if FIBMAP gets the wrong block number is a sub-optimal ordering for
> > the file copy, there is no risk in doing this.
> >
> > For the FIEMAP code, since you only need the first block number, just
> > pass it a single fiemap_extent so that it doesn't spend time generating
> > a full list of extents that you don't need to use.
>
> With FIEMAP, filesystems which don't use extents will still scan a
> potentially large region of the disk doing block lookups won't they,
> just to maximise the size of the first extent in a large file?
Partly true - if the caller only passes in space for a single extent
to be returned then the traversal would stop as soon as a discontinuity
is found. For ext2/ext3 this would happen 12 blocks into the
file, because the first indirect block is allocated after the
12th data block, so the amount of scanning would be very small.
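A toy model of that traversal, in Python, may make the stopping point
concrete. It is not the kernel code: first_extent_length is a made-up
helper that walks a list of physical block numbers (one per logical
block) and stops at the first discontinuity. The layout below is a
hypothetical ext2/ext3 file whose 12 direct data blocks sit at
1000..1011, with the indirect block at 1012 and further data from 1013.

```python
def first_extent_length(physical_blocks):
    """Count how many leading logical blocks are physically contiguous."""
    if not physical_blocks:
        return 0
    length = 1
    for prev, cur in zip(physical_blocks, physical_blocks[1:]):
        if cur != prev + 1:
            break          # discontinuity: the first extent ends here
        length += 1
    return length


# Hypothetical layout: 12 direct blocks, a gap for the indirect block,
# then 100 more data blocks.
layout = list(range(1000, 1012)) + list(range(1013, 1113))
print(first_extent_length(layout))  # prints 12
```

However large the file is, the walk ends after the 13th lookup in this
layout, because block 1013 does not follow block 1011.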
> Assuming yes, FIBMAP would be faster than FIEMAP in those cases, which
> may be a reason to give it the same permissions, so it can be used.
By all means, I don't object to FIBMAP being fixed to allow non-root
access, but until that happens (and then users start using those
kernels) it won't work for non-root users.
Cheers, Andreas