* xfs_quota: bug: traverses bind mountpoints
From: Paul Nienaber @ 2011-07-07 20:46 UTC (permalink / raw)
To: xfs
So, much like coreutils' du (which also shouldn't), xfs_quota traverses
bind mountpoints both when doing 'project -s' and 'project -C', and
probably also 'project -c', although I haven't tested that. Test case and
output follow.
cheers
~Paul
# dd if=/dev/zero of=./xfstestfs bs=1M count=512
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 0.402579 s, 1.3 GB/s
# mkfs.xfs xfstestfs
meta-data=xfstestfs              isize=256    agcount=4, agsize=32768 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=131072, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=1200, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
# mkdir /mnt/xfstestfs
# mount -t xfs -o loop,prjquota xfstestfs /mnt/xfstestfs/
# mkdir -p /mnt/xfstestfs/projects/foo/chroot/xfstestfs
# mkdir -p /mnt/xfstestfs/projects/bar
****This is here so we get spew about where it's traversing later on:
# ln -s /mnt/xfstestfs/projects/foo /mnt/xfstestfs/projects/bar/
# echo 12345:/mnt/xfstestfs/projects/foo >> /etc/projects
# echo foo:12345 >> /etc/projid
# mount -t none -o bind /mnt/xfstestfs /mnt/xfstestfs/projects/foo/chroot/xfstestfs
# dd if=/dev/zero of=/mnt/xfstestfs/projects/foo/100M bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.0815913 s, 1.3 GB/s
# dd if=/dev/zero of=/mnt/xfstestfs/projects/bar/100M bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 0.0806148 s, 1.3 GB/s
# xfs_quota -x -c 'project -s foo' /mnt/xfstestfs
Setting up project foo (path /mnt/xfstestfs/projects/foo)...
****Oops, why are we here:
xfs_quota: skipping special file /mnt/xfstestfs/projects/foo/chroot/xfstestfs/projects/bar/foo
Processed 1 (/etc/projects and cmdline) paths for project foo with recursion depth infinite (-1).
# xfs_quota -x -c 'report -p -h' /mnt/xfstestfs
Project quota on /mnt/xfstestfs (/dev/loop0)
                        Blocks
Project ID   Used   Soft   Hard Warn/Grace
---------- ---------------------------------
foo          200M      0      0  00 [------]
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: xfs_quota: bug: traverses bind mountpoints
From: Dave Chinner @ 2011-07-08 3:23 UTC (permalink / raw)
To: Paul Nienaber; +Cc: xfs
On Thu, Jul 07, 2011 at 01:46:21PM -0700, Paul Nienaber wrote:
> So, much like coreutils' du (which also shouldn't), xfs_quota
> traverses bind mountpoints both when doing 'project -s' and 'project
> -C', and probably also 'project -c' although I haven't tested it.
> Testcase and output follows.
How is a userspace traversal supposed to detect the fact it crosses
a bind mount when it enters a directory? If you bind a directory
from the same filesystem, stat(2) on the file returns -identical-
information regardless of whether you are inside or outside the bind
mount. So the normal mechanisms (e.g. st_dev changes) for detecting
such a mount point traversal simply don't work.
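To make the point about st_dev concrete, here is a minimal Python sketch of the usual device-change check. The helper name `crosses_device` is made up for illustration and is not anything in xfs_quota. For a bind mount taken from the same filesystem the check returns False, exactly as it does for an ordinary subdirectory, which is why it cannot see the crossing.

```python
import os
import tempfile


def crosses_device(parent: str, child: str) -> bool:
    """Hypothetical helper: True if child lives on a different device than parent.

    This is the classic mount-point test (compare st_dev). A bind mount of
    the same filesystem reports the same st_dev, so it is not detected.
    """
    return os.lstat(parent).st_dev != os.lstat(child).st_dev


# Demonstration on an ordinary directory (no mounts involved, no root needed).
with tempfile.TemporaryDirectory() as top:
    sub = os.path.join(top, "projects")
    os.mkdir(sub)
    # An ordinary subdirectory shares st_dev with its parent.
    same_fs = not crosses_device(top, sub)
```

A same-filesystem bind mount is indistinguishable from `sub` here as far as this check is concerned.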
So the first question is whether we should be trying to detect bind
mounts within the same filesystem and handle them for project
quotas. I don't know the answer to that.
Indeed:
$ find .
.
./projects
./projects/foo
./projects/foo/chroot
find: File system loop detected; `./projects/foo/chroot/scratch' is part of the same file system loop as `.'.
./projects/bar
./projects/bar/foo
./baz
find has some way of detecting such cases, but it doesn't do it via
any special syscalls, nor does the newfstatat(AT_SYMLINK_NOFOLLOW)
call it makes return an error. And it does so regardless of whether
the -xdev option is specified or not. So it must have some form of
internal logic for detecting such loopy filesystem constructs.
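The thread does not establish how find does this. One common technique, shown here only as an assumption, is to remember the (st_dev, st_ino) pair of every ancestor directory and refuse to descend into a directory whose pair is already on that list. A bind mount of the filesystem root reports the root's own device and inode numbers, so it would match an ancestor. The function name `walk_detecting_loops` is hypothetical, and the demonstration uses a symlink loop as a stand-in because creating a bind mount needs root.

```python
import os
import tempfile


def walk_detecting_loops(top):
    """Yield (path, is_loop) for each directory under top, following symlinks.

    A directory is reported as a loop, and not descended into, when its
    (st_dev, st_ino) pair matches one of its ancestors.
    """
    def visit(path, ancestors):
        st = os.stat(path)  # follows symlinks, like entering the directory
        key = (st.st_dev, st.st_ino)
        if key in ancestors:
            yield path, True
            return
        yield path, False
        with os.scandir(path) as it:
            children = sorted(e.path for e in it if e.is_dir(follow_symlinks=True))
        for child in children:
            yield from visit(child, ancestors | {key})

    yield from visit(top, frozenset())


# Stand-in for the bind mount: "scratch" points back at the top of the tree.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "projects", "foo", "chroot"))
    os.symlink(top, os.path.join(top, "projects", "foo", "chroot", "scratch"))
    results = [(os.path.relpath(p, top), loop)
               for p, loop in walk_detecting_loops(top)]
```

The cost is one set lookup per directory, on top of a stat that a traversal typically performs anyway. It only catches mounts that point back at an ancestor; a bind mount of an unrelated subtree of the same filesystem would pass unnoticed.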
However, operations such as "chown -R" do *not* detect this
situation:
$ sudo chown -R -v dave:dave *
ownership of `baz' retained as dave:dave
changed ownership of `projects/foo/chroot/scratch/projects/foo/chroot/scratch' to dave:dave
changed ownership of `projects/foo/chroot/scratch/projects/foo/chroot' to dave:dave
changed ownership of `projects/foo/chroot/scratch/projects/foo' to dave:dave
changed ownership of `projects/foo/chroot/scratch/projects/bar/foo' to dave:dave
changed ownership of `projects/foo/chroot/scratch/projects/bar' to dave:dave
changed ownership of `projects/foo/chroot/scratch/projects' to dave:dave
ownership of `projects/foo/chroot/scratch/baz' retained as dave:dave
changed ownership of `projects/foo/chroot/scratch' to dave:dave
changed ownership of `projects/foo/chroot' to dave:dave
changed ownership of `projects/foo' to dave:dave
ownership of `projects/bar/foo' retained as dave:dave
ownership of `projects/bar' retained as dave:dave
changed ownership of `projects' to dave:dave
It's totally unclear what the behaviour of xfs_quota should be,
because operations that change user and group quotas are completely
ignorant of bind mounts.
So if we decide bind mounts are important to detect, the second
question is how do we detect bind mount point traversals in a
reliable manner that doesn't involve adding significant overhead to
the directory traversal code? I don't know the answer to that,
either, and if you care about this enough I guess you'll go and look
up what find does and tell us about it ;)
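One possible low-overhead approach on Linux, which nobody in this thread proposes and which is offered purely as an assumption, is to read /proc/self/mountinfo once before the traversal and treat any mount point whose device is already mounted elsewhere as a candidate bind mount to skip. The sketch below parses mountinfo-format text; the function names and the sample lines are made up for illustration.

```python
def mount_points_by_device(mountinfo_text):
    """Map 'major:minor' -> list of (root, mount_point), in file order."""
    table = {}
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # not a mountinfo record
        # Fields: mount ID, parent ID, major:minor, root, mount point, ...
        device, root, mount_point = fields[2], fields[3], fields[4]
        table.setdefault(device, []).append((root, mount_point))
    return table


def extra_mounts(mountinfo_text):
    """Return mount points whose device already appears at an earlier mount point."""
    extras = set()
    for mounts in mount_points_by_device(mountinfo_text).values():
        extras.update(mount_point for _, mount_point in mounts[1:])
    return extras


# Made-up sample modelled on the test case in this thread.
SAMPLE = """\
20 1 8:1 / / rw,relatime - ext4 /dev/sda1 rw
35 20 7:0 / /mnt/xfstestfs rw,relatime - xfs /dev/loop0 rw,prjquota
36 35 7:0 / /mnt/xfstestfs/projects/foo/chroot/xfstestfs rw,relatime - xfs /dev/loop0 rw,prjquota
"""
skip = extra_mounts(SAMPLE)
```

A traversal could then compare each directory path against `skip` before descending. This assumes the first listed mount of a device is the "real" one, which need not hold in general.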
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: xfs_quota: bug: traverses bind mountpoints
From: Paul Nienaber @ 2011-07-08 4:48 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
I would definitely agree that this is perhaps quite nonsensical
in the case of user/group quotas. However, a project quota is
conceptually a quota for "files in a directory and its subdirectories on
a particular filesystem", and I would argue that, regardless of
bindmounts, / is never a subdirectory of /foo, and this is where I think
the change in behaviour should be. There's also the somewhat-grey area
of how 'project -C' should behave, and I suppose the simplest and most
sensible answer there is "however 'project -s' and 'project -c' behave",
as that's both at least somewhat sensible and the least likely to
confuse people. I'd be happy to go digging at find at some point soon.
cheers
~Paul
* Re: xfs_quota: bug: traverses bind mountpoints
From: Dave Chinner @ 2011-07-08 8:01 UTC (permalink / raw)
To: Paul Nienaber; +Cc: xfs
On Thu, Jul 07, 2011 at 09:48:22PM -0700, Paul Nienaber wrote:
> I would definitely agree that this is perhaps quite
> nonsensical in the case of user/group quotas. However, a project
> quota is conceptually a quota for "files in a directory and its
> subdirectories on a particular filesystem",
No, project quota works exactly like user and group quotas - every
file in a project group has the same project ID stored in the inode,
just like uid and gid are stored in the inode. They do not rely on
directory structure at all, and you can add any file to a project
anywhere in the filesystem simply by changing its project ID.
Project quotas can be used to -implement- directory tree quotas
because there is another flag in the inode that tells directories
that children should inherit the projid of the parent directory at
creation time. That's the actual feature that allows project quotas
to be used for directory tree quotas - it's entirely independent of
the basic functionality of accounting and enforcing project quotas
based on the projid in each inode.
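As a toy model of the two mechanisms just described (per-inode projid accounting, plus an inherit flag on directories), consider the following Python sketch. It is not XFS code; the `Inode` class and both helper functions are invented for illustration, and the numbers mirror the test case from the original report.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Inode:
    size: int              # bytes charged to this inode
    projid: int = 0        # project ID stored in the inode
    inherit: bool = False  # directory flag: new children take this projid


def create_child(parent: Inode, size: int = 0) -> Inode:
    """New file: inherits the parent's projid only if the parent has the flag."""
    if parent.inherit:
        return Inode(size=size, projid=parent.projid)
    return Inode(size=size)


def usage_by_project(inodes):
    """Accounting looks only at each inode's projid, never at directory layout."""
    totals = defaultdict(int)
    for ino in inodes:
        totals[ino.projid] += ino.size
    return dict(totals)


MB = 1024 * 1024
foo_dir = Inode(size=0, projid=12345, inherit=True)
bar_dir = Inode(size=0)
foo_file = create_child(foo_dir, 100 * MB)  # created under foo: projid 12345
bar_file = create_child(bar_dir, 100 * MB)  # created under bar: projid 0
inodes = [foo_dir, bar_dir, foo_file, bar_file]

before = usage_by_project(inodes)
bar_file.projid = 12345  # retagged, e.g. by a traversal that wandered into bar
after = usage_by_project(inodes)
```

In this model the project's usage goes from 100M to 200M as soon as the second file is retagged, which is consistent with the 200M shown in the report at the top of the thread.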
That's why it is not at all clear how bind mounts should be treated.
On one hand they should be treated identically to user and group
quotas, and on the other hand bind mounts can completely screw up
directory tree quotas.
> and I would argue that,
> regardless of bindmounts, / is never a subdirectory of /foo, and
> this is where I think the change in behaviour should be.
You're attempting to cross recursive bind mounts with directory tree
quota and worse, pointing the bind mount inside the directory tree
to a parent directory outside the directory tree the quota applies
to.
IOWs, your directory structure disappears up its own fundamental
orifice in a manner that is very difficult to detect (did Ouroboros
know that it was eating its own tail?). As such I don't think there
is a sane set of semantics that we can apply consistently in the XFS
code in both userspace and kernel code when it comes to directory
tree quotas and bind mounts.
The simple answer is: Just Don't Do It.
> There's
> also the somewhat-grey area of how 'project -C' should behave, and I
> suppose the simplest and most sensible answer there is "however
> 'project -s' and 'project -c' behave", as that's both at least
> somewhat sensible and the least likely to confuse people. I'd be
> happy to go digging at find at some point soon.
Like I said above, there's also consistency with every other
application that does traversal for accounting purposes (e.g. du).
If you want to avoid traversals moving across bind mounts
(especially recursive bind mounts), I think we need a syscall flag
similar to AT_SYMLINK_NOFOLLOW for the kernel to detect and prevent
such traversal in a consistent manner.
As such, that's not a project quota problem - that's a generic, VFS
behaviour issue. That's where I'd recommend trying to solve the
bind mount traversal problem, not hack something into xfs_quota....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com