linux-fsdevel.vger.kernel.org archive mirror
From: Amir Goldstein <amir73il@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: "Darrick J. Wong" <djwong@kernel.org>, Jan Kara <jack@suse.cz>,
	 Andrey Albershteyn <aalbersh@redhat.com>,
	linux-fsdevel@vger.kernel.org,  linux-xfs@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	 Christian Brauner <brauner@kernel.org>
Subject: Re: [PATCH v2 2/4] fs: add FS_IOC_FSSETXATTRAT and FS_IOC_FSGETXATTRAT
Date: Fri, 7 Jun 2024 09:17:34 +0300
Message-ID: <CAOQ4uxgV5V0TmbZk1vqn=bYfSsdLofDRKvBT4O60zU+jXo0YMQ@mail.gmail.com>
In-Reply-To: <ZmEemh4++vMEwLNg@dread.disaster.area>

On Thu, Jun 6, 2024 at 5:27 AM Dave Chinner <david@fromorbit.com> wrote:
>
> On Wed, Jun 05, 2024 at 08:13:15AM +0300, Amir Goldstein wrote:
> > On Wed, Jun 5, 2024 at 3:38 AM Darrick J. Wong <djwong@kernel.org> wrote:
> > > On Tue, Jun 04, 2024 at 10:58:43AM +0200, Jan Kara wrote:
> > > > On Mon 03-06-24 10:42:59, Darrick J. Wong wrote:
> > > > > I do -- allowing unprivileged users to create symlinks that consume
> > > > > icount (and possibly bcount) in the root project breaks the entire
> > > > > enforcement mechanism.  That's not the way that project quota has worked
> > > > > on xfs and it would be quite rude to nullify the PROJINHERIT flag bit
> > > > > only for these special cases.
> > > >
> > > > OK, fair enough. I thought someone would hate this. I'd just like to
> > > > understand one thing: Owner of the inode can change the project ID to 0
> > > > anyway so project quotas are more like a cooperative space tracking scheme
> > > > anyway. If you want to escape it, you can. So what are you exactly worried
> > > > about? Is it the container use case where from within the user namespace you
> > > > cannot change project IDs?
> > >
> > > Yep.
> > >
> > > > Anyway I just wanted to have an explicit decision that the simple solution
> > > > is not good enough before we go the more complex route ;).
> > >
> > > Also, every now and then someone comes along and half-proposes making it
> > > so that non-root cannot change project ids anymore.  Maybe some day that
> > > will succeed.
> > >
> >
> > I'd just like to point out that the purpose of the project quotas feature
> > as I understand it, is to apply quotas to subtrees, where container storage
> > is a very common special case of project subtree.
>
> That is the most modern use case, yes.
>
> [ And for a walk down history lane.... ]
>
> > The purpose is NOT to create a "project" of random files in random
> > paths.
>
> This is *exactly* the original use case that project quotas were
> designed for back on Irix in the early 1990s and is the original
> behaviour project quotas brought to Linux.
>
> Project quota inheritance didn't come along until 2005:
>
> commit 65f1866a3a8e512d43795c116bfef262e703b789
> Author: Nathan Scott <nathans@sgi.com>
> Date:   Fri Jun 3 06:04:22 2005 +0000
>
>     Add support for project quota inheritance, a merge of Glens changes.
>     Merge of xfs-linux-melb:xfs-kern:22806a by kenmcd.
>
> And full support for directory tree quotas using project IDs wasn't
> fully introduced until a year later in 2006:
>
> commit 4aef4de4d04bcc36a1461c100eb940c162fd5ee6
> Author: Nathan Scott <nathans@sgi.com>
> Date:   Tue May 30 15:54:53 2006 +0000
>
>     statvfs component of directory/project quota support, code originally by Glen.
>     Merge of xfs-linux-melb:xfs-kern:26105a by kenmcd.
>
> These changes were largely done for an SGI NAS product that allowed
> us to create one great big XFS filesystem and then create
> arbitrarily sized, thin provisioned "NFS volumes" as directory
> quota controlled subdirs instantaneously. The directory tree quota
> defined the size of the volume, and so we could also grow and shrink
> them instantaneously, too. And we could remove them instantaneously
> via background garbage collection after the export was removed and
> the user had been told it had been destroyed.
>
> So that was the original use case for directory tree quotas on XFS -
> providing scalable, fast management of "thin" storage for a NAS
> product. Project quotas had been used for accounting random
> collections of files for over a decade before this directory quota
> construct was created, and the "modern" container use cases for
> directory quotas didn't come along until almost a decade after this
> capability was added.
>

Cool. Didn't know all of this.
Lucky for us, those historic use cases are well distinguished from
the modern subtree use case by the opt-in PROJINHERIT bit.
So as long as PROJINHERIT is set, my assumptions mostly hold(?)

> > My point is that changing the project id of a non-dir child to be different
> > from the project id of its parent is a pretty rare use case (I think?).
>
> Not if you are using project quotas as they were originally intended
> to be used.
>

Rephrase then:

Changing the projid of a non-dir child to be different from the projid
of its parent, which has PROJINHERIT bit set, is a pretty rare use case(?)
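
For reference, the projid and the PROJINHERIT bit discussed here can be read from userspace with the existing FS_IOC_FSGETXATTR ioctl. The following is a minimal sketch, not code from the patch series; the helper name get_projid() is hypothetical. It relies on open(2), so it only works for files that can be opened for reading, which excludes the special files (symlinks, device nodes and so on) that motivated this thread.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>	/* struct fsxattr, FS_IOC_FSGETXATTR, FS_XFLAG_PROJINHERIT */

/*
 * Hypothetical helper: report the project ID of 'path' and whether the
 * PROJINHERIT flag is set on it.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int get_projid(const char *path, uint32_t *projid, bool *inherit)
{
	struct fsxattr fsx;
	int fd, ret, saved;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, FS_IOC_FSGETXATTR, &fsx);
	saved = errno;		/* close() may clobber errno */
	close(fd);
	errno = saved;
	if (ret < 0)
		return -1;

	*projid = fsx.fsx_projid;
	*inherit = (fsx.fsx_xflags & FS_XFLAG_PROJINHERIT) != 0;
	return 0;
}
```

Calling get_projid() on a directory and on one of its children, then comparing the two projid values when the directory reports PROJINHERIT, would detect the "child differs from parent" case described above. On filesystems that do not implement the ioctl the call is expected to fail.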

> > If changing the projid of non-dir is needed for moving it to a
> > different subtree, we could allow renameat2(2) of non-dir with no
> > hardlinks to implicitly change its inherited project id or explicitly
> > with a flag for a hardlink, e.g.:
> > renameat2(olddirfd, name, newdirfd, name, RENAME_NEW_PROJID).
>
> Why?
>
> The only reason XFS returns -EXDEV to rename across project IDs is
> because nobody wanted to spend the time to work out how to do the
> quota accounting of the metadata changed in the rename operation
> accurately. So for that rare case (not something that would happen
> on the NAS product) we returned -EXDEV to trigger the mv command to
> copy the file to the destination and then unlink the source instead,
> thereby handling all the quota accounting correctly.
>
> IOWs, this whole "-EXDEV on rename across parent project quota
> boundaries" is an implementation detail and nothing more.
> Filesystems that implement project quotas and the directory tree
> sub-variant don't need to behave like this if they can accurately
> account for the quota ID changes during an atomic rename operation.
> If that's too hard, then the fallback is to return -EXDEV and let
> userspace do it the slow way which will always account the resource
> usage correctly to the individual projects.
>
> Hence I think we should just fix the XFS kernel behaviour to do the
> right thing in this special file case rather than return -EXDEV and
> then forget about the rest of it. Sure, update xfs_repair to fix the
> special file project id issue if it trips over it, but other than
> that I don't think we need anything more. If fixing it requires new
> syscalls and tools, then that's much harder to backport to old
> kernels and distros than just backporting a couple of small XFS
> kernel patches...
>

I assume that by "fix the XFS behavior" you mean
"we could allow renameat2(2) of non-dir with no hardlinks to implicitly
 change its inherited project id"?
(in case the new parent has the PROJINHERIT bit)
so that the RENAME_NEW_PROJID behavior would be implicit.

Unlike rename() from one parent to the other, link()+unlink()
is less obvious.
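
For context, the userspace fallback that -EXDEV triggers (what Dave describes mv doing) can be sketched roughly as below. This is an illustration under simplifying assumptions, not the actual mv implementation: the helper names copy_file() and move_file() are hypothetical, only regular files are handled, and ownership, timestamps and xattrs are not preserved.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy the data of regular file 'src' into 'dst'. */
static int copy_file(const char *src, const char *dst)
{
	char buf[65536];
	struct stat st;
	ssize_t n;
	int in, out, saved, ret = -1;

	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	if (fstat(in, &st) < 0)
		goto out_in;
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, st.st_mode & 0777);
	if (out < 0)
		goto out_in;

	while ((n = read(in, buf, sizeof(buf))) > 0) {
		char *p = buf;

		while (n > 0) {		/* handle short writes */
			ssize_t w = write(out, p, (size_t)n);

			if (w < 0)
				goto out_both;
			p += w;
			n -= w;
		}
	}
	if (n == 0)
		ret = 0;		/* reached EOF without a read error */
out_both:
	saved = errno;
	if (close(out) < 0 && ret == 0) {
		ret = -1;
		saved = errno;
	}
	errno = saved;
out_in:
	saved = errno;
	close(in);
	errno = saved;
	return ret;
}

/*
 * Hypothetical helper: try an atomic rename first; on EXDEV fall back to
 * copy + unlink, so the new blocks and inode are charged to whatever
 * project the destination directory assigns.
 */
static int move_file(const char *src, const char *dst)
{
	if (rename(src, dst) == 0)
		return 0;
	if (errno != EXDEV)
		return -1;
	if (copy_file(src, dst) < 0)
		return -1;
	return unlink(src);
}
```

The fallback is not atomic and creates a new inode, which is one reason hardlinked files cannot be handled this way.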

The "modern" use cases that I listed where implicit change of projid
does not suffice are:

1. Share some inodes (as hardlinks) among projects
2. Recursively changing a subtree projid

They could be implemented by explicit flags to renameat2()/linkat() and
they could be implemented by [gs]etfsxattrat(2) syscalls.

Thanks,
Amir.
