* XFS group quota circumvention via NFS and chgrp
@ 2026-06-16 9:08 Dr. Thomas Orgis
2026-06-18 21:05 ` Dr. Thomas Orgis
From: Dr. Thomas Orgis @ 2026-06-16 9:08 UTC
To: linux-xfs
Dear Linux XFS folks,
I noticed that xfs group quotas can be circumvented via NFS* on vanilla
Kernel 6.6.x (slightly differing versions on client and server) as
follows:
1. A user has a primary group and an auxiliary group.
2. There are group quotas for both, possibly very restrictive.
3. User can create files with the primary group (within quota) and then
4. `chgrp $auxgroup file` to get rid of the quota and shift it to the
other group.
The kicker is: The quota for the auxiliary group is ignored in this
case. So I can store unlimited amounts of data by chgrp'ing it away to
the other group in pieces.
This is not intentional behaviour, or is it?
This is _not_ the case with ext4 being served via NFS. Long form
comparison follows.
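Every case in the comparison comes down to one probe: attempt the chgrp as the user and see whether the kernel accepts it. A minimal sketch of such a probe, assuming a POSIX shell; the function name probe_chgrp is made up for illustration and it only relays what chgrp itself reports:

```shell
# probe_chgrp GROUP FILE
# Hypothetical helper (not part of any tool): try to hand FILE over to
# GROUP and print whether the ownership change was accepted.
probe_chgrp() {
    if probe_err=$(chgrp -- "$1" "$2" 2>&1); then
        echo "allowed"
    else
        # chgrp's own message carries the errno text, e.g.
        # "Disk quota exceeded" for EDQUOT.
        echo "refused: $probe_err"
    fi
}
```

Wherever the target group's quota is enforced, `probe_chgrp auxgroupxx bla` would be expected to print a "refused:" line mentioning "Disk quota exceeded".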
A1. ext4 locally
A fresh ext4 fs populated with some test files and quota set:
user@server:/srv/test/userx$ quota -g -f /srv/test/ | grep -v :\ none
Disk quotas for group userx (gid 1005):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
/dev/mapper/xyz-quotatest
                 307216*    1024    1024    none       4*       1       1    none
Disk quotas for group auxgroupxx (gid 187100007):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
/dev/mapper/xyz-quotatest
                      0        1       1               1*       1       1
user@server:/srv/test/userx$ ls -l
total 307212
-rw-r--r-- 1 userx userx 104857600 Jun 15 14:52 bar
-rw-r--r-- 1 userx userx 209715206 Jun 15 15:01 bla
-rw-r--r-- 1 userx userx 6 Jun 15 14:51 blar
-rw-r--r-- 1 userx auxgroupxx 0 Jun 15 15:38 foo
user@server:/srv/test/userx$ chgrp auxgroupxx bla
chgrp: changing group of 'bla': Disk quota exceeded
Or, in other words:
user@server:/srv/test/userx$ strace chgrp auxgroupxx bla 2>&1 | grep chown
fchownat(AT_FDCWD, "bla", -1, 187100007, 0) = -1 EDQUOT (Disk quota exceeded)
This is all as it should be. If I am root locally, though, it does
not matter:
user@server:/srv/test/userx# strace chgrp auxgroupxx bla 2>&1 | grep chown
fchownat(AT_FDCWD, "bla", -1, 187100007, 0) = 0
The data is moved to the other group even if the quota does not allow
it.
# repquota -g /srv/test/
*** Report for group quotas on device /dev/mapper/xyz-quotatest
Block grace time: 00:00; Inode grace time: 00:00
                        Block limits                File limits
Group            used    soft    hard  grace    used  soft  hard  grace
----------------------------------------------------------------------
root       --      20       0       0              2     0     0
userx      ++  102408    1024    1024   none       3     1     1   none
auxgroupxx ++  204808       1       1   none       2     1     1   none
But that is fine, as root is allowed to do anything, I presume.
A2. ext4 via NFS
There is no difference on NFS for non-root. The user is not allowed to
circumvent the group quota of auxgroupxx:
userx@client:/mnt/test/userx$ ls -l
total 307212
-rw-r--r-- 1 userx userx 104857600 2026-06-15 14:52 bar
-rw-r--r-- 1 userx userx 209715210 2026-06-15 21:14 bla
-rw-r--r-- 1 userx userx 6 2026-06-15 14:51 blar
-rw-r--r-- 1 userx auxgroupxx 0 2026-06-15 15:38 foo
userx@client:/mnt/test/userx$ strace chgrp auxgroupxx bla 2>&1 | grep chown
fchownat(AT_FDCWD, "bla", -1, 187100007, 0) = -1 EDQUOT (Disk quota exceeded)
Since root over NFS, at least with root squashing, is a bit less root,
it is reassuring that for the superuser, it also fails on the NFS client:
userx@client:/mnt/test/userx# strace chgrp auxgroupxx bla 2>&1 | grep chown
fchownat(AT_FDCWD, "bla", -1, 187100007, 0) = -1 EPERM (Operation not permitted)
B1. xfs locally
To be fair, I also created a new volume with
mkfs.xfs /dev/mapper/xyz-quotatest2
mount -o usrquota,grpquota /dev/mapper/xyz-quotatest2 /srv/test2/
xfs_quota -x -c 'timer -g 60 -d' /srv/test2
xfs_quota -x -c 'limit -g bsoft=100m bhard=101m userx' /srv/test2
xfs_quota -x -c 'limit -g bsoft=4k bhard=4k auxgroupxx' /srv/test2
and confirmed the findings I had with the existing older fs.
root@server# xfs_quota -x -c 'report -g' /srv/test2/
Group quota on /srv/test2 (/dev/mapper/xyz-quotatest2)
                               Blocks
Group ID         Used       Soft       Hard    Warn/Grace
---------- --------------------------------------------------
root                0          0          0     00 [0 days]
userx          103424     102400     103424     00 [00:00:11]
auxgroupxx          0          4          4     00 [--------]
root@server# repquota -g /srv/test2/
*** Report for group quotas on device /dev/mapper/xyz-quotatest2
Block grace time: 00:01; Inode grace time: 00:01
                        Block limits                File limits
Group            used    soft    hard  grace    used  soft  hard  grace
----------------------------------------------------------------------
root       --       0       0       0              3     0     0
userx      +-  103424  102400  103424   none       3     0     0
auxgroupxx --       0       4       4              1     0     0
userx@server:/srv/test2/userx$ ls -l
total 103424
-rw-r--r-- 1 userx userx 104857600 Jun 16 10:06 testzero
-rw-r--r-- 1 userx userx 1048576 Jun 16 10:25 testzero2
-rw-r--r-- 1 userx auxgroupxx 0 Jun 16 10:25 wedge
The userx should not be able to create files in auxgroupxx, as its
quota of 4K is already exhausted. And indeed, it works that way locally.
userx@server:/srv/test2/userx$ strace chgrp auxgroupxx testzero 2>&1 | grep chown
fchownat(AT_FDCWD, "testzero", -1, 187100007, 0) = -1 EDQUOT (Disk quota exceeded)
Root can do it, as expected from the experience with ext4:
root@server:/srv/test2/userx/t# strace chgrp auxgroupxx testzero 2>&1 | grep chown
fchownat(AT_FDCWD, "testzero", -1, 187100007, 0) = 0
root@server# repquota -g /srv/test2/
*** Report for group quotas on device /dev/mapper/xyz-quotatest2
Block grace time: 00:01; Inode grace time: 00:01
                        Block limits                File limits
Group            used    soft    hard  grace    used  soft  hard  grace
----------------------------------------------------------------------
root       --       0       0       0              3     0     0
userx      --    1024  102400  103424              2     0     0
auxgroupxx +-  102400       4       4   none       2     0     0
Though, the block quota is actually also enforced for root here:
root@server# dd if=/dev/zero of=wedge bs=1 count=4096
dd: error writing 'wedge': Disk quota exceeded
1+0 records in
0+0 records out
0 bytes copied, 9.6329e-05 s, 0.0 kB/s
Is that really intentional, btw.? Enforcing quota for root this way,
but not for chgrp/chown?
B2. xfs via NFS
Switching to a client node that has test2 mounted via NFS, after moving
testzero back to the userx group.
userx@client:/mnt/test2/userx$ quota -g -f /mnt/test2 | grep -v ': no'
Disk quotas for group userx (gid 1005):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
  server:/test2  103424*  102400  103424    none       3        0       0
Disk quotas for group auxgroupxx (gid 187100007):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
  server:/test2       0        4       4               1        0       0
userx@client:/mnt/test2/userx$ strace chgrp auxgroupxx testzero 2>&1 | grep chown
fchownat(AT_FDCWD, "testzero", -1, 187100007, 0) = 0
userx@client:/mnt/test2/userx$ quota -g -f /mnt/test2 | grep -v ': no'
Disk quotas for group userx (gid 1005):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
  server:/test2    1024   102400  103424               2        0       0
Disk quotas for group auxgroupxx (gid 187100007):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
  server:/test2  102400*       4       4   00:01       2        0       0
The user happily moved 100M of data over to auxgroupxx and has quota
freed to start consuming more data. The grace period should not
matter, as the soft limit is clearly hit, right? And it's only a minute
… so waiting a bit and preparing the next chunk:
userx@client:/mnt/test2/userx$ dd if=/dev/zero of=testzero3 bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.128547 s, 816 MB/s
userx@client:/mnt/test2/userx$ quota -g -f /mnt/test2 | grep -v ': no'
Disk quotas for group userx (gid 1005):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
  server:/test2  103424*  102400  103424    none       3        0       0
Disk quotas for group auxgroupxx (gid 187100007):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
  server:/test2  102400*       4       4    none       2        0       0
Quota filled again, any grace period passed. Let's free up some space!
userx@client:/mnt/test2/userx$ strace chgrp auxgroupxx testzero3 2>&1 | grep chown
fchownat(AT_FDCWD, "testzero3", -1, 187100007, 0) = 0
userx@client:/mnt/test2/userx$ quota -g -f /mnt/test2 | grep -v ': no'
Disk quotas for group userx (gid 1005):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
  server:/test2    1024   102400  103424               2        0       0
Disk quotas for group auxgroupxx (gid 187100007):
     Filesystem  blocks    quota   limit   grace   files    quota   limit   grace
  server:/test2  204800*       4       4    none       3        0       0
The added data is now in auxgroupxx's overdrawn quota, too. Repeat. The
user has no effective quota.
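To spot this state on the server without reading the report by eye, the xfs_quota report can be filtered for IDs whose usage exceeds a non-zero hard limit. A minimal sketch, assuming the column order "ID Used Soft Hard ..." seen in the xfs_quota reports in this mail; the function name over_hard_limit is made up for illustration, and the filter does not fit repquota output, which has a flags column in second place:

```shell
# over_hard_limit: read xfs_quota report text on stdin and print the
# names of IDs whose Used value exceeds a non-zero Hard limit.
# Header and separator lines are skipped because their second and
# fourth fields are not plain numbers.
over_hard_limit() {
    awk '$2 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $4 + 0 > 0 && $2 + 0 > $4 + 0 { print $1 }'
}
```

Usage on the server would be along the lines of `xfs_quota -x -c 'report -g' /srv/test2 | over_hard_limit`. A group sitting exactly at its hard limit is not listed, only one that is beyond it.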
For completeness:
root@client# strace chgrp auxgroupxx testzero2 2>&1 | grep chown
fchownat(AT_FDCWD, "testzero2", -1, 187100007, 0) = -1 EPERM (Operation not permitted)
Root is squashed and not able to directly modify the ownership, as expected.
Now, is this non-enforcement of the group quota a bug in XFS, or
rather in the translation to the quota data as NFS sees it? A bug in
NFS in enforcement? But only when serving XFS? I did not check ZFS over
NFS yet, but with ext4, there is at least one example where it works as
expected. I also checked with BeeGFS (on top of ZFS) that it enforces
group quotas as I expect. I noted some confusion with grace periods …
and face more confusion myself on reading the xfs_quota(8) section on
the timer command.** Anyway, I waited until after any grace period in
the last example to avoid that complication.
Or is this possibly something fixed in a very recent kernel, by any
chance? Or a regression in 6.6? I am observing this in a production
setup where I cannot just freely swap out things.
Alrighty then,
Thomas
* Kernel NFSv4 over RDMA with sec=sys, if that matters.
** "Allows the quota enforcement timeout (i.e. the amount of time
allowed to pass before the soft limits are enforced as the hard
limits)" vs. "When setting any other individual timer by id or name,
the value is the number of seconds from now, at which time the hard
limits will be enforced. This allows extending the grace time of an
individual user who has exceeded soft limits." The hard limits being
enforced after grace does mean the soft limits becoming hard, right?
The hard limits are always enforced, without grace, are they not?
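The reading assumed in that footnote can be written down as a toy model. This is only a sketch of the usual soft/hard limit semantics, not a description of the XFS code; the function name alloc_allowed and its argument order are made up for illustration:

```shell
# alloc_allowed USED SOFT HARD REQUEST GRACE_EXPIRED
# Toy model of the usual soft/hard limit reading (an assumption, not the
# XFS implementation). All sizes are in the same unit, a limit of 0
# means "no limit", and GRACE_EXPIRED is 1 once the grace timer ran out.
alloc_allowed() {
    used=$1 soft=$2 hard=$3 request=$4 grace_expired=$5
    new=$((used + request))
    # Hard limit: refused at any time, grace or not.
    if [ "$hard" -gt 0 ] && [ "$new" -gt "$hard" ]; then
        echo deny; return 1
    fi
    # Soft limit: only refused after the grace timer has expired.
    if [ "$soft" -gt 0 ] && [ "$new" -gt "$soft" ] && [ "$grace_expired" -eq 1 ]; then
        echo deny; return 1
    fi
    echo allow
}
```

With the auxgroupxx numbers from B2 (used 0, soft 4, hard 4, request 102400) this model says deny regardless of the grace timer. That matches the local chgrp result and not the one seen over NFS.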
PS: We use group quotas also for individual quotas precisely to be able
to mix personal quotas and working group quotas in a meaningful manner.
PPS: I learned before that project quotas are also no solution if you
value enforcement, as people can move data into unrestricted project
IDs at will. I hoped that plain user/group quotas are enforced, also
over NFS.
--
Dr. Thomas Orgis
HPC @ Universität Hamburg
* Re: XFS group quota circumvention via NFS and chgrp
From: Dr. Thomas Orgis @ 2026-06-18 21:05 UTC
To: linux-xfs
On Tue, 16 Jun 2026 11:08:04 +0200,
"Dr. Thomas Orgis" <thomas.orgis@uni-hamburg.de> wrote:
> Dear Linux XFS folks,
>
> I noticed that xfs group quotas can be circumvented via NFS* on vanilla
> Kernel 6.6.x (slightly differing versions on client and server) as
> follows:
I was asked off-list to provide an easier reproducer. Let's try this:
1. Prepare/start a VM in the terminal.
dd if=/dev/zero of=xfs.img bs=1M count=500
wget -O debian-live.iso https://gitlab.com/api/v4/projects/74667529/packages/generic/debian-libre-live/13.3.0/debian-live-13.3.0-amd64-libre-standard.iso
isoinfo -R -i debian-live.iso -x /live/vmlinuz > vmlinuz
isoinfo -R -i debian-live.iso -x /live/initrd.img > initrd.img
qemu-system-x86_64 -nographic -enable-kvm -smp 2 -m 4G \
-cdrom debian-live.iso -drive index=0,driver=raw,file=xfs.img \
-kernel vmlinuz -initrd initrd.img -append "boot=live components console=ttyS0 ro"
Log in as user/live and run sudo su, or prepend sudo to every command
below, if you prefer.
2. In the VM, as root:
apt install -y nfs-kernel-server
mkfs.xfs -f /dev/sda
mount -o usrquota,grpquota /dev/sda /srv
xfs_quota -x -c 'limit -g bsoft=1m bhard=2m -d' /srv
echo "auxgrp:x:1001:user" >> /etc/group
mkdir -p /srv/share/user
chown user:user /srv/share/user
echo "/srv localhost(fsid=0,rw)" >> /etc/exports
echo "/srv/share localhost(fsid=1,rw)" >> /etc/exports
systemctl restart nfs-kernel-server
mount localhost:/share /mnt/
# First blob. Works.
su user -c 'dd if=/dev/zero of=/mnt/user/blob1 bs=2M count=1'
# Second blob. Fails:
su user -c 'dd if=/dev/zero of=/mnt/user/blob2 bs=2M count=1'
# Show the limited state.
xfs_quota -x -c report /srv
# Move one blob to the other group, filling its quota.
su user -c 'chgrp auxgrp /mnt/user/blob1'
# Now write the second blob.
su user -c 'dd if=/dev/zero of=/mnt/user/blob2 bs=2M count=1'
# Magic: Move that away, getting out of auxgrp quota.
su user -c 'chgrp auxgrp /mnt/user/blob2'
xfs_quota -x -c report /srv
# For fun: Just fill the disk.
# side fact: I froze qemu with 100% CPU load when having a sync
# command in the loop instead of sleep. It's one of those days where
# I step onto bugs everywhere.
n=3
while su user -c "dd if=/dev/zero of=/mnt/user/blob$n bs=2M count=1" &&
sleep 0.1 &&
su user -c "chgrp auxgrp /mnt/user/blob$n"
do
n=$((n+1))
done
xfs_quota -x -c report /srv
Result after the individual writes:
User quota on /srv (/dev/sda)
                               Blocks
User ID          Used       Soft       Hard    Warn/Grace
---------- --------------------------------------------------
root                0          0          0     00 [--------]
user             4096          0          0     00 [--------]
Group quota on /srv (/dev/sda)
                               Blocks
Group ID         Used       Soft       Hard    Warn/Grace
---------- --------------------------------------------------
root                0       1024       2048     00 [--------]
user                0       1024       2048     00 [--------]
auxgrp           4096       1024       2048     00 [--none--]
This shouldn't be possible. Should it? After the writing loop, I get
2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.0206776 s, 101 MB/s
1+0 records in
1+0 records out
2097152 bytes (2.1 MB, 2.0 MiB) copied, 0.0204505 s, 103 MB/s
dd: closing output file '/mnt/user/blob14': Disk quota exceeded
User quota on /srv (/dev/sda)
                               Blocks
User ID          Used       Soft       Hard    Warn/Grace
---------- --------------------------------------------------
root                0          0          0     00 [--------]
user            28612          0          0     00 [--------]
Group quota on /srv (/dev/sda)
                               Blocks
Group ID         Used       Soft       Hard    Warn/Grace
---------- --------------------------------------------------
root                0       1024       2048     00 [--------]
user             1988       1024       2048     00 [7 days]
auxgrp          26624       1024       2048     00 [--none--]
It quite consistently stops at blob14. I don't quite get why, though.
Inode size does not count into block quota for xfs, right? Why be able
to write 13 blobs, but not 14, when the quota is filled at one already?
Different questions. The main question is: Why am I able to move data
beyond the group quota?
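The final report is at least internally consistent. A quick arithmetic check, assuming the Used column counts 1 KiB blocks and that blob1 to blob13 all ended up in auxgrp (variable names are only for illustration):

```shell
# Figures taken from the reports above. Assumes Used is in 1 KiB
# blocks, which matches one 2 MiB blob showing up as 2048.
blob_kib=$((2 * 1024))       # one blob written with bs=2M count=1
moved_blobs=13               # assumed: blob1 .. blob13 moved to auxgrp
hard_limit_kib=2048          # bhard=2m

auxgrp_used=$((moved_blobs * blob_kib))
user_total=$((auxgrp_used + 1988))   # plus the partial blob14 in group user

echo "auxgrp used: $auxgrp_used KiB"
echo "user total:  $user_total KiB"
echo "auxgrp is at $((auxgrp_used / hard_limit_kib))x its hard limit"
```

Both results agree with the report: 26624 for group auxgrp and 28612 for user ID user, which puts auxgrp at 13 times its 2 MiB hard limit.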
Alrighty then,
Thomas
--
Dr. Thomas Orgis
HPC @ Universität Hamburg