* [PATCH v3 0/2] User namespace aware fanotify
From: Amir Goldstein @ 2025-05-16 19:28 UTC
To: Jan Kara; +Cc: Miklos Szeredi, Christian Brauner, linux-fsdevel
Jan,

Considering that the review discussion on v2 [1] did not yet converge
and considering that the merge window is very close, I realized
there is a way that we can simplify the controversial part.

There are two main use cases to allow setting marks inside user ns:

1. Christian added support for open_by_handle_at(2) to admin inside
   userns, which makes watching FS_USERNS_MOUNT sb more useful.
2. The mount events added by Miklos would be very useful also inside
   userns.

The rule for watching mntns inside user ns is pretty obvious and so
is the rule for watching an sb inside user ns.

The complexity discussed in review of v2 revolved around the more
complicated rules for watching fs events on a specific mount inside
user ns.

My realization is that watching fs events on a mount inside user ns
is a less interesting use case and it is much easier to apply the same
obvious rules as for watching an sb inside user ns and discuss
relaxing them later if there is any interesting use case for that.

mntns watch inside user ns was tested with the mount-notify_test_ns
selftest [2]. sb/mount watches inside user ns were tested manually
with fsnotifywatch -S and -M with some changes to inotify-tools [3].

Thanks,
Amir.

Changes since v2:
- selftest merged to Christian's tree
- Change mount mark to require capable sb user ns
- Remove incorrect reference to FS_USERNS_MOUNT in comments (Miklos)
- Avoid unneeded type casting to mntns (Miklos)

Changes since v1:
- Split cleanup patch (Jan)
- Logic simplified a bit
- Add support for mntns marks inside userns

[1] https://lore.kernel.org/linux-fsdevel/20250419100657.2654744-1-amir73il@gmail.com/
[2] https://lore.kernel.org/linux-fsdevel/20250509133240.529330-1-amir73il@gmail.com/
[3] https://github.com/amir73il/inotify-tools/commits/fanotify_userns/

Amir Goldstein (2):
fanotify: remove redundant permission checks
fanotify: support watching filesystems and mounts inside userns
fs/notify/fanotify/fanotify.c | 1 +
fs/notify/fanotify/fanotify_user.c | 50 +++++++++++++++++-------------
include/linux/fanotify.h | 5 ++-
include/linux/fsnotify_backend.h | 1 +
4 files changed, 33 insertions(+), 24 deletions(-)
--
2.34.1
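
The rule described in the cover letter can be sketched as a small userspace
model. This is not kernel code: struct model_userns, struct model_task,
model_ns_capable() and model_mark_allowed() are hypothetical names invented
for this illustration, and the capability walk is a simplification of
ns_capable() (it ignores, for example, the rule that the owner of a child
user ns is capable in it). It only shows that sb, mount and mntns marks need
CAP_SYS_ADMIN both in the user ns of the group and in the user ns of the
watched object, while inode marks are not subject to either check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct user_namespace: only the parent link. */
struct model_userns {
	const struct model_userns *parent;
};

/* Simplified stand-in for a task's credentials. */
struct model_task {
	const struct model_userns *ns;	/* user ns the task lives in */
	bool cap_sys_admin;		/* CAP_SYS_ADMIN effective in that ns */
};

enum model_mark_type { MARK_INODE, MARK_MOUNT, MARK_SB, MARK_MNTNS };

/*
 * Simplified model of ns_capable(target, CAP_SYS_ADMIN): the task is
 * capable if target is the task's own user ns or one of its descendants
 * and the task holds the capability in its own user ns.
 */
bool model_ns_capable(const struct model_task *t,
		      const struct model_userns *target)
{
	assert(t != NULL);
	for (const struct model_userns *ns = target; ns; ns = ns->parent)
		if (ns == t->ns)
			return t->cap_sys_admin;
	return false;
}

/*
 * obj_ns is the user ns of the sb (for sb and mount marks) or of the
 * mntns (for mntns marks). It is unused for inode marks.
 */
bool model_mark_allowed(const struct model_task *t, enum model_mark_type type,
			const struct model_userns *group_ns,
			const struct model_userns *obj_ns)
{
	if (type == MARK_INODE)
		return true;
	return model_ns_capable(t, group_ns) && model_ns_capable(t, obj_ns);
}
```

In this model, a task that is admin only inside a child user ns may watch an
sb owned by that child user ns, but not an sb owned by the initial user ns,
and a mount mark follows the same rule as an sb mark.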
* [PATCH v3 1/2] fanotify: remove redundant permission checks
From: Amir Goldstein @ 2025-05-16 19:28 UTC
To: Jan Kara; +Cc: Miklos Szeredi, Christian Brauner, linux-fsdevel
FAN_UNLIMITED_QUEUE and FAN_UNLIMITED_MARKS flags are already checked
as part of the CAP_SYS_ADMIN check for any FANOTIFY_ADMIN_INIT_FLAGS.
Remove the individual CAP_SYS_ADMIN checks for these flags.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/notify/fanotify/fanotify_user.c | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 87f861e9004f..471c57832357 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1334,6 +1334,7 @@ static struct fsnotify_mark *fanotify_add_new_mark(struct fsnotify_group *group,
* A group with FAN_UNLIMITED_MARKS does not contribute to mark count
* in the limited groups account.
*/
+ BUILD_BUG_ON(!(FANOTIFY_ADMIN_INIT_FLAGS & FAN_UNLIMITED_MARKS));
if (!FAN_GROUP_FLAG(group, FAN_UNLIMITED_MARKS) &&
!inc_ucount(ucounts->ns, ucounts->uid, UCOUNT_FANOTIFY_MARKS))
return ERR_PTR(-ENOSPC);
@@ -1637,21 +1638,13 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
goto out_destroy_group;
}
+ BUILD_BUG_ON(!(FANOTIFY_ADMIN_INIT_FLAGS & FAN_UNLIMITED_QUEUE));
if (flags & FAN_UNLIMITED_QUEUE) {
- fd = -EPERM;
- if (!capable(CAP_SYS_ADMIN))
- goto out_destroy_group;
group->max_events = UINT_MAX;
} else {
group->max_events = fanotify_max_queued_events;
}
- if (flags & FAN_UNLIMITED_MARKS) {
- fd = -EPERM;
- if (!capable(CAP_SYS_ADMIN))
- goto out_destroy_group;
- }
-
if (flags & FAN_ENABLE_AUDIT) {
fd = -EPERM;
if (!capable(CAP_AUDIT_WRITE))
--
2.34.1
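
The redundancy argument of this patch can be illustrated with a small
userspace model. Everything here is an assumption made for the sketch: the
MODEL_* bit values are arbitrary and are not the kernel's flag values,
model_early_check() only mirrors the shape of the FANOTIFY_ADMIN_INIT_FLAGS
test (it omits the separate requirement on reporting flags), and
model_init_old() / model_init_new() are hypothetical names. The
static_assert lines play the role of the BUILD_BUG_ON() lines added by the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Arbitrary bit values for this sketch, not the kernel's. */
enum {
	MODEL_UNLIMITED_QUEUE = 1 << 0,
	MODEL_UNLIMITED_MARKS = 1 << 1,
	MODEL_REPORT_FID      = 1 << 2,
};

#define MODEL_ADMIN_INIT_FLAGS (MODEL_UNLIMITED_QUEUE | MODEL_UNLIMITED_MARKS)

/* Compile-time guarantee that both flags stay covered by the admin mask. */
static_assert(MODEL_ADMIN_INIT_FLAGS & MODEL_UNLIMITED_QUEUE,
	      "unlimited queue must be an admin flag");
static_assert(MODEL_ADMIN_INIT_FLAGS & MODEL_UNLIMITED_MARKS,
	      "unlimited marks must be an admin flag");

/* Early check: a non-admin caller may not pass any admin flag. */
bool model_early_check(unsigned int flags, bool admin)
{
	return admin || !(flags & MODEL_ADMIN_INIT_FLAGS);
}

/* Before the patch: early check followed by per-flag checks. */
bool model_init_old(unsigned int flags, bool admin)
{
	if (!model_early_check(flags, admin))
		return false;
	if ((flags & MODEL_UNLIMITED_QUEUE) && !admin)
		return false;
	if ((flags & MODEL_UNLIMITED_MARKS) && !admin)
		return false;
	return true;
}

/* After the patch: the early check alone decides. */
bool model_init_new(unsigned int flags, bool admin)
{
	return model_early_check(flags, admin);
}
```

Because both flags are part of the admin mask, the per-flag checks in
model_init_old() can never fail once the early check has passed, so the two
functions agree for every combination of flags and privilege.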
* [PATCH v3 2/2] fanotify: support watching filesystems and mounts inside userns
From: Amir Goldstein @ 2025-05-16 19:28 UTC
To: Jan Kara; +Cc: Miklos Szeredi, Christian Brauner, linux-fsdevel
An unprivileged user is allowed to create an fanotify group and add
inode marks, but not filesystem, mntns and mount marks.
Add limited support for setting up filesystem, mntns and mount marks by
an unprivileged user under the following conditions:
1. User has CAP_SYS_ADMIN in the user ns where the group was created
2.a. User has CAP_SYS_ADMIN in the user ns where the sb was created
OR (in case of setting up a mntns mark)
2.b. User has CAP_SYS_ADMIN in the user ns associated with the mntns
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
fs/notify/fanotify/fanotify.c | 1 +
fs/notify/fanotify/fanotify_user.c | 39 +++++++++++++++++++++---------
include/linux/fanotify.h | 5 ++--
include/linux/fsnotify_backend.h | 1 +
4 files changed, 31 insertions(+), 15 deletions(-)
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 6d386080faf2..060d9bee34bd 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -1009,6 +1009,7 @@ static int fanotify_handle_event(struct fsnotify_group *group, u32 mask,
static void fanotify_free_group_priv(struct fsnotify_group *group)
{
+ put_user_ns(group->user_ns);
kfree(group->fanotify_data.merge_hash);
if (group->fanotify_data.ucounts)
dec_ucount(group->fanotify_data.ucounts,
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 471c57832357..b192ee068a7a 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1499,6 +1499,7 @@ static struct hlist_head *fanotify_alloc_merge_hash(void)
/* fanotify syscalls */
SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
{
+ struct user_namespace *user_ns = current_user_ns();
struct fsnotify_group *group;
int f_flags, fd;
unsigned int fid_mode = flags & FANOTIFY_FID_BITS;
@@ -1513,10 +1514,11 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
/*
* An unprivileged user can setup an fanotify group with
* limited functionality - an unprivileged group is limited to
- * notification events with file handles and it cannot use
- * unlimited queue/marks.
+ * notification events with file handles or mount ids and it
+ * cannot use unlimited queue/marks.
*/
- if ((flags & FANOTIFY_ADMIN_INIT_FLAGS) || !fid_mode)
+ if ((flags & FANOTIFY_ADMIN_INIT_FLAGS) ||
+ !(flags & (FANOTIFY_FID_BITS | FAN_REPORT_MNT)))
return -EPERM;
/*
@@ -1595,8 +1597,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
}
/* Enforce groups limits per user in all containing user ns */
- group->fanotify_data.ucounts = inc_ucount(current_user_ns(),
- current_euid(),
+ group->fanotify_data.ucounts = inc_ucount(user_ns, current_euid(),
UCOUNT_FANOTIFY_GROUPS);
if (!group->fanotify_data.ucounts) {
fd = -EMFILE;
@@ -1605,6 +1606,7 @@ SYSCALL_DEFINE2(fanotify_init, unsigned int, flags, unsigned int, event_f_flags)
group->fanotify_data.flags = flags | internal_flags;
group->memcg = get_mem_cgroup_from_mm(current->mm);
+ group->user_ns = get_user_ns(user_ns);
group->fanotify_data.merge_hash = fanotify_alloc_merge_hash();
if (!group->fanotify_data.merge_hash) {
@@ -1804,6 +1806,8 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
struct fsnotify_group *group;
struct path path;
struct fan_fsid __fsid, *fsid = NULL;
+ struct user_namespace *user_ns = NULL;
+ struct mnt_namespace *mntns;
u32 valid_mask = FANOTIFY_EVENTS | FANOTIFY_EVENT_FLAGS;
unsigned int mark_type = flags & FANOTIFY_MARK_TYPE_BITS;
unsigned int mark_cmd = flags & FANOTIFY_MARK_CMD_BITS;
@@ -1897,12 +1901,10 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
}
/*
- * An unprivileged user is not allowed to setup mount nor filesystem
- * marks. This also includes setting up such marks by a group that
- * was initialized by an unprivileged user.
+ * A user is allowed to setup sb/mount/mntns marks only if it is
+ * capable in the user ns where the group was created.
*/
- if ((!capable(CAP_SYS_ADMIN) ||
- FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV)) &&
+ if (!ns_capable(group->user_ns, CAP_SYS_ADMIN) &&
mark_type != FAN_MARK_INODE)
return -EPERM;
@@ -1981,18 +1983,31 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
fsid = &__fsid;
}
- /* inode held in place by reference to path; group by fget on fd */
+ /*
+ * In addition to being capable in the user ns where group was created,
+ * the user also needs to be capable in the user ns associated with
+ * the filesystem or in the user ns associated with the mntns
+ * (when marking mntns).
+ */
if (obj_type == FSNOTIFY_OBJ_TYPE_INODE) {
inode = path.dentry->d_inode;
obj = inode;
} else if (obj_type == FSNOTIFY_OBJ_TYPE_VFSMOUNT) {
+ user_ns = path.mnt->mnt_sb->s_user_ns;
obj = path.mnt;
} else if (obj_type == FSNOTIFY_OBJ_TYPE_SB) {
+ user_ns = path.mnt->mnt_sb->s_user_ns;
obj = path.mnt->mnt_sb;
} else if (obj_type == FSNOTIFY_OBJ_TYPE_MNTNS) {
- obj = mnt_ns_from_dentry(path.dentry);
+ mntns = mnt_ns_from_dentry(path.dentry);
+ user_ns = mntns->user_ns;
+ obj = mntns;
}
+ ret = -EPERM;
+ if (user_ns && !ns_capable(user_ns, CAP_SYS_ADMIN))
+ goto path_put_and_out;
+
ret = -EINVAL;
if (!obj)
goto path_put_and_out;
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 3c817dc6292e..879cff5eccd4 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -38,8 +38,7 @@
FAN_REPORT_PIDFD | \
FAN_REPORT_FD_ERROR | \
FAN_UNLIMITED_QUEUE | \
- FAN_UNLIMITED_MARKS | \
- FAN_REPORT_MNT)
+ FAN_UNLIMITED_MARKS)
/*
* fanotify_init() flags that are allowed for user without CAP_SYS_ADMIN.
@@ -48,7 +47,7 @@
* so one of the flags for reporting file handles is required.
*/
#define FANOTIFY_USER_INIT_FLAGS (FAN_CLASS_NOTIF | \
- FANOTIFY_FID_BITS | \
+ FANOTIFY_FID_BITS | FAN_REPORT_MNT | \
FAN_CLOEXEC | FAN_NONBLOCK)
#define FANOTIFY_INIT_FLAGS (FANOTIFY_ADMIN_INIT_FLAGS | \
diff --git a/include/linux/fsnotify_backend.h b/include/linux/fsnotify_backend.h
index fc27b53c58c2..d4034ddaf392 100644
--- a/include/linux/fsnotify_backend.h
+++ b/include/linux/fsnotify_backend.h
@@ -250,6 +250,7 @@ struct fsnotify_group {
* full */
struct mem_cgroup *memcg; /* memcg to charge allocations */
+ struct user_namespace *user_ns; /* user ns where group was created */
/* groups can define private fields here or use the void *private */
union {
--
2.34.1
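
From userspace, the effect of this patch could be probed with something like
the following sketch. It uses only the existing fanotify_init(2) and
fanotify_mark(2) calls; probe_sb_watch() is a hypothetical helper name, and
the choice of FAN_REPORT_FID with FAN_CREATE | FAN_DELETE is just one
combination that an unprivileged group is allowed to use. The result depends
on the running kernel and on the caller's namespaces, so no particular
outcome is claimed here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/fanotify.h>
#include <unistd.h>

/*
 * Try to set up a filesystem (sb) mark on the filesystem containing path.
 * Returns 0 on success or a negative errno value.
 */
int probe_sb_watch(const char *path)
{
	int fd, err = 0;

	assert(path != NULL);

	/* Reporting file handles keeps the group usable without privileges. */
	fd = fanotify_init(FAN_CLASS_NOTIF | FAN_REPORT_FID | FAN_CLOEXEC,
			   O_RDONLY);
	if (fd < 0)
		return -errno;

	/* This is the call subject to the user ns capability checks. */
	if (fanotify_mark(fd, FAN_MARK_ADD | FAN_MARK_FILESYSTEM,
			  FAN_CREATE | FAN_DELETE, AT_FDCWD, path) < 0)
		err = -errno;

	close(fd);
	return err;
}
```

Per the commit message, a caller without CAP_SYS_ADMIN in the user ns of the
group, or in the user ns where the sb was created, should see -EPERM from
the mark step. A caller that is admin in a user ns which also owns the sb
(for example a tmpfs mounted inside that user ns) is the case this patch is
meant to allow.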
* Re: [PATCH v3 0/2] User namespace aware fanotify
From: Jan Kara @ 2025-05-19 20:56 UTC
To: Amir Goldstein; +Cc: Jan Kara, Miklos Szeredi, Christian Brauner, linux-fsdevel
On Fri 16-05-25 21:28:01, Amir Goldstein wrote:
> Jan,
>
> Considering that the review discussion on v2 [1] did not yet converge
> and considering that the merge window is very close, I realized
> there is a way that we can simplify the controversial part.
>
> There are two main use cases to allow setting marks inside user ns:
>
> 1. Christian added support for open_by_handle_at(2) to admin inside
> userns, which makes watching FS_USERNS_MOUNT sb more useful.
> 2. The mount events added by Miklos would be very useful also inside
> userns.
>
> The rule for watching mntns inside user ns is pretty obvious and so
> is the rule for watching an sb inside user ns.
>
> The complexity discussed in review of v2 revolved around the more
> complicated rules for watching fs events on a specific mount inside
> user ns.
>
> My realization is that watching fs events on a mount inside user ns
> is a less interesting use case and it is much easier to apply the same
> obvious rules as for watching an sb inside user ns and discuss
> relaxing them later if there is any interesting use case for that.
>
> mntns watch inside user ns was tested with the mount-notify_test_ns
> selftest [2]. sb/mount watches inside user ns were tested manually
> with fsnotifywatch -S and -M with some changes to inotify-tools [3].
>
> Thanks,
> Amir.
Thanks! Patches look good to me and they seem obvious enough now that I've
just picked them up.
Honza
>
> Changes since v2:
> - selftest merged to Christian's tree
> - Change mount mark to require capable sb user ns
> - Remove incorrect reference to FS_USERNS_MOUNT in comments (Miklos)
> - Avoid unneeded type casting to mntns (Miklos)
>
> Changes since v1:
> - Split cleanup patch (Jan)
> - Logic simplified a bit
> - Add support for mntns marks inside userns
>
> [1] https://lore.kernel.org/linux-fsdevel/20250419100657.2654744-1-amir73il@gmail.com/
> [2] https://lore.kernel.org/linux-fsdevel/20250509133240.529330-1-amir73il@gmail.com/
> [3] https://github.com/amir73il/inotify-tools/commits/fanotify_userns/
>
> Amir Goldstein (2):
> fanotify: remove redundant permission checks
> fanotify: support watching filesystems and mounts inside userns
>
> fs/notify/fanotify/fanotify.c | 1 +
> fs/notify/fanotify/fanotify_user.c | 50 +++++++++++++++++-------------
> include/linux/fanotify.h | 5 ++-
> include/linux/fsnotify_backend.h | 1 +
> 4 files changed, 33 insertions(+), 24 deletions(-)
>
> --
> 2.34.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR