* [PATCH 0/2] fuse: allow FUSE_SYNCFS for privileged userspace servers
@ 2026-06-16 15:19 Jimmy Zuber
2026-06-16 15:19 ` [PATCH 1/2] " Jimmy Zuber
2026-06-16 15:19 ` [PATCH 2/2] selftests/fuse: add test for FUSE_HAS_SYNCFS privilege gating Jimmy Zuber
0 siblings, 2 replies; 3+ messages in thread
From: Jimmy Zuber @ 2026-06-16 15:19 UTC (permalink / raw)
To: Miklos Szeredi
Cc: fuse-devel, linux-fsdevel, linux-api, linux-kernel, Shuah Khan,
linux-kselftest
FUSE_SYNCFS (propagating syncfs()/sync() to the server) is currently
enabled only for virtiofs and fuseblk, since an untrusted server can stall
sync(). But any FUSE filesystem may buffer data in the server that should
reach storage on sync(); the only thing that should gate propagation is
whether the mount was set up with host privilege. This series lets a plain
/dev/fuse server opt in via a new FUSE_HAS_SYNCFS INIT flag, honored only
for mounts owned by the initial user namespace. Patch 1 has the full
rationale and the security argument.
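
For illustration only, the gating rule described above reduces to a
two-input predicate. The sketch below is a userspace model, not the kernel
code: sketch_syncfs_allowed() and its boolean argument are hypothetical
names standing in for the fc->user_ns == &init_user_ns check, and the flag
value is the one proposed in patch 1, defined locally so the sketch builds
against unpatched headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit proposed in patch 1; defined locally (assumption: value stays 1ULL << 43). */
#define SKETCH_FUSE_HAS_SYNCFS (1ULL << 43)

/*
 * Hypothetical model of the gating rule: propagate FUSE_SYNCFS only when the
 * server opted in AND the mount is owned by the initial user namespace.
 */
static bool sketch_syncfs_allowed(uint64_t init_reply_flags,
				  bool owned_by_init_userns)
{
	return (init_reply_flags & SKETCH_FUSE_HAS_SYNCFS) &&
	       owned_by_init_userns;
}

/* Walk the three cases that the selftest in patch 2 covers. */
static void sketch_check_cases(void)
{
	/* privileged mount, server opts in: propagate */
	assert(sketch_syncfs_allowed(SKETCH_FUSE_HAS_SYNCFS, true));
	/* privileged mount, server does not opt in: do not propagate */
	assert(!sketch_syncfs_allowed(0, true));
	/* user-namespace mount, server opts in: still do not propagate */
	assert(!sketch_syncfs_allowed(SKETCH_FUSE_HAS_SYNCFS, false));
}
```

Neither input is sufficient on its own, which is what keeps existing
servers and unprivileged mounts on their current behavior.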

Patch 1: the kernel change (UAPI flag + gating in process_init_reply()).
Patch 2: a selftest that speaks the raw FUSE protocol over /dev/fuse, so
         it can withhold the flag and directly observe whether the
         FUSE_SYNCFS opcode is forwarded -- covering the privileged,
         opt-out, and unprivileged-userns cases.

A matching libfuse change (FUSE_CAP_SYNCFS negotiation) will be sent to the
libfuse project once the UAPI flag here is settled.
Testing: built and booted under QEMU (x86_64). The selftest passes all
three cases. A separate end-to-end check on a FUSE_WRITEBACK_CACHE mount
confirmed the point of the change: after write() the server had received 0
bytes (data dirty in the page cache), and after syncfs() it received the
full buffered payload followed by FUSE_SYNCFS -- i.e. syncfs() flushes
cached data to the server on a privileged mount.
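
For readers who want to repeat a check like the one above, the client side
only needs to open something on the mount and call syncfs() on it. This is
a minimal sketch, not the program used for the test run described here;
sketch_sync_filesystem_of() is a hypothetical helper name, and the path
passed to it is whatever mount point the server under test is using.

```c
#define _GNU_SOURCE	/* for syncfs() */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel to sync the filesystem that contains
 * @path. Returns 0 on success or a negative errno value on failure.
 */
static int sketch_sync_filesystem_of(const char *path)
{
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	/* On a mount with FUSE_SYNCFS enabled, this is forwarded to the server. */
	if (syncfs(fd) < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

Calling the helper on the mount point after a buffered write() is enough
to exercise the path; whether the server then sees FUSE_SYNCFS depends on
the gating in patch 1.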
Jimmy Zuber (2):
  fuse: allow FUSE_SYNCFS for privileged userspace servers
  selftests/fuse: add test for FUSE_HAS_SYNCFS privilege gating

 fs/fuse/inode.c                               |  16 +
 include/uapi/linux/fuse.h                     |  11 +-
 .../selftests/filesystems/fuse/.gitignore     |   1 +
 .../selftests/filesystems/fuse/Makefile       |   2 +-
 .../selftests/filesystems/fuse/test_syncfs.c  | 318 ++++++++++++++++++
 5 files changed, 346 insertions(+), 2 deletions(-)
 create mode 100644 tools/testing/selftests/filesystems/fuse/test_syncfs.c

base-commit: 7d87a5a284bb34edb3f4e7e312ef403b3385a7b7
--
2.50.1
* [PATCH 1/2] fuse: allow FUSE_SYNCFS for privileged userspace servers
From: Jimmy Zuber @ 2026-06-16 15:19 UTC (permalink / raw)
To: Miklos Szeredi
Cc: fuse-devel, linux-fsdevel, linux-api, linux-kernel, Shuah Khan,
linux-kselftest
Propagating syncfs()/sync() to a FUSE server via FUSE_SYNCFS lets the
server flush its own cached or intermediate state when userspace asks the
filesystem to sync. This is currently enabled only for virtiofs and
fuseblk, because an untrusted server can use it to stall sync()
indefinitely (see commit 2d82ab251ef0 ("virtiofs: propagate sync() to file
server"), and commit d3906d8f3cee ("fuse: enable FUSE_SYNCFS for all
fuseblk servers")). Both of those mount types require host privilege to
set up, so the server is trusted not to abuse it.
There is nothing virtiofs- or block-specific about wanting to handle
syncfs(), though. A plain /dev/fuse server is just as entitled to
participate in the sync() path -- so that data it has buffered reaches
stable storage when the user asks for it -- provided it is equally
trusted. The relevant trust boundary is whether the mount was set up with
host privilege.
Add an opt-in INIT flag, FUSE_HAS_SYNCFS, and enable propagation only when
both:
- the server sets FUSE_HAS_SYNCFS in its INIT reply, and
- the mount is owned by the initial user namespace
(fc->user_ns == &init_user_ns).
The user namespace check is the key restriction. A regular fuse mount is
mountable from a non-initial user namespace (FS_USERNS_MOUNT), where the
server is untrusted; the VFS already marks such mounts with
SB_I_UNTRUSTED_MOUNTER. Restricting FUSE_SYNCFS to init_user_ns mounts
preserves the original DoS protection for unprivileged servers in full,
while mirroring how virtiofs and fuseblk earn syncfs by construction
(neither is mountable from a non-initial user namespace).
The flag is only advertised to servers whose mount is owned by the initial
user namespace, so an unprivileged server is never invited to opt in; if
such a server sets the flag anyway, fuse_syncfs_enable() ignores it.
Signed-off-by: Jimmy Zuber <jamz@amazon.com>
Assisted-by: Claude:claude-opus-4-8 [Claude-Code]
---
 fs/fuse/inode.c           | 16 ++++++++++++++++
 include/uapi/linux/fuse.h | 11 ++++++++++-
 2 files changed, 26 insertions(+), 1 deletion(-)
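
Note for server authors (illustration only, not part of the patch): the new
bit sits above bit 31, so it travels in the flags2 word and is only valid
when FUSE_INIT_EXT is set in flags. The sketch below models how a server
might reassemble the offered flags and opt in only when the kernel offered
FUSE_HAS_SYNCFS. struct sketch_init_flags and the sketch_* helpers are
hypothetical stand-ins for the flags/flags2 members of the real
fuse_init_in and fuse_init_out structures; the constants are defined
locally so the example builds without the patched header.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_FUSE_INIT_EXT   (1ULL << 30) /* same bit as FUSE_INIT_EXT */
#define SKETCH_FUSE_HAS_SYNCFS (1ULL << 43) /* bit proposed by this patch */

/* Hypothetical stand-in for the two 32-bit flag words of an INIT message. */
struct sketch_init_flags {
	uint32_t flags;		/* bits 0..31 */
	uint32_t flags2;	/* bits 32..63, valid only with FUSE_INIT_EXT */
};

/* Reassemble the 64-bit flag set; flags2 is ignored without FUSE_INIT_EXT. */
static uint64_t sketch_join_flags(const struct sketch_init_flags *w)
{
	uint64_t flags = w->flags;

	if (flags & SKETCH_FUSE_INIT_EXT)
		flags |= (uint64_t)w->flags2 << 32;
	return flags;
}

/* Split a 64-bit flag set into the two wire words. */
static struct sketch_init_flags sketch_split_flags(uint64_t flags)
{
	struct sketch_init_flags w = {
		.flags = (uint32_t)flags,
		.flags2 = (uint32_t)(flags >> 32),
	};

	return w;
}

/* Server side: opt in only if the kernel offered the flag in its request. */
static struct sketch_init_flags
sketch_build_reply(const struct sketch_init_flags *offered)
{
	uint64_t reply = SKETCH_FUSE_INIT_EXT;

	assert(offered);
	if (sketch_join_flags(offered) & SKETCH_FUSE_HAS_SYNCFS)
		reply |= SKETCH_FUSE_HAS_SYNCFS;
	return sketch_split_flags(reply);
}
```

Echoing the flag only when it was offered is a conservative choice for the
sketch; with this patch the kernel would ignore the bit on an unprivileged
mount either way.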
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index d975073c6029..d0005a373729 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1266,6 +1266,16 @@ struct fuse_init_args {
struct fuse_mount *fm;
};
+/*
+ * A server can stall syncfs()/sync(), so only honor FUSE_HAS_SYNCFS for
+ * mounts owned by the initial user namespace, i.e. set up with host
+ * privilege (like virtiofs and fuseblk).
+ */
+static bool fuse_syncfs_enable(struct fuse_conn *fc, u64 flags)
+{
+ return (flags & FUSE_HAS_SYNCFS) && fc->user_ns == &init_user_ns;
+}
+
static void process_init_reply(struct fuse_args *args, int error)
{
struct fuse_init_args *ia = container_of(args, typeof(*ia), args);
@@ -1406,6 +1416,9 @@ static void process_init_reply(struct fuse_args *args, int error)
if (flags & FUSE_REQUEST_TIMEOUT)
timeout = arg->request_timeout;
+
+ if (fuse_syncfs_enable(fc, flags))
+ fc->sync_fs = 1;
} else {
ra_pages = fc->max_read / PAGE_SIZE;
fc->no_lock = 1;
@@ -1473,6 +1486,9 @@ static struct fuse_init_args *fuse_new_init(struct fuse_mount *fm)
flags |= FUSE_SUBMOUNTS;
if (IS_ENABLED(CONFIG_FUSE_PASSTHROUGH))
flags |= FUSE_PASSTHROUGH;
+ /* Only offered to host-privileged mounts; see fuse_syncfs_enable(). */
+ if (fm->fc->user_ns == &init_user_ns)
+ flags |= FUSE_HAS_SYNCFS;
/*
* This is just an information flag for fuse server. No need to check
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index c13e1f9a2f12..de1002063ca2 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -240,6 +240,9 @@
* - add FUSE_COPY_FILE_RANGE_64
* - add struct fuse_copy_file_range_out
* - add FUSE_NOTIFY_PRUNE
+ *
+ * 7.46
+ * - add FUSE_HAS_SYNCFS opt-in flag for privileged userspace servers
*/
#ifndef _LINUX_FUSE_H
@@ -275,7 +278,7 @@
#define FUSE_KERNEL_VERSION 7
/** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 45
+#define FUSE_KERNEL_MINOR_VERSION 46
/** The node ID of the root inode */
#define FUSE_ROOT_ID 1
@@ -448,6 +451,11 @@ struct fuse_file_lock {
* FUSE_OVER_IO_URING: Indicate that client supports io-uring
* FUSE_REQUEST_TIMEOUT: kernel supports timing out requests.
* init_out.request_timeout contains the timeout (in secs)
+ * FUSE_HAS_SYNCFS: server requests that syncfs()/sync() be propagated as
+ * FUSE_SYNCFS requests. Only honored by the kernel for mounts
+ * owned by the initial user namespace (i.e. set up with real
+ * host privilege), since an untrusted server can use this to
+ * stall sync(). Unprivileged (user namespace) mounts ignore it.
*/
#define FUSE_ASYNC_READ (1 << 0)
#define FUSE_POSIX_LOCKS (1 << 1)
@@ -495,6 +503,7 @@ struct fuse_file_lock {
#define FUSE_ALLOW_IDMAP (1ULL << 40)
#define FUSE_OVER_IO_URING (1ULL << 41)
#define FUSE_REQUEST_TIMEOUT (1ULL << 42)
+#define FUSE_HAS_SYNCFS (1ULL << 43)
/**
* CUSE INIT request/reply flags
--
2.50.1
* [PATCH 2/2] selftests/fuse: add test for FUSE_HAS_SYNCFS privilege gating
From: Jimmy Zuber @ 2026-06-16 15:19 UTC (permalink / raw)
To: Miklos Szeredi
Cc: fuse-devel, linux-fsdevel, linux-api, linux-kernel, Shuah Khan,
linux-kselftest
Add a selftest that talks the raw FUSE protocol over /dev/fuse (rather
than via libfuse, which negotiates INIT internally) so it can both choose
whether to advertise FUSE_HAS_SYNCFS and directly observe whether a
FUSE_SYNCFS opcode is forwarded by the kernel.
Three cases are covered:

  T1: host-root mount, server sets FUSE_HAS_SYNCFS
      -> FUSE_SYNCFS must reach the server.
  T2: host-root mount, server does not opt in
      -> FUSE_SYNCFS must not be sent (back-compat).
  T3: unprivileged user-namespace mount, server sets FUSE_HAS_SYNCFS
      -> kernel must still withhold FUSE_SYNCFS.

T3 requires CONFIG_USER_NS and the ability to create an unprivileged
user-namespace mount; it is skipped otherwise.
Signed-off-by: Jimmy Zuber <jamz@amazon.com>
Assisted-by: Claude:claude-opus-4-8 [Claude-Code]
---
 .../selftests/filesystems/fuse/.gitignore     |   1 +
 .../selftests/filesystems/fuse/Makefile       |   2 +-
 .../selftests/filesystems/fuse/test_syncfs.c  | 318 ++++++++++++++++++
 3 files changed, 320 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/filesystems/fuse/test_syncfs.c
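
Note on the observation mechanism (illustration only, not part of the
patch): the test decides whether FUSE_SYNCFS was forwarded by having the
server write to an eventfd and the parent check it with a zero-timeout
poll(). The sketch below isolates that pattern; sketch_signal() and
sketch_was_signalled() are hypothetical helper names, not functions from
test_syncfs.c.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: record that the event of interest happened. */
static int sketch_signal(int evfd)
{
	uint64_t one = 1;

	assert(evfd >= 0);
	/* An eventfd write adds the 8-byte value to the kernel-side counter. */
	return write(evfd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical helper: non-blocking check whether the event was recorded. */
static bool sketch_was_signalled(int evfd)
{
	struct pollfd pfd = { .fd = evfd, .events = POLLIN };

	/* Timeout 0: report the current state without waiting. */
	return poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN);
}
```

A zero timeout is only safe when the signal is ordered before the check by
some other means. In the test that ordering comes from syncfs() itself: the
server writes the eventfd before it replies, and syncfs() does not return
until the reply arrives.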
diff --git a/tools/testing/selftests/filesystems/fuse/.gitignore b/tools/testing/selftests/filesystems/fuse/.gitignore
index 3e72e742d08e..4e92f363c74a 100644
--- a/tools/testing/selftests/filesystems/fuse/.gitignore
+++ b/tools/testing/selftests/filesystems/fuse/.gitignore
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0-only
fuse_mnt
fusectl_test
+test_syncfs
diff --git a/tools/testing/selftests/filesystems/fuse/Makefile b/tools/testing/selftests/filesystems/fuse/Makefile
index 612aad69a93a..cbba01635226 100644
--- a/tools/testing/selftests/filesystems/fuse/Makefile
+++ b/tools/testing/selftests/filesystems/fuse/Makefile
@@ -2,7 +2,7 @@
CFLAGS += -Wall -O2 -g $(KHDR_INCLUDES)
-TEST_GEN_PROGS := fusectl_test
+TEST_GEN_PROGS := fusectl_test test_syncfs
TEST_GEN_FILES := fuse_mnt
include ../../lib.mk
diff --git a/tools/testing/selftests/filesystems/fuse/test_syncfs.c b/tools/testing/selftests/filesystems/fuse/test_syncfs.c
new file mode 100644
index 000000000000..c00375ffeaea
--- /dev/null
+++ b/tools/testing/selftests/filesystems/fuse/test_syncfs.c
@@ -0,0 +1,318 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Test that FUSE_SYNCFS is propagated to a userspace server only when the
+ * server opts in with FUSE_HAS_SYNCFS *and* the mount is owned by the initial
+ * user namespace (i.e. set up with real host privilege).
+ *
+ * Unlike the libfuse-based selftests, this talks the raw FUSE wire protocol
+ * over /dev/fuse so it can (a) choose whether to advertise FUSE_HAS_SYNCFS in
+ * the INIT reply and (b) observe directly whether a FUSE_SYNCFS opcode arrives.
+ */
+#define _GNU_SOURCE
+#include <errno.h>
+#include <fcntl.h>
+#include <poll.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <sys/eventfd.h>
+#include <sys/mount.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <sys/uio.h>
+#include <sys/wait.h>
+#include <linux/fuse.h>
+
+#include "../../kselftest.h"
+
+#define FUSE_ROOT_ID 1
+
+/*
+ * eventfd the server child writes once when it receives FUSE_SYNCFS; the
+ * parent poll()s it to observe (or rule out) propagation.
+ */
+static int syncfs_evfd;
+
+static void reply(int fd, uint64_t unique, int error, void *data, size_t len)
+{
+ struct fuse_out_header oh = {
+ .len = sizeof(oh) + len,
+ .error = error,
+ .unique = unique,
+ };
+ struct iovec iov[2] = {
+ { &oh, sizeof(oh) },
+ { data, len },
+ };
+
+ if (writev(fd, iov, data ? 2 : 1) < 0)
+ ksft_print_msg("server writev failed: %s\n", strerror(errno));
+}
+
+static void fill_attr(struct fuse_attr *a, uint64_t ino, uint32_t mode,
+ uint32_t nlink)
+{
+ memset(a, 0, sizeof(*a));
+ a->ino = ino;
+ a->mode = mode;
+ a->nlink = nlink;
+ a->blksize = 4096;
+}
+
+/*
+ * Minimal FUSE server. Advertises FUSE_HAS_SYNCFS in its INIT reply iff
+ * @advertise is set. Signals syncfs_evfd when a FUSE_SYNCFS opcode arrives.
+ */
+#define SERVER_MAX_WRITE 65536
+static void run_server(int fd, int advertise)
+{
+ /*
+ * The kernel rejects reads (EINVAL) whose buffer is smaller than
+ * max_write + header, so size generously for the max_write we
+ * advertise in the INIT reply below.
+ */
+ static char buf[SERVER_MAX_WRITE + 4096];
+
+ for (;;) {
+ ssize_t n = read(fd, buf, sizeof(buf));
+ struct fuse_in_header *ih = (void *)buf;
+
+ if (n < 0) {
+ if (errno == EINTR || errno == EAGAIN)
+ continue;
+ return; /* device closed on unmount/abort */
+ }
+ if (n < (ssize_t)sizeof(*ih))
+ continue;
+
+ switch (ih->opcode) {
+ case FUSE_INIT: {
+ struct fuse_init_in *in = (void *)(ih + 1);
+ struct fuse_init_out out = {0};
+ uint64_t flags = FUSE_INIT_EXT;
+
+ out.major = FUSE_KERNEL_VERSION;
+ out.minor = FUSE_KERNEL_MINOR_VERSION;
+ out.max_readahead = in->max_readahead;
+ out.max_write = SERVER_MAX_WRITE;
+ out.max_background = 16;
+ out.congestion_threshold = 12;
+ if (advertise)
+ flags |= FUSE_HAS_SYNCFS;
+ out.flags = flags;
+ out.flags2 = flags >> 32;
+ reply(fd, ih->unique, 0, &out, sizeof(out));
+ break;
+ }
+ case FUSE_GETATTR: {
+ struct fuse_attr_out out = {0};
+
+ fill_attr(&out.attr, FUSE_ROOT_ID, S_IFDIR | 0755, 2);
+ reply(fd, ih->unique, 0, &out, sizeof(out));
+ break;
+ }
+ case FUSE_SYNCFS: {
+ uint64_t one = 1;
+
+ if (write(syncfs_evfd, &one, sizeof(one)) < 0)
+ ksft_print_msg("server eventfd write failed: %s\n",
+ strerror(errno));
+ reply(fd, ih->unique, 0, NULL, 0);
+ break;
+ }
+ default:
+ /*
+ * Anything else (e.g. OPENDIR from opening the mount
+ * root) is not needed to drive this test; -ENOSYS lets
+ * the kernel proceed.
+ */
+ reply(fd, ih->unique, -ENOSYS, NULL, 0);
+ break;
+ }
+ }
+}
+
+/*
+ * Mount a fuse fs backed by a forked server, issue syncfs(), and report
+ * whether the server observed FUSE_SYNCFS. Returns 0 on success, -1 if the
+ * environment could not support the test (caller should skip).
+ */
+static int do_mount_and_syncfs(const char *mnt, int advertise, int *seen)
+{
+ struct pollfd pfd = { .events = POLLIN };
+ char opts[256];
+ int fd, mfd = -1, i;
+ pid_t pid;
+
+ syncfs_evfd = eventfd(0, EFD_CLOEXEC);
+ if (syncfs_evfd < 0)
+ return -1;
+
+ fd = open("/dev/fuse", O_RDWR);
+ if (fd < 0)
+ goto out_evfd;
+
+ mkdir(mnt, 0755);
+ snprintf(opts, sizeof(opts),
+ "fd=%d,rootmode=40000,user_id=%u,group_id=%u",
+ fd, getuid(), getgid());
+
+ if (mount("fuse", mnt, "fuse", 0, opts) < 0)
+ goto out_fd;
+
+ pid = fork();
+ if (pid < 0)
+ goto out_umount;
+ if (pid == 0) {
+ run_server(fd, advertise);
+ _exit(0);
+ }
+
+ /*
+ * The parent does not service the fuse fd; the child does. Close our
+ * copy so the kernel sees a single server, and so that if the child
+ * dies the connection aborts instead of hanging us forever.
+ */
+ close(fd);
+
+ /*
+ * mount() returns before the server has answered FUSE_INIT, so the
+ * first open() can race and fail with ENOTCONN; retry until the
+ * handshake settles.
+ */
+ for (i = 0; i < 1000; i++) {
+ mfd = open(mnt, O_RDONLY | O_DIRECTORY);
+ if (mfd >= 0)
+ break;
+ usleep(1000);
+ }
+ if (mfd >= 0) {
+ syncfs(mfd);
+ close(mfd);
+ }
+
+ /*
+ * No waiting is needed: the server writes syncfs_evfd before it replies
+ * to FUSE_SYNCFS, and that reply is what unblocks the synchronous
+ * syncfs() above. So once syncfs() has returned, the eventfd is already
+ * signalled if the opcode was propagated, and will never be otherwise.
+ * poll() with a zero timeout therefore decides both cases immediately.
+ */
+ pfd.fd = syncfs_evfd;
+ *seen = poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN);
+
+ kill(pid, SIGKILL);
+ waitpid(pid, NULL, 0);
+ umount2(mnt, MNT_DETACH);
+ close(syncfs_evfd);
+ return 0;
+
+out_umount:
+ umount2(mnt, MNT_DETACH);
+out_fd:
+ close(fd);
+out_evfd:
+ close(syncfs_evfd);
+ return -1;
+}
+
+/* T3: same as above but the mount is created inside a new user namespace. */
+static int run_in_userns(const char *mnt, int advertise, int *seen)
+{
+ uid_t uid = getuid();
+ gid_t gid = getgid();
+ char map[64];
+ int f;
+
+ if (unshare(CLONE_NEWUSER | CLONE_NEWNS) < 0)
+ return -1; /* unprivileged userns mounts unavailable */
+
+ f = open("/proc/self/setgroups", O_WRONLY);
+ if (f >= 0) {
+ dprintf(f, "deny");
+ close(f);
+ }
+ snprintf(map, sizeof(map), "0 %u 1", uid);
+ f = open("/proc/self/uid_map", O_WRONLY);
+ if (f < 0 || dprintf(f, "%s", map) < 0)
+ return -1;
+ close(f);
+ snprintf(map, sizeof(map), "0 %u 1", gid);
+ f = open("/proc/self/gid_map", O_WRONLY);
+ if (f < 0 || dprintf(f, "%s", map) < 0)
+ return -1;
+ close(f);
+
+ /* Need a mount namespace where we can mount fuse unprivileged. */
+ if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
+ return -1;
+
+ return do_mount_and_syncfs(mnt, advertise, seen);
+}
+
+int main(void)
+{
+ char mnt[] = "/tmp/fuse_syncfs_XXXXXX";
+ int seen, ret;
+
+ ksft_print_header();
+ ksft_set_plan(3);
+
+ /* Hard watchdog: never let a stuck syncfs hang the test runner. */
+ signal(SIGALRM, SIG_DFL);
+ alarm(60);
+
+ if (geteuid() != 0)
+ ksft_exit_skip("test requires root to mount fuse\n");
+
+ if (!mkdtemp(mnt))
+ ksft_exit_fail_msg("mkdtemp failed\n");
+
+ /* T1: host-root mount, server opts in -> syncfs must reach server. */
+ ret = do_mount_and_syncfs(mnt, 1, &seen);
+ if (ret < 0)
+ ksft_test_result_skip("T1: could not mount fuse\n");
+ else
+ ksft_test_result(seen,
+ "T1 host-root + FUSE_HAS_SYNCFS: server receives FUSE_SYNCFS\n");
+
+ /* T2: host-root mount, server does NOT opt in -> no FUSE_SYNCFS. */
+ ret = do_mount_and_syncfs(mnt, 0, &seen);
+ if (ret < 0)
+ ksft_test_result_skip("T2: could not mount fuse\n");
+ else
+ ksft_test_result(!seen,
+ "T2 host-root, no opt-in: server does NOT receive FUSE_SYNCFS\n");
+
+ /*
+ * T3: unprivileged userns mount, server opts in -> kernel must still
+ * withhold FUSE_SYNCFS. Run in a child since it unshares namespaces.
+ */
+ {
+ pid_t p = fork();
+
+ if (p == 0) {
+ int s = 0;
+ int r = run_in_userns(mnt, 1, &s);
+
+ _exit(r < 0 ? 2 : (s ? 1 : 0));
+ } else {
+ int status;
+
+ waitpid(p, &status, 0);
+ if (!WIFEXITED(status))
+ ksft_test_result_error("T3: child crashed\n");
+ else if (WEXITSTATUS(status) == 2)
+ ksft_test_result_skip("T3: userns fuse mount unavailable\n");
+ else
+ ksft_test_result(WEXITSTATUS(status) == 0,
+ "T3 unpriv userns + opt-in: FUSE_SYNCFS withheld\n");
+ }
+ }
+
+ rmdir(mnt);
+ ksft_finished();
+}
--
2.50.1