From: Christian Brauner <brauner@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [GIT PULL 00/12 for v7.1] v7.1
Date: Fri, 10 Apr 2026 17:15:35 +0200
Message-ID: <20260410-vfs-v71-b055f260060c@brauner>
Hey Linus,
This is the batch of pull requests for the v7.1 merge window.
This cycle has several new features and a good amount of infrastructure
work.
There are three new clone3() flags for pidfd-based process lifecycle
management. CLONE_AUTOREAP makes a child auto-reap on exit without
becoming a zombie. Unlike SA_NOCLDWAIT or SIG_IGN on SIGCHLD this is a
per-process property on the child, not a parent-scoped setting affecting
all children. The flag survives reparenting, so a subreaper or init
won't need to deal with it. CLONE_NNP sets no_new_privs on the child at
clone time, allowing the parent to impose it without affecting itself.
CLONE_PIDFD_AUTOKILL ties a child's lifetime to the pidfd returned from
clone3() - when the last reference to that struct file is closed the
kernel sends SIGKILL to the child. This is useful for container
runtimes, service managers, and sandboxed subprocess execution where the
child must die if the parent crashes. CLONE_PIDFD_AUTOKILL requires both
CLONE_PIDFD and CLONE_AUTOREAP. The pidfd_info struct also gains a
coredump_code field.
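
To make the flag dependency concrete, here is a minimal sketch of the
rule that CLONE_PIDFD_AUTOKILL requires both CLONE_PIDFD and
CLONE_AUTOREAP. The DEMO_* bit values and both function names are
placeholders invented for this illustration. They are not the UAPI
constants, and the kernel's own validation may be structured differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bits for illustration only. These are NOT the kernel's
 * UAPI values; take the real ones from the v7.1 uapi headers. */
#define DEMO_CLONE_PIDFD          (UINT64_C(1) << 0)
#define DEMO_CLONE_AUTOREAP       (UINT64_C(1) << 1)
#define DEMO_CLONE_NNP            (UINT64_C(1) << 2)
#define DEMO_CLONE_PIDFD_AUTOKILL (UINT64_C(1) << 3)

/* Returns true if the combination satisfies the dependency described
 * above: AUTOKILL needs both PIDFD and AUTOREAP to be set as well. */
bool demo_clone_flags_valid(uint64_t flags)
{
    const uint64_t required = DEMO_CLONE_PIDFD | DEMO_CLONE_AUTOREAP;

    if (!(flags & DEMO_CLONE_PIDFD_AUTOKILL))
        return true;
    return (flags & required) == required;
}

/* Usage example: a few combinations and the expected verdict. */
void demo_clone_flags_selftest(void)
{
    /* AUTOREAP and NNP are independent and may be used on their own. */
    assert(demo_clone_flags_valid(DEMO_CLONE_AUTOREAP));
    assert(demo_clone_flags_valid(DEMO_CLONE_NNP));
    /* AUTOKILL without AUTOREAP is rejected. */
    assert(!demo_clone_flags_valid(DEMO_CLONE_PIDFD |
                                   DEMO_CLONE_PIDFD_AUTOKILL));
    /* The full set is accepted. */
    assert(demo_clone_flags_valid(DEMO_CLONE_PIDFD |
                                  DEMO_CLONE_AUTOREAP |
                                  DEMO_CLONE_PIDFD_AUTOKILL));
}
```

A caller would run a check like this before filling in the flags field
of struct clone_args, so that an invalid combination is caught before
the syscall is made.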
The mount namespace work introduces FSMOUNT_NAMESPACE for fsmount()
which creates a new mount namespace with the newly created filesystem
attached, returning a namespace fd instead of an O_PATH mount fd. This
accompanies last cycle's OPEN_TREE_NAMESPACE and is especially useful
when mounting a real filesystem to serve as a container rootfs.
Also new is support for creating empty mount namespaces via
CLONE_EMPTY_MNTNS for clone3() and UNSHARE_EMPTY_MNTNS for unshare().
These create a namespace containing only a single nullfs root mount with
an immutable empty directory. The intended workflow is to mount a real
filesystem over the root and build the mount table from scratch, which
avoids copying and tearing down the entire parent mount tree.
MOVE_MOUNT_BENEATH is extended to target the caller's rootfs and to
transfer the MNT_LOCKED property from the top mount to the mount
beneath. This allows safely modifying an inherited mount table after
unprivileged namespace creation via unshare(CLONE_NEWUSER | CLONE_NEWNS)
and makes it possible to switch out the rootfs without pivot_root(2).
The simple_xattr subsystem is reworked from an rbtree protected by a
reader-writer spinlock to an rhashtable with RCU-based lockless reads.
All consumers (shmem, kernfs, pidfs) are converted and the rbtree code
is removed. On top of this, user.* extended attributes are now supported
on sockets. Sockfs sockets get per-inode limits of 128 xattrs and 128KB
total value size. The practical motivation comes from systemd and GNOME
expanding their use of Varlink as an IPC mechanism - with user.* xattrs
a service can label its socket with the IPC protocol it speaks and eBPF
programs can selectively capture traffic on those sockets.
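
As a rough illustration of the per-inode limits, the following sketch
models the accounting in plain C. It assumes that "128KB" means
128 * 1024 bytes and that only value bytes are charged. The struct and
function names are hypothetical, and the real kernel accounting may
differ in its details.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Limits quoted above for sockfs sockets (128KB assumed to be 128 KiB). */
#define DEMO_SOCK_XATTR_MAX_COUNT 128
#define DEMO_SOCK_XATTR_MAX_BYTES (128 * 1024)

/* Hypothetical per-inode accounting state; not the kernel's struct. */
struct demo_xattr_budget {
    size_t count;       /* number of user.* xattrs currently set */
    size_t value_bytes; /* sum of their value sizes */
};

/* Charge one new xattr with a value of value_len bytes. Returns false
 * and leaves the budget untouched if either limit would be exceeded. */
bool demo_xattr_charge(struct demo_xattr_budget *b, size_t value_len)
{
    if (b->count >= DEMO_SOCK_XATTR_MAX_COUNT)
        return false;
    if (value_len > DEMO_SOCK_XATTR_MAX_BYTES - b->value_bytes)
        return false;
    b->count++;
    b->value_bytes += value_len;
    return true;
}

/* Usage example: exhaust the byte budget with a single value. */
void demo_xattr_selftest(void)
{
    struct demo_xattr_budget b = { 0, 0 };

    /* One value may consume the whole byte budget. */
    assert(demo_xattr_charge(&b, DEMO_SOCK_XATTR_MAX_BYTES));
    /* After that even a one-byte value is refused. */
    assert(!demo_xattr_charge(&b, 1));
    /* An empty value still fits because the count limit is not hit. */
    assert(demo_xattr_charge(&b, 0));
}
```

The sketch only covers adding new attributes. Replacing or removing an
existing attribute would need to uncharge the old value first.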
The inode->i_ino field is widened from unsigned long to u64. This is a
treewide change affecting format strings and tracepoints across 222
files. On 64-bit hosts this makes no material difference but it
eliminates the 32-bit hashing hacks various filesystems had to use.
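
To illustrate why the 32-bit case needed hashing hacks, the sketch below
folds a 64-bit identifier into 32 bits and shows that distinct
identifiers can collide. The xor-fold is one common pattern and is used
here as an assumption, not as a quote of any particular filesystem. The
function names are hypothetical, and the format helper uses userspace
PRIu64 where kernel code would use its own 64-bit format specifiers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Fold a 64-bit identifier into 32 bits by xoring the two halves, the
 * kind of workaround a 32-bit unsigned long i_ino made necessary. */
uint32_t demo_fold_ino32(uint64_t id)
{
    return (uint32_t)(id ^ (id >> 32));
}

/* Format a full 64-bit inode number. Returns snprintf()'s result. */
int demo_format_ino(char *buf, size_t len, uint64_t ino)
{
    return snprintf(buf, len, "%" PRIu64, ino);
}

/* Usage example: two distinct 64-bit ids fold to the same 32-bit
 * value, while the u64 representation keeps them apart. */
void demo_ino_selftest(void)
{
    const uint64_t a = UINT64_C(0x0000000100000001);
    const uint64_t b = UINT64_C(0x0000000200000002);
    char buf[32];

    assert(a != b);
    assert(demo_fold_ino32(a) == demo_fold_ino32(b));

    demo_format_ino(buf, sizeof(buf), a);
    assert(strcmp(buf, "4294967297") == 0);
}
```

With a u64 i_ino the fold step disappears entirely, and the cost moves
to the format strings and tracepoints that print the field.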
Filesystem-level T10 protection information support is added. The
existing block layer PI code is refactored to be reusable and wired up
through iomap into XFS. This increases read performance by up to 15% for
4k I/O compared to the automatic below-the-covers block layer approach.
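
For readers unfamiliar with T10 PI, here is a small sketch of the
classic 8-byte protection tuple and a bitwise CRC-16/T10-DIF guard tag
computation (polynomial 0x8BB7, initial value 0). It is a reference
implementation for illustration only. The names are hypothetical, the
kernel uses its own optimized CRC helpers, and other guard formats
exist that this sketch does not cover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_T10_DIF_POLY 0x8BB7u

/* Classic 8-byte PI tuple attached to each protected data interval.
 * All fields are big-endian on the wire. */
struct demo_pi_tuple {
    uint16_t guard_tag; /* CRC over the data interval */
    uint16_t app_tag;   /* owned by the application */
    uint32_t ref_tag;   /* usually derived from the block address */
};

/* Bit-at-a-time CRC-16/T10-DIF: MSB first, init 0, no final xor. */
uint16_t demo_crc_t10dif(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000u)
                crc = (uint16_t)((crc << 1) ^ DEMO_T10_DIF_POLY);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Usage example: a single 0x01 byte shifts out to the polynomial. */
void demo_crc_selftest(void)
{
    const uint8_t one = 0x01;

    assert(demo_crc_t10dif(&one, 1) == 0x8BB7);
    assert(demo_crc_t10dif(&one, 0) == 0);
}
```

Verification on read recomputes the guard over the data interval and
compares it with the stored guard_tag. Doing this at the filesystem
level rather than in the block layer is what the series wires up.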
The metadata buffer_head tracking accumulated in struct address_space
over the years is cleaned up and moved into filesystem-private inode
structures. The private_list, private_data, and private_lock fields are
removed from struct address_space, saving 3 longs in struct inode for
the vast majority of inodes.
The audit subsystem's excessive dput/dget calls during context setup and
reset are addressed by adding a pool of extra fs->pwd references to
fs_struct. This avoids the spinlock contention on the pwd dentry lock
that was causing noticeable performance regressions on systems with many
CPUs doing open/close with audit enabled.
The directory locking centralization continues. The remaining places
where explicit inode_lock(), lock_rename(), or similar are used outside
the core VFS are converted to use the start_creating/start_removing/
start_renaming interfaces. The biggest changes are in overlayfs,
with smaller conversions in cachefiles, nfsd, apparmor, and selinux.
lock_rename(), lock_rename_child(), and unlock_rename() are unexported.
The writeback subsystem gets new helper APIs and f2fs, gfs2, and nfs are
converted to stop accessing writeback internals directly.
Smaller items include namespace helper macros (FOR_EACH_NS_TYPE(),
CLONE_NS_ALL), FAT timestamp conversion KUnit tests, removal of unused
fs_context infrastructure now that the conversion is finished, a fix
for architecture-specific compat_ftruncate64 enforcing the non-LFS file
size limit, trivial ->setattr rename cleanups, dcache bucket count
fixes, mbcache shrink work ordering fix, omfs superblock validation,
a coredump tracepoint, dirent_size() helper, scoped user access
conversion, runtime const for file/bfile caches, and permitting
dynamic_dname()s up to NAME_MAX.
Thanks!
Christian
Thread overview: 13+ messages
2026-04-10 15:15 Christian Brauner [this message]
2026-04-10 15:16 ` [GIT PULL 01/12 for v7.1] vfs writeback Christian Brauner
2026-04-10 15:16 ` [GIT PULL 02/12 for v7.1] vfs xattr Christian Brauner
2026-04-10 15:16 ` [GIT PULL 03/12 for v7.1] vfs directory Christian Brauner
2026-04-10 15:17 ` [GIT PULL 04/12 for v7.1] vfs integrity Christian Brauner
2026-04-10 15:18 ` [GIT PULL 05/12 for v7.1] vfs fs_struct Christian Brauner
2026-04-10 15:18 ` [GIT PULL 06/12 for v7.1] vfs kino Christian Brauner
2026-04-10 15:19 ` [GIT PULL 07/12 for v7.1] vfs fat Christian Brauner
2026-04-10 15:19 ` [GIT PULL 08/12 for v7.1] vfs bh metadata Christian Brauner
2026-04-10 15:19 ` [GIT PULL 09/12 for v7.1] namespaces misc Christian Brauner
2026-04-10 15:21 ` [GIT PULL 10/12 for v7.1] vfs pidfs Christian Brauner
2026-04-10 15:21 ` [GIT PULL 11/12 for v7.1] vfs mount Christian Brauner
2026-04-10 15:23 ` [GIT PULL 12/12 for v7.1] vfs misc Christian Brauner