* [GIT PULL 00/16 for v7.2] v7.2
@ 2026-06-12 15:10 Christian Brauner
From: Christian Brauner @ 2026-06-12 15:10 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

This is the batch of pull requests for the v7.2 merge window.

This cycle is light on new uapi and heavy on infrastructure: a couple
of long-standing scalability problems are fixed and a few pieces of
filesystem behavior that file servers have wanted for a long time are
finally exposed.

Case folding behavior of local filesystems is now exposed so file
servers - nfsd, ksmbd, and user space servers - can report it to
clients instead of guessing. Filesystems report case-insensitive and
case-nonpreserving behavior through fileattr_get, and nfsd implements
NFSv3 PATHCONF and the NFSv4 FATTR4_CASE_INSENSITIVE and
FATTR4_CASE_PRESERVING attributes, which have been part of the NFS
protocols for decades. Windows NFS clients hard-require this
information for Win32 applications to behave correctly, the Linux
client uses it to disable negative dentry caching on case-insensitive
shares, and multi-protocol NFS/SMB servers need it to participate as
first-class citizens in such environments.

openat2() grows two new flags. O_EMPTYPATH allows reopening the file
behind an O_PATH file descriptor through an empty path string,
removing the detour through /proc/<pid>/fd and the procfs dependency
that comes with it. OPENAT2_REGULAR refuses to open anything but
regular files, returning the new EFTYPE error code, so services can
protect themselves against being redirected to fifos or device nodes.
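
For reference, the userspace pattern these two flags are meant to replace
can be sketched roughly as follows. This is a minimal illustration using
only long-existing interfaces, not the new flags themselves: the helper
names reopen_path_fd() and open_regular_only() are hypothetical, EINVAL
is a placeholder for the rejection error (EFTYPE is not available in
older headers), and the approach assumes procfs is mounted.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Reopen the file behind an O_PATH descriptor with real access flags by
 * going through the /proc/self/fd/<n> magic link. Requires procfs.
 * Do not pass O_NOFOLLOW here: the magic link itself must be followed.
 */
static int reopen_path_fd(int pathfd, int flags)
{
    char link[64];

    snprintf(link, sizeof(link), "/proc/self/fd/%d", pathfd);
    return open(link, flags | O_CLOEXEC);
}

/*
 * Open path only if it is a regular file. The O_PATH descriptor pins the
 * object, so the fstat() check and the reopen refer to the same file.
 * Returns a descriptor, or -1 with errno set (EINVAL is a placeholder
 * chosen for this sketch when the target is not a regular file).
 */
static int open_regular_only(const char *path, int flags)
{
    struct stat st;
    int pathfd, fd, saved;

    pathfd = open(path, O_PATH | O_CLOEXEC);
    if (pathfd < 0)
        return -1;

    if (fstat(pathfd, &st) < 0) {
        saved = errno;
        close(pathfd);
        errno = saved;
        return -1;
    }

    if (!S_ISREG(st.st_mode)) {
        close(pathfd);
        errno = EINVAL;
        return -1;
    }

    fd = reopen_path_fd(pathfd, flags);
    saved = errno;
    close(pathfd);
    errno = saved;
    return fd;
}
```

Opening with O_PATH first means a fifo is never actually opened for
reading, so the caller cannot be made to block on it.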

exec gains a per-task task_exec_state structure holding the dumpable
mode and the user namespace captured at execve(). Both used to live on
mm_struct which exit_mm() clears long before a task is reaped, so
__ptrace_may_access() and several /proc visibility checks misbehaved
for zombies - denying legitimate access to non-dumpable zombies that
were running in nested user namespaces. exec also stops tearing down
the old mm while holding exec_update_lock and cred_guard_mutex, so
execve() of a large process no longer blocks ptrace_attach() and every
exec_update_lock reader for the duration of the teardown.
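
For readers less familiar with the dumpable mode mentioned above: from
userspace it is the per-process attribute read and written with
prctl(2). A small sketch, assuming the userspace interface is unaffected
by where the kernel stores the value; the wrapper names are hypothetical.

```c
#include <sys/prctl.h>

/* Returns the caller's dumpable mode, or -1 on error. */
static int get_dumpable(void)
{
    return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}

/* Marks the calling process dumpable (1) or non-dumpable (0). */
static int set_dumpable(int on)
{
    return prctl(PR_SET_DUMPABLE, on ? 1 : 0, 0, 0, 0);
}
```

A non-dumpable process is what ptrace access checks and /proc ownership
treat more restrictively, which is why the value has to remain
available for as long as the task can be inspected.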

The VFS prerequisites for directory delegations land: lease holders
can opt out of having specific directory change events break their
delegation and fsnotify grows the helpers nfsd needs to drive
CB_NOTIFY callbacks from inotify watches in a future cycle.

Acquiring an inode reference becomes lockless as long as the refcount
was already at least 1, so only the 0->1 and 1->0 transitions take
inode->i_lock anymore.
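
The general shape of such a scheme can be sketched in userspace with C11
atomics. This is only an illustration of the "increment unless zero"
technique, not the kernel code: struct obj, its helpers, and the
atomic_flag standing in for inode->i_lock are all hypothetical, and the
lock_acquisitions counter exists purely to make the slow path visible.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct obj {
    atomic_int refcount;
    atomic_flag lock;       /* stand-in for a per-object spinlock */
    int lock_acquisitions;  /* instrumentation for this example only */
};

static void obj_init(struct obj *o, int refs)
{
    atomic_init(&o->refcount, refs);
    atomic_flag_clear(&o->lock);
    o->lock_acquisitions = 0;
}

static void obj_lock(struct obj *o)
{
    while (atomic_flag_test_and_set_explicit(&o->lock, memory_order_acquire))
        ;  /* spin */
    o->lock_acquisitions++;
}

static void obj_unlock(struct obj *o)
{
    atomic_flag_clear_explicit(&o->lock, memory_order_release);
}

/* Fast path: take a reference only if the count is already >= 1. */
static bool obj_get_fast(struct obj *o)
{
    int old = atomic_load_explicit(&o->refcount, memory_order_relaxed);

    while (old >= 1) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak_explicit(&o->refcount, &old, old + 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed))
            return true;
    }
    return false;
}

/* Take a reference; only the 0->1 transition takes the lock. */
static void obj_get(struct obj *o)
{
    if (obj_get_fast(o))
        return;
    obj_lock(o);
    atomic_fetch_add_explicit(&o->refcount, 1, memory_order_acquire);
    obj_unlock(o);
}

/* Drop a reference; returns true if the caller dropped the last one. */
static bool obj_put(struct obj *o)
{
    int old = atomic_load_explicit(&o->refcount, memory_order_relaxed);
    bool last;

    /* Lockless as long as the count does not reach zero. */
    while (old > 1) {
        if (atomic_compare_exchange_weak_explicit(&o->refcount, &old, old - 1,
                                                  memory_order_release,
                                                  memory_order_relaxed))
            return false;
    }

    /* Possible 1->0 transition: serialize with the lock. */
    obj_lock(o);
    last = atomic_fetch_sub_explicit(&o->refcount, 1,
                                     memory_order_acq_rel) == 1;
    obj_unlock(o);
    return last;
}
```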

The race between cgroup_writeback_umount() and inode_switch_wbs() that
could trigger "VFS: Busy inodes after unmount" and a use-after-free on
percpu counters is fixed, and the global serialization in the umount
path is replaced with a per-sb counter. Umount latency under cgroup
writeback churn drops from ~92-138ms p50 to ~5-8ms p50. Writeback also
learns to track dirty RWF_DONTCACHE pages per bdi_writeback so the
flusher can be kicked in a targeted fashion, improving uncached write
performance.

b_end_io is removed from struct buffer_head. The completion path loses
an indirect function call, struct buffer_head shrinks from 104 to 96
bytes, and a corruptible function pointer in the middle of a writable
data structure goes away. All in-tree users are converted to the new
bh_submit() interface.

fs/eventpoll.c is extensively documented and refactored. The
invariants the recent UAF fixes relied on were nowhere written down
and had to be reverse-engineered, so they are now codified in source,
long function bodies are split into named helpers, and the per-CTL_ADD
scratch state moves off file-scope globals. epoll also gains a
file-based control interface so io_uring can stop supporting nested
epoll contexts, and a long-standing race that made epoll_wait() report
false negatives with a zero timeout is fixed.
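
To make the zero-timeout case concrete: callers use epoll_wait() with a
timeout of 0 as a non-blocking readiness check and treat a return value
of 0 as "nothing is ready", which is why a false negative there matters.
A minimal sketch of that usage follows; the helper names are
hypothetical and nothing here reproduces the race itself.

```c
#include <sys/epoll.h>

/* Register fd for readability on epfd. Returns 0 on success, -1 on error. */
static int watch_readable(int epfd, int fd)
{
    struct epoll_event ev = { .events = EPOLLIN };

    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/*
 * Non-blocking check: returns 1 if a watched fd is ready right now,
 * 0 if none is, -1 on error. The timeout of 0 makes epoll_wait()
 * return immediately instead of sleeping.
 */
static int ready_now(int epfd)
{
    struct epoll_event out;

    return epoll_wait(epfd, &out, 1, 0);
}
```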

The simple xattr infrastructure moves its hash table into a
per-superblock cache and handles lazy allocation internally instead of
burdening every caller. On top of this bpffs gains support for
trusted.* and security.* xattrs so metadata like content hashes or
security labels can be attached to pinned objects.

iomap brings the vfs infrastructure required for fs-verity support in
XFS with a post-EOF merkle tree, stops pointlessly zeroing the iomap
on the final iteration which improves polled I/O IOPS by about 5%, and
introduces the IOMAP_F_ZERO_TAIL flag needed by filesystems with a
valid data length like exFAT and NTFS.

The string emitted from /proc/filesystems is pre-generated and cached
and the filesystems list is RCU-ified. The file is read by libselinux
and thus by a surprising number of programs; open+read+close goes from
~440k to ~1.06M ops/s single-threaded and from ~600k to ~3.3M ops/s
with 20 processes. procfs mounts with subset=pid are exempted from the
full mount visibility checks, unblocking procfs mounts in rootless
containers, and most ptrace_may_access() users in procfs now hold
exec_update_lock to avoid TOCTOU races with concurrent privileged
execve().
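
As an illustration of the open+read+close pattern on /proc/filesystems
referred to above, a lookup of a filesystem type could look roughly like
this. It is a sketch, not the libselinux code; fs_type_listed() is a
hypothetical helper and it assumes the usual "<nodev or empty><TAB><name>"
line format.

```c
#include <stdio.h>
#include <string.h>

/*
 * Returns 1 if the filesystem type is listed in /proc/filesystems,
 * 0 if it is not, and -1 if the file cannot be opened.
 */
static int fs_type_listed(const char *name)
{
    char line[256];
    int found = 0;
    FILE *f = fopen("/proc/filesystems", "r");

    if (!f)
        return -1;

    while (!found && fgets(line, sizeof(line), f)) {
        /* The type name follows the first tab on each line. */
        char *type = strchr(line, '\t');

        if (!type)
            continue;
        type++;
        type[strcspn(type, "\n")] = '\0';
        found = strcmp(type, name) == 0;
    }

    fclose(f);
    return found;
}
```

Every call opens, reads, and closes the file, so the cost of generating
its contents is paid once per lookup by each program that does this.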

pipe writes pre-allocate pages outside pipe->mutex so readers no
longer stall behind a writer doing direct reclaim under the mutex,
improving throughput by 6-28% and up to 48% under memory pressure.

sget() is retired with the last users converted to sget_fc(), and the
exportfs support for block-style layouts is cleaned up in preparation
for multi-device filesystem exports.

Smaller items include a fix for the bpf dentry xattr kfuncs with
negative dentries, per-instance lockdep classes for rhashtable, fixes
and new helpers for the copy_struct_*() machinery, set_blocksize()
error handling for a pile of legacy filesystems that crashed when
mounting devices with sector size > PAGE_SIZE, SB_I_NOEXEC and
SB_I_NODEV being set by default in init_pseudo(), honouring SB_NOUSER
in the new mount API, a SOFTIRQ-unsafe lock order fix in fasync
signaling, an FS_USERNS_DELEGATABLE flag to unbreak delegated NFS
mounts in containers, documentation with guidelines for submitting new
filesystems, and assorted selftest fixes and cleanups.
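
For context on the copy_struct_*() machinery: the semantics of
copy_struct_from_user() for extensible structs can be mimicked in
userspace as below. This is a sketch of the size-compatibility rules
only, with hypothetical names (copy_struct_compat(), struct args_v1,
struct args_v2) and plain memory in place of a user pointer; it does
not show the fixes or new helpers in this pull.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Two hypothetical revisions of an extensible argument struct. */
struct args_v1 {
    uint64_t flags;
};

struct args_v2 {
    uint64_t flags;
    uint64_t extra;  /* field added in a later revision */
};

/*
 * Copy usize bytes of src into a ksize-byte dst:
 *  - usize == ksize: plain copy.
 *  - usize <  ksize: older caller; the unknown tail of dst is zeroed.
 *  - usize >  ksize: newer caller; accepted only if the bytes we do not
 *    understand are all zero, otherwise -E2BIG.
 */
static int copy_struct_compat(void *dst, size_t ksize,
                              const void *src, size_t usize)
{
    size_t size = usize < ksize ? usize : ksize;
    const unsigned char *tail = (const unsigned char *)src + size;
    size_t i;

    if (usize > ksize) {
        for (i = 0; i < usize - ksize; i++) {
            if (tail[i])
                return -E2BIG;
        }
    }

    memcpy(dst, src, size);
    if (usize < ksize)
        memset((unsigned char *)dst + size, 0, ksize - usize);
    return 0;
}
```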

Thanks!
Christian



Thread overview: 17+ messages
2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
2026-06-12 15:11 ` [GIT PULL 01/16 for v7.2] vfs kfunc Christian Brauner
2026-06-12 15:11 ` [GIT PULL 02/16 for v7.2] vfs exportfs Christian Brauner
2026-06-12 15:12 ` [GIT PULL 03/16 for v7.2] vfs inode Christian Brauner
2026-06-12 15:12 ` [GIT PULL 04/16 for v7.2] vfs directory delegations Christian Brauner
2026-06-12 15:12 ` [GIT PULL 05/16 for v7.2] vfs casefold Christian Brauner
2026-06-12 15:13 ` [GIT PULL 06/16 for v7.2] kernel task_exec_state Christian Brauner
2026-06-12 15:13 ` [GIT PULL 07/16 for v7.2] kernel misc Christian Brauner
2026-06-12 15:13 ` [GIT PULL 08/16 for v7.2] vfs openat2 Christian Brauner
2026-06-12 15:14 ` [GIT PULL 09/16 for v7.2] vfs super Christian Brauner
2026-06-12 15:14 ` [GIT PULL 10/16 for v7.2] vfs writeback Christian Brauner
2026-06-12 15:14 ` [GIT PULL 11/16 for v7.2] vfs bh Christian Brauner
2026-06-12 15:15 ` [GIT PULL 12/16 for v7.2] vfs eventpoll Christian Brauner
2026-06-12 15:15 ` [GIT PULL 13/16 for v7.2] vfs iomap Christian Brauner
2026-06-12 15:15 ` [GIT PULL 14/16 for v7.2] vfs xattr Christian Brauner
2026-06-12 15:16 ` [GIT PULL 15/16 for v7.2] vfs misc Christian Brauner
2026-06-12 15:16 ` [GIT PULL 16/16 for v7.2] vfs procfs Christian Brauner
