* [GIT PULL 00/16 for v7.2] v7.2
@ 2026-06-12 15:10 Christian Brauner
  2026-06-12 15:11 ` [GIT PULL 01/16 for v7.2] vfs kfunc Christian Brauner
                   ` (15 more replies)
  0 siblings, 16 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:10 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

This is the batch of pull requests for the v7.2 merge window.

This cycle is light on new uapi and heavy on infrastructure: a couple
of long-standing scalability problems are fixed and a few pieces of
filesystem behavior that file servers have wanted for a long time are
finally exposed.

Case folding behavior of local filesystems is now exposed so file
servers - nfsd, ksmbd, and user space servers - can report it to
clients instead of guessing. Filesystems report case-insensitive and
case-nonpreserving behavior through fileattr_get and nfsd implements
NFSv3 PATHCONF and the NFSv4 FATTR4_CASE_INSENSITIVE and
FATTR4_CASE_PRESERVING attributes which have been part of the NFS
protocols for decades. Windows NFS clients hard-require this
information for Win32 applications to behave correctly, the Linux
client uses it to disable negative dentry caching on case-insensitive
shares, and multi-protocol NFS/SMB servers need it to participate as
first-class citizens in such environments.

openat2() grows two new flags. O_EMPTYPATH allows reopening the file
behind an O_PATH file descriptor through an empty path string,
removing the detour through /proc/<pid>/fd and the procfs dependency
that comes with it. OPENAT2_REGULAR refuses to open anything but
regular files, returning the new EFTYPE error code, so services can
protect themselves against being redirected to fifos or device nodes.
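
For context, the detour that O_EMPTYPATH removes looks roughly like the
sketch below in user space today. It only relies on long-standing
interfaces (an O_PATH descriptor and /proc/self/fd); the helper name
reopen_via_procfs() is made up for illustration, and the new openat2()
flags are deliberately not used because their values are not given here.

```c
#define _GNU_SOURCE	/* so that callers can use O_PATH */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: turn an O_PATH descriptor into a usable one by
 * reopening it through procfs. Returns a new fd or -1 with errno set.
 * This only works when /proc is mounted, which is the dependency that
 * reopening through an empty path string is meant to avoid.
 */
int reopen_via_procfs(int path_fd, int flags)
{
	char link[64];

	snprintf(link, sizeof(link), "/proc/self/fd/%d", path_fd);
	return open(link, flags);
}
```

A caller would typically obtain path_fd with open(path, O_PATH), inspect
it with fstat(), and only then call the helper with the access mode it
actually needs.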

exec gains a per-task task_exec_state structure holding the dumpable
mode and the user namespace captured at execve(). Both used to live on
mm_struct which exit_mm() clears long before a task is reaped, so
__ptrace_may_access() and several /proc visibility checks misbehaved
for zombies - denying legitimate access to non-dumpable zombies that
were running in nested user namespaces. exec also stops tearing down
the old mm while holding exec_update_lock and cred_guard_mutex, so
execve() of a large process no longer blocks ptrace_attach() and every
exec_update_lock reader for the duration of the teardown.

The VFS prerequisites for directory delegations land: lease holders
can opt out of having specific directory change events break their
delegation and fsnotify grows the helpers nfsd needs to drive
CB_NOTIFY callbacks from inotify watches in a future cycle.

Acquiring an inode reference becomes lockless as long as the refcount
was already at least 1, so only the 0->1 and 1->0 transitions take
inode->i_lock anymore.

The race between cgroup_writeback_umount() and inode_switch_wbs() that
could trigger "VFS: Busy inodes after unmount" and a use-after-free on
percpu counters is fixed, and the global serialization in the umount
path is replaced with a per-sb counter. Umount latency under cgroup
writeback churn drops from ~92-138ms p50 to ~5-8ms p50. Writeback also
learns to track dirty RWF_DONTCACHE pages per bdi_writeback so the
flusher can be kicked in a targeted fashion, improving uncached write
performance.

b_end_io is removed from struct buffer_head. The completion path loses
an indirect function call, struct buffer_head shrinks from 104 to 96
bytes, and a corruptible function pointer in the middle of a writable
data structure goes away. All in-tree users are converted to the new
bh_submit() interface.

fs/eventpoll.c is extensively documented and refactored. The
invariants the recent UAF fixes relied on were nowhere written down
and had to be reverse-engineered, so they are now codified in source,
long function bodies are split into named helpers, and the per-CTL_ADD
scratch state moves off file-scope globals. epoll also gains a
file-based control interface so io_uring can stop supporting nested
epoll contexts, and a long-standing race that made epoll_wait() report
false negatives with a zero timeout is fixed.

The simple xattr infrastructure moves its hash table into a
per-superblock cache and handles lazy allocation internally instead of
burdening every caller. On top of this bpffs gains support for
trusted.* and security.* xattrs so metadata like content hashes or
security labels can be attached to pinned objects.

iomap brings the vfs infrastructure required for fs-verity support in
XFS with a post-EOF merkle tree, stops pointlessly zeroing the iomap
on the final iteration which improves polled I/O IOPS by about 5%, and
introduces the IOMAP_F_ZERO_TAIL flag needed by filesystems with a
valid data length like exFAT and NTFS.

The string emitted from /proc/filesystems is pre-generated and cached
and the filesystems list is RCU-ified. The file is read by libselinux
and thus by a surprising number of programs; open+read+close goes from
~440k to ~1.06M ops/s single-threaded and from ~600k to ~3.3M ops/s
with 20 processes. procfs mounts with subset=pid are exempted from the
full mount visibility checks, unblocking procfs mounts in rootless
containers, and most ptrace_may_access() users in procfs now hold
exec_update_lock to avoid TOCTOU races with concurrent privileged
execve().

pipe writes pre-allocate pages outside pipe->mutex so readers no
longer stall behind a writer doing direct reclaim under the mutex,
improving throughput by 6-28% and up to 48% under memory pressure.

sget() is retired with the last users converted to sget_fc(), and the
exportfs support for block-style layouts is cleaned up in preparation
for multi-device filesystem exports.

Smaller items include a fix for the bpf dentry xattr kfuncs with
negative dentries, per-instance lockdep classes for rhashtable, fixes
and new helpers for the copy_struct_*() machinery, set_blocksize()
error handling for a pile of legacy filesystems that crashed when
mounting devices with sector size > PAGE_SIZE, SB_I_NOEXEC and
SB_I_NODEV being set by default in init_pseudo(), honouring SB_NOUSER
in the new mount API, a SOFTIRQ-unsafe lock order fix in fasync
signaling, an FS_USERNS_DELEGATABLE flag to unbreak delegated NFS
mounts in containers, documentation with guidelines for submitting new
filesystems, and assorted selftest fixes and cleanups.

Thanks!
Christian



* [GIT PULL 01/16 for v7.2] vfs kfunc
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
@ 2026-06-12 15:11 ` Christian Brauner
  2026-06-12 15:11 ` [GIT PULL 02/16 for v7.2] vfs exportfs Christian Brauner
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:11 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This contains a fix for the bpf filesystem kfuncs.

The bpf_set_dentry_xattr() and bpf_remove_dentry_xattr() kfuncs locked
the inode of the supplied dentry without checking whether the dentry is
negative. Passing a negative dentry (e.g., from security_inode_create)
caused a NULL pointer dereference. Negative dentries now fail with
EINVAL. The WARN_ON(!inode) in the bpf xattr permission helpers is
dropped as well since it could be triggered the same way, amounting to
a denial of service on systems with panic_on_warn enabled.
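
The shape of the fix can be illustrated with a small user-space model.
The struct layouts and the function name toy_set_dentry_xattr() below
are simplified stand-ins invented for this sketch, not the code in
fs/bpf_fs_kfuncs.c; only the idea of rejecting a dentry without an inode
with EINVAL before touching the inode is taken from the description
above.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures (assumption, heavily simplified). */
struct toy_inode {
	int locked;
	int xattr_set;
};

struct toy_dentry {
	struct toy_inode *d_inode;	/* NULL means the dentry is negative */
};

/* Hypothetical model of a setter that must not dereference a NULL inode. */
int toy_set_dentry_xattr(struct toy_dentry *dentry)
{
	struct toy_inode *inode = dentry->d_inode;

	/* Negative dentry: bail out before locking anything. */
	if (!inode)
		return -EINVAL;

	inode->locked = 1;	/* models taking the inode lock */
	inode->xattr_set = 1;	/* models the actual xattr update */
	inode->locked = 0;	/* models dropping the inode lock */
	return 0;
}
```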

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.kfunc

for you to fetch changes up to 07410646f6ff1d23222f105ccab778957d401bbe:

  bpf: fix crash in bpf_[set|remove]_dentry_xattr for negative dentries (2026-05-11 11:23:00 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.kfunc

Please consider pulling these changes from the signed vfs-7.2-rc1.kfunc tag.

Thanks!
Christian

----------------------------------------------------------------
Matt Bobrowski (1):
      bpf: fix crash in bpf_[set|remove]_dentry_xattr for negative dentries

 fs/bpf_fs_kfuncs.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)


* [GIT PULL 02/16 for v7.2] vfs exportfs
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
  2026-06-12 15:11 ` [GIT PULL 01/16 for v7.2] vfs kfunc Christian Brauner
@ 2026-06-12 15:11 ` Christian Brauner
  2026-06-12 15:12 ` [GIT PULL 03/16 for v7.2] vfs inode Christian Brauner
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:11 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This cleans up the exportfs support for block-style layouts that
provide direct block device access: the operations for layout-based
block device access are split out of struct export_operations into a
separate header, ->commit_blocks() no longer takes a struct iattr
argument, and the way support for layout-based block device access is
detected is reworked. nfsd's blocklayout code also stops honoring
loca_time_modify. This is preparation for supporting export of more
than a single device per file system.

Note that the nfsd tree is based on a merge of this branch so these
changes may also reach you through the nfsd pull request.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.exportfs

for you to fetch changes up to 79e33ddc62c03cce6c29f0792454e1d618228acf:

  Merge patch series "cleanup block-style layouts exports" (2026-05-11 11:11:55 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.exportfs

Please consider pulling these changes from the signed vfs-7.2-rc1.exportfs tag.

Thanks!
Christian

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "cleanup block-style layouts exports"

Christoph Hellwig (4):
      nfsd/blocklayout: always ignore loca_time_modify
      exportfs: split out the ops for layout-based block device access
      exportfs: don't pass struct iattr to ->commit_blocks
      exportfs,nfsd: rework checking for layout-based block device access support

 MAINTAINERS                    |  2 +-
 fs/nfsd/blocklayout.c          | 37 ++++++++----------
 fs/nfsd/export.c               |  3 +-
 fs/nfsd/nfs4layouts.c          | 29 ++++----------
 fs/xfs/xfs_export.c            |  4 +-
 fs/xfs/xfs_pnfs.c              | 44 +++++++++++++++------
 fs/xfs/xfs_pnfs.h              | 11 +++---
 include/linux/exportfs.h       | 25 ++++--------
 include/linux/exportfs_block.h | 88 ++++++++++++++++++++++++++++++++++++++++++
 9 files changed, 162 insertions(+), 81 deletions(-)
 create mode 100644 include/linux/exportfs_block.h


* [GIT PULL 03/16 for v7.2] vfs inode
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
  2026-06-12 15:11 ` [GIT PULL 01/16 for v7.2] vfs kfunc Christian Brauner
  2026-06-12 15:11 ` [GIT PULL 02/16 for v7.2] vfs exportfs Christian Brauner
@ 2026-06-12 15:12 ` Christian Brauner
  2026-06-12 15:12 ` [GIT PULL 04/16 for v7.2] vfs directory delegations Christian Brauner
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:12 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This extends the lockless ->i_count handling. iput() could already
decrement any value greater than 1 locklessly but acquiring a
reference always required taking inode->i_lock. Now acquiring a
reference is lockless as long as the count was already at least 1,
i.e., only the 0->1 and 1->0 transitions take the lock. This avoids
the lock for the common cases of nfs calling into the inode hash and
btrfs using igrab(). As cleanups, icount_read_once() is added to line
up with inode_state_read_once(), the open-coded ->i_count loads across
the tree are converted to it, and ihold() is relocated and tidied up.
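
The general technique can be sketched in user space with C11 atomics. This
is an illustration only, not the code in fs/inode.c: struct toy_obj and
both function names are invented, and a pthread mutex stands in for
inode->i_lock.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy object: a counter plus the lock that guards the 0->1 transition. */
struct toy_obj {
	atomic_int count;
	pthread_mutex_t lock;
};

/* Lockless fast path: bump the count only if it is already >= 1. */
bool toy_get_unless_zero(struct toy_obj *obj)
{
	int old = atomic_load(&obj->count);

	while (old >= 1) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&obj->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Take a reference; only the 0->1 transition falls back to the lock. */
void toy_get(struct toy_obj *obj)
{
	if (toy_get_unless_zero(obj))
		return;

	pthread_mutex_lock(&obj->lock);
	atomic_fetch_add(&obj->count, 1);
	pthread_mutex_unlock(&obj->lock);
}
```

The compare-and-exchange loop is what keeps the fast path from ever
performing the 0->1 transition on its own, so whatever bookkeeping is
tied to that transition still happens under the lock.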

On top of that some stale lock ordering annotations are retired from
the inode hash code: iunique() no longer takes the hash lock since the
inode hash became RCU-searchable and s_inode_list_lock is no longer
taken under the hash lock either.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

This has a merge conflict with the xfs tree in fs/xfs/xfs_trace.h
between commit 1113a6d6d5d133 ("xfs: remove the i_ino field in struct
xfs_inode") from the xfs tree and commit 769e143b115a4a ("fs: add
icount_read_once() and stop open-coding ->i_count loads") from this
tree, reported in [1]. It can be resolved as follows:

[1]: https://lore.kernel.org/linux-next/aigwDvQMI2CHiLl3@sirena.co.uk

diff --cc fs/xfs/xfs_trace.h
index ae5faa78783005,f87c738d84b248..00000000000000
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@@ -1157,8 -1157,8 +1157,8 @@@ DECLARE_EVENT_CLASS(xfs_iref_class
  	),
  	TP_fast_assign(
  		__entry->dev = VFS_I(ip)->i_sb->s_dev;
 -		__entry->ino = ip->i_ino;
 +		__entry->ino = I_INO(ip);
- 		__entry->count = icount_read(VFS_I(ip));
+ 		__entry->count = icount_read_once(VFS_I(ip));
  		__entry->pincount = atomic_read(&ip->i_pincount);
  		__entry->iflags = ip->i_flags;
  		__entry->caller_ip = caller_ip;

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.inode

for you to fetch changes up to 5b451b76c85c8309d2e02caa467b38f5999c986f:

  fs: retire stale lock ordering annotations from inode hash (2026-05-11 23:12:29 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.inode

Please consider pulling these changes from the signed vfs-7.2-rc1.inode tag.

Thanks!
Christian

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "assorted ->i_count changes + extension of lockless handling"

Mateusz Guzik (4):
      fs: add icount_read_once() and stop open-coding ->i_count loads
      fs: relocate and tidy up ihold()
      fs: allow lockless ->i_count bumps as long as it does not transition 0->1
      fs: retire stale lock ordering annotations from inode hash

 arch/powerpc/platforms/cell/spufs/file.c |   2 +-
 fs/btrfs/inode.c                         |   2 +-
 fs/ceph/mds_client.c                     |   2 +-
 fs/dcache.c                              |   4 ++
 fs/ext4/ialloc.c                         |   4 +-
 fs/hpfs/inode.c                          |   2 +-
 fs/inode.c                               | 100 +++++++++++++++++++++++++------
 fs/nfs/inode.c                           |   4 +-
 fs/smb/client/inode.c                    |   2 +-
 fs/ubifs/super.c                         |   2 +-
 fs/xfs/xfs_inode.c                       |   2 +-
 fs/xfs/xfs_trace.h                       |   2 +-
 include/linux/fs.h                       |  13 ++++
 include/trace/events/filelock.h          |   2 +-
 security/landlock/fs.c                   |   2 +-
 15 files changed, 112 insertions(+), 33 deletions(-)


* [GIT PULL 04/16 for v7.2] vfs directory delegations
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (2 preceding siblings ...)
  2026-06-12 15:12 ` [GIT PULL 03/16 for v7.2] vfs inode Christian Brauner
@ 2026-06-12 15:12 ` Christian Brauner
  2026-06-12 15:12 ` [GIT PULL 05/16 for v7.2] vfs casefold Christian Brauner
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:12 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This contains the VFS prerequisites for supporting directory
delegations in nfsd via CB_NOTIFY callbacks.

The filelock core gains support for ignoring delegation breaks for
directory change events together with an inode_lease_ignore_mask()
helper, and fsnotify gains fsnotify_modify_mark_mask() and a
FSNOTIFY_EVENT_RENAME data type. With this in place nfsd can request
delegations on directories and set up inotify watches to trigger
sending CB_NOTIFY events to clients instead of having every directory
change break the delegation. New tracepoints are added to fsnotify()
and to the start of break_lease(), and trace_break_lease_block() is
passed the currently blocking lease instead of the new one.

A follow-up fix moves the LEASE_BREAK_* flags out of
#ifdef CONFIG_FILE_LOCKING to fix the build for CONFIG_FILE_LOCKING=n
configurations.

Note that the nfsd tree is based on a merge of this branch so these
changes may also reach you through the nfsd pull request.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.directory.delegations

for you to fetch changes up to 246bc86d0fd891273a8502314f158eab23af823c:

  filelock: move LEASE_BREAK_* flags out of #ifdef CONFIG_FILE_LOCKING (2026-05-16 17:05:52 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.directory.delegations

Please consider pulling these changes from the signed vfs-7.2-rc1.directory.delegations tag.

Thanks!
Christian

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "VFS changes for nfsd CB_NOTIFY callbacks in directory delegations"

Jeff Layton (8):
      filelock: pass current blocking lease to trace_break_lease_block() rather than "new_fl"
      filelock: add support for ignoring deleg breaks for dir change events
      filelock: add a tracepoint to start of break_lease()
      filelock: add an inode_lease_ignore_mask helper
      fsnotify: new tracepoint in fsnotify()
      fsnotify: add fsnotify_modify_mark_mask()
      fsnotify: add FSNOTIFY_EVENT_RENAME data type
      filelock: move LEASE_BREAK_* flags out of #ifdef CONFIG_FILE_LOCKING

 fs/attr.c                        |   2 +-
 fs/locks.c                       | 118 ++++++++++++++++++++++++++++++---------
 fs/namei.c                       |  31 +++++-----
 fs/notify/fsnotify.c             |   5 ++
 fs/notify/mark.c                 |  29 ++++++++++
 fs/posix_acl.c                   |   4 +-
 fs/xattr.c                       |   4 +-
 include/linux/filelock.h         |  66 ++++++++++++++--------
 include/linux/fsnotify.h         |   8 ++-
 include/linux/fsnotify_backend.h |  21 +++++++
 include/trace/events/filelock.h  |  38 ++++++++++++-
 include/trace/events/fsnotify.h  |  51 +++++++++++++++++
 include/trace/misc/fsnotify.h    |  35 ++++++++++++
 13 files changed, 341 insertions(+), 71 deletions(-)
 create mode 100644 include/trace/events/fsnotify.h
 create mode 100644 include/trace/misc/fsnotify.h


* [GIT PULL 05/16 for v7.2] vfs casefold
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (3 preceding siblings ...)
  2026-06-12 15:12 ` [GIT PULL 04/16 for v7.2] vfs directory delegations Christian Brauner
@ 2026-06-12 15:12 ` Christian Brauner
  2026-06-12 15:13 ` [GIT PULL 06/16 for v7.2] kernel task_exec_state Christian Brauner
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:12 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This exposes the case folding behavior of local filesystems so that
file servers - nfsd, ksmbd, and user space file servers - can report
the actual behavior to clients instead of guessing.

Filesystems report case-insensitive and case-nonpreserving behavior
via new file_kattr flags in their fileattr_get implementations. fat,
exfat, ntfs3, hfs, hfsplus, xfs, cifs, nfs, vboxsf, and isofs are
wired up; local filesystems not explicitly handled default to the
usual POSIX behavior of case-sensitive and case-preserving. nfsd uses
this to report case folding via NFSv3 PATHCONF and to implement the
NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING attributes -
both have been part of the NFS protocols for decades to support
clients on non-POSIX systems - and ksmbd reports it via
FS_ATTRIBUTE_INFORMATION. Exposing the information through the
fileattr uapi covers user space file servers.

The immediate motivation is interoperability: Windows NFS clients
hard-require servers to report case-insensitivity for Win32
applications to work correctly, and a client that knows the server is
case-insensitive can avoid issuing multiple LOOKUP/READDIR requests
searching for case variants. The Linux NFS client already grew
support for case-insensitive shares years ago in support of the
Hammerspace NFS server - negative dentry caching must be disabled (a
lookup for "FILE.TXT" failing must not cache a negative entry when
"file.txt" exists) and directory change invalidation must drop cached
case-folded name variants. Such servers often operate in
multi-protocol environments where a single file service instance
caters to both NFS and SMB clients, and nfsd needs to report case
folding properly to participate as a first-class citizen there.

A follow-up series brings fixes for the initial work: the nfsd
case-info probe now uses kernel credentials, maps -ESTALE to
NFS3ERR_STALE, and has its cost capped across READDIR entries; the
nfs client avoids transiently zeroed case capability bits during the
probe and skips the pathconf probe when neither field is consumed;
the FS_CASEFOLD_FL semantics are clarified in the UAPI header; and
the tools UAPI headers are synced.

Note that the nfsd tree is based on a merge of this branch so these
changes may also reach you through the nfsd pull request.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

This has a merge conflict with the ntfs3 tree in fs/ntfs3/file.c,
fs/ntfs3/namei.c, and fs/ntfs3/ntfs_fs.h between commit eeb7b37b9700f
("ntfs3: Implement fileattr_get for case sensitivity") from this tree
and commit 245bbdd2b9d65 ("fs/ntfs3: add fileattr support") from the
ntfs3 tree. Reported with resolution in [1].

[1]: https://lore.kernel.org/linux-next/ahmF4spkQMYcQMGI@sirena.org.uk

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.casefold

for you to fetch changes up to ea3120fd5153c967efb20e6e3330caecbf9d8b0a:

  Merge patch series "Casefold Fixes" (2026-05-15 17:49:29 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.casefold

Please consider pulling these changes from the signed vfs-7.2-rc1.casefold tag.

Thanks!
Christian

----------------------------------------------------------------
Christian Brauner (2):
      Merge patch series "Exposing case folding behavior"
      Merge patch series "Casefold Fixes"

Chuck Lever (22):
      fs: Move file_kattr initialization to callers
      fs: Add case sensitivity flags to file_kattr
      fat: Implement fileattr_get for case sensitivity
      exfat: Implement fileattr_get for case sensitivity
      ntfs3: Implement fileattr_get for case sensitivity
      hfs: Implement fileattr_get for case sensitivity
      hfsplus: Report case sensitivity in fileattr_get
      xfs: Report case sensitivity in fileattr_get
      cifs: Implement fileattr_get for case sensitivity
      nfs: Implement fileattr_get for case sensitivity
      vboxsf: Implement fileattr_get for case sensitivity
      isofs: Implement fileattr_get for case sensitivity
      nfsd: Report export case-folding via NFSv3 PATHCONF
      nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
      ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION
      tools headers UAPI: Sync case-sensitivity flags from linux/fs.h
      nfs: Avoid transient zeroed case capability bits during probe
      nfs: Skip pathconf probe when neither field is consumed
      fs: Clarify FS_CASEFOLD_FL semantics in UAPI header
      nfsd: Use kernel credentials for case-info probe
      nfsd: Map -ESTALE from case probe to NFS3ERR_STALE
      nfsd: Cap case-folding probe cost across READDIR entries

 fs/exfat/exfat_fs.h                             |  2 +
 fs/exfat/file.c                                 | 18 ++++-
 fs/exfat/namei.c                                |  1 +
 fs/fat/fat.h                                    |  3 +
 fs/fat/file.c                                   | 36 +++++++++
 fs/fat/namei_msdos.c                            |  1 +
 fs/fat/namei_vfat.c                             |  1 +
 fs/file_attr.c                                  | 16 ++--
 fs/hfs/dir.c                                    |  1 +
 fs/hfs/hfs_fs.h                                 |  2 +
 fs/hfs/inode.c                                  | 14 ++++
 fs/hfsplus/inode.c                              | 16 +++-
 fs/isofs/dir.c                                  | 16 ++++
 fs/isofs/isofs.h                                |  3 +
 fs/nfs/client.c                                 | 32 +++++---
 fs/nfs/inode.c                                  | 15 ++++
 fs/nfs/internal.h                               |  3 +
 fs/nfs/namespace.c                              |  2 +
 fs/nfs/nfs3proc.c                               |  2 +
 fs/nfs/nfs3xdr.c                                |  7 +-
 fs/nfs/nfs4proc.c                               | 10 ++-
 fs/nfs/proc.c                                   |  3 +
 fs/nfs/symlink.c                                |  3 +
 fs/nfsd/nfs3proc.c                              | 39 ++++++++--
 fs/nfsd/nfs4xdr.c                               | 99 +++++++++++++++++++++++--
 fs/nfsd/vfs.c                                   | 86 +++++++++++++++++++++
 fs/nfsd/vfs.h                                   |  3 +
 fs/nfsd/xdr3.h                                  |  4 +-
 fs/nfsd/xdr4.h                                  | 14 ++++
 fs/ntfs3/file.c                                 | 29 ++++++++
 fs/ntfs3/namei.c                                |  1 +
 fs/ntfs3/ntfs_fs.h                              |  1 +
 fs/smb/client/cifsfs.c                          | 53 +++++++++++++
 fs/smb/client/cifsfs.h                          |  3 +
 fs/smb/client/namespace.c                       |  1 +
 fs/smb/server/smb2pdu.c                         | 30 ++++++--
 fs/vboxsf/dir.c                                 |  1 +
 fs/vboxsf/file.c                                |  6 +-
 fs/vboxsf/super.c                               |  7 ++
 fs/vboxsf/utils.c                               | 30 ++++++++
 fs/vboxsf/vfsmod.h                              |  6 ++
 fs/xfs/libxfs/xfs_inode_util.c                  |  2 +
 fs/xfs/xfs_ioctl.c                              | 22 +++++-
 include/linux/fileattr.h                        |  3 +-
 include/linux/nfs_fs_sb.h                       |  2 +-
 include/linux/nfs_xdr.h                         |  2 +
 include/uapi/linux/fs.h                         | 18 ++++-
 tools/perf/trace/beauty/include/uapi/linux/fs.h |  7 ++
 48 files changed, 618 insertions(+), 58 deletions(-)


* [GIT PULL 06/16 for v7.2] kernel task_exec_state
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (4 preceding siblings ...)
  2026-06-12 15:12 ` [GIT PULL 05/16 for v7.2] vfs casefold Christian Brauner
@ 2026-06-12 15:13 ` Christian Brauner
  2026-06-12 15:13 ` [GIT PULL 07/16 for v7.2] kernel misc Christian Brauner
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:13 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This introduces a new per-task task_exec_state structure and relocates
the dumpable mode and the user namespace captured at execve() from
mm_struct onto it. It stays attached to the task for its full
lifetime.

__ptrace_may_access() and several /proc owner and visibility checks
need to consult two pieces of state for any observable task, including
zombies that have already gone through exit_mm(): the dumpable mode
and the user namespace captured at execve(). Both live on mm_struct
today, which exit_mm() clears from the task long before the task is
reaped. A reader that races with do_exit() observes task->mm == NULL
and either fails the check or falls back to init_user_ns - which
denies legitimate access to non-dumpable zombies that were running in
a nested user namespace.

mm_struct loses ->user_ns and the dumpability bits in ->flags.
MMF_DUMPABLE_BITS is reserved so the MMF_DUMP_FILTER_* layout exposed
via /proc/<pid>/coredump_filter stays stable. task->user_dumpable and
its exit_mm() snapshot are removed.

task_exec_state is the privilege domain established by an execve().
Within a thread group it is shared via refcount; across thread groups
each task has its own:

- CLONE_VM siblings (thread-group members, io_uring workers)
  refcount-share the parent's exec_state.

- Non-CLONE_VM clones (fork(), or clone() with CLONE_VFORK but without
  CLONE_VM) allocate a fresh exec_state inheriting the parent's dumpable
  mode and user_ns.

- execve() in the child allocates a fresh instance and installs it
  under task_lock + exec_update_lock via task_exec_state_replace().

- Credential changes (setresuid, capset, ...) and
  prctl(PR_SET_DUMPABLE) update dumpability on the current task's
  exec_state, i.e., on the thread group's shared instance.
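
The sharing rules listed above can be modelled in user space as follows.
This is a toy model under stated assumptions: all toy_* names, the field
layout, and the use of an integer in place of a user namespace pointer
are invented here, and the locking done by task_exec_state_replace() is
left out entirely.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Toy stand-in for the per-exec privilege domain (fields are assumptions). */
struct toy_exec_state {
	atomic_int refs;
	int dumpable;
	int user_ns_id;
};

struct toy_task {
	struct toy_exec_state *exec_state;
};

struct toy_exec_state *toy_exec_state_alloc(int dumpable, int user_ns_id)
{
	struct toy_exec_state *es = malloc(sizeof(*es));

	if (!es)
		return NULL;
	atomic_init(&es->refs, 1);
	es->dumpable = dumpable;
	es->user_ns_id = user_ns_id;
	return es;
}

void toy_exec_state_put(struct toy_exec_state *es)
{
	/* fetch_sub returns the previous value; 1 means this was the last ref. */
	if (es && atomic_fetch_sub(&es->refs, 1) == 1)
		free(es);
}

/* Models clone(): share on CLONE_VM, otherwise copy into a fresh instance. */
int toy_clone(const struct toy_task *parent, struct toy_task *child, int clone_vm)
{
	struct toy_exec_state *pes = parent->exec_state;

	if (clone_vm) {
		atomic_fetch_add(&pes->refs, 1);
		child->exec_state = pes;
		return 0;
	}
	child->exec_state = toy_exec_state_alloc(pes->dumpable, pes->user_ns_id);
	return child->exec_state ? 0 : -1;
}

/* Models execve(): install a fresh instance and drop the old reference. */
int toy_exec(struct toy_task *task, int dumpable, int user_ns_id)
{
	struct toy_exec_state *fresh = toy_exec_state_alloc(dumpable, user_ns_id);

	if (!fresh)
		return -1;
	toy_exec_state_put(task->exec_state);
	task->exec_state = fresh;
	return 0;
}
```

Because the state hangs off the task rather than the mm in this model, a
reader can still consult it after the address space is gone, which is
the property the zombie checks described earlier depend on.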

On top of this exec_mmap() no longer tears down the old mm while
holding exec_update_lock for writing and cred_guard_mutex. Neither
lock is needed for that: exec_update_lock only exists to make the mm
swap atomic with the later commit_creds() and all its readers operate
on the new mm; none looks at the detached old mm. The cost was real:
__mmput() runs exit_mmap() over the entire old address space and can
block in exit_aio() waiting for in-flight AIO, so execve() of a large
process blocked ptrace_attach() and every exec_update_lock reader for
the duration of the teardown. The old mm is now stashed in
bprm->old_mm and released from setup_new_exec() after both locks are
dropped, with a backstop in free_bprm() for the error paths.
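
The pattern is "swap under the lock, free after the lock". A minimal
userspace sketch follows, assuming a single boolean that models both
locks being held. struct exec_ctx and the helper names are hypothetical
and only the old_mm stash is named after the field mentioned above.

```c
#include <stdbool.h>
#include <stdlib.h>

struct mm_model { size_t pages; };

/* Hypothetical model: lock_held stands for both exec locks at once. */
struct exec_ctx {
	bool lock_held;
	struct mm_model *mm;      /* the task's current mm */
	struct mm_model *old_mm;  /* stash, like bprm->old_mm in the text */
	bool freed_under_lock;    /* records where the teardown ran */
};

/* Expensive in the kernel; here it only notes whether the lock was held. */
static void mm_teardown(struct exec_ctx *c, struct mm_model *mm)
{
	if (c->lock_held)
		c->freed_under_lock = true;
	free(mm);
}

/* Under the lock: swap the pointers and stash the old mm, nothing more. */
static void swap_mm_locked(struct exec_ctx *c, struct mm_model *new_mm)
{
	c->lock_held = true;
	c->old_mm = c->mm;
	c->mm = new_mm;
	c->lock_held = false;
}

/* After the lock is dropped; safe to call twice, like a backstop. */
static void release_old_mm(struct exec_ctx *c)
{
	struct mm_model *old = c->old_mm;

	c->old_mm = NULL;
	if (old)
		mm_teardown(c, old);
}
```

Calling release_old_mm() a second time is a no-op, which is the
property an error-path backstop needs.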

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 5200f5f493f79f14bbdc349e402a40dfb32f23c8:

  Linux 7.1-rc4 (2026-05-17 13:59:58 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/kernel-7.2-rc1.task_exec_state

for you to fetch changes up to 38205ecbe6b6dc47968ad4e9c978e2117720969e:

  exec: free the old mm outside the exec locks (2026-05-26 11:02:02 +0200)

----------------------------------------------------------------
kernel-7.2-rc1.task_exec_state

Please consider pulling these changes from the signed kernel-7.2-rc1.task_exec_state tag.

Thanks!
Christian

----------------------------------------------------------------
Christian Brauner (2):
      Merge patch series "exec: introduce task_exec_state for exec-time metadata"
      exec: free the old mm outside the exec locks

Christian Brauner (Amutable) (4):
      sched/coredump: introduce enum task_dumpable
      exec: introduce struct task_exec_state
      ptrace: add ptracer_access_allowed()
      exec_state: relocate dumpable information

 arch/arm64/kernel/mte.c          |   6 +-
 drivers/firmware/efi/efi.c       |   1 -
 fs/coredump.c                    |  22 +++-----
 fs/exec.c                        |  65 +++++++++++++--------
 fs/pidfs.c                       |  23 +++-----
 fs/proc/base.c                   |  39 ++++++-------
 include/linux/binfmts.h          |   3 +
 include/linux/coredump.h         |   4 ++
 include/linux/mm_types.h         |   9 ++-
 include/linux/ptrace.h           |   1 +
 include/linux/sched.h            |   6 +-
 include/linux/sched/coredump.h   |  47 ++++------------
 include/linux/sched/exec_state.h |  31 ++++++++++
 init/init_task.c                 |  10 ++++
 kernel/Makefile                  |   2 +-
 kernel/cred.c                    |   3 +-
 kernel/exec_state.c              | 119 +++++++++++++++++++++++++++++++++++++++
 kernel/exit.c                    |   1 -
 kernel/fork.c                    |  33 +++++++++--
 kernel/kthread.c                 |   1 -
 kernel/ptrace.c                  |  51 +++++++++++------
 kernel/sys.c                     |   6 +-
 mm/init-mm.c                     |   1 -
 23 files changed, 329 insertions(+), 155 deletions(-)
 create mode 100644 include/linux/sched/exec_state.h
 create mode 100644 kernel/exec_state.c


* [GIT PULL 07/16 for v7.2] kernel misc
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (5 preceding siblings ...)
  2026-06-12 15:13 ` [GIT PULL 06/16 for v7.2] kernel task_exec_state Christian Brauner
@ 2026-06-12 15:13 ` Christian Brauner
  2026-06-12 15:13 ` [GIT PULL 08/16 for v7.2] vfs openat2 Christian Brauner
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

Fixes

- rhashtable: give each instance its own lockdep class

  syzbot reported a circular locking dependency between ht->mutex and
  fs_reclaim via the simple_xattrs rhashtable being torn down during
  inode eviction. The predicted deadlock cannot occur:
  rhashtable_free_and_destroy() cancels the deferred worker before
  taking ht->mutex and acquisitions on distinct rhashtables are on
  distinct mutexes. Lockdep flags a cycle anyway because every
  ht->mutex in the kernel shared the single static lockdep class from
  rhashtable_init_noprof(). The lockdep key is lifted to a
  per-call-site static key so every rhashtable instance gets its own
  class.

- selftests/clone3: fix misuse of the libcap library interface in the
  cap_checkpoint_restore test and remove unused variables

- selftests/pid_namespace: compute the pid_max test limits dynamically
  instead of hardcoding values below the kernel-enforced minimum of
  PIDS_PER_CPU_MIN * num_possible_cpus() which made the tests fail on
  machines with many possible CPUs

- selftests: fix the Makefile TARGETS entry for nsfs which wasn't
  adjusted when the tests moved under filesystems/
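
The per-call-site key idea from the rhashtable fix can be sketched in
plain C. This is an illustration under assumptions, not the rhashtable
code: struct table, table_init() and the key type are hypothetical, and
the key here is just a unique address rather than a real lockdep class.

```c
/* Illustrative stand-in for a lockdep class key: only its address matters. */
struct class_key_model { char unused; };

struct table {
	const struct class_key_model *key;
};

static void table_init_key(struct table *t, const struct class_key_model *key)
{
	t->key = key;
}

/*
 * Each expansion of this macro defines its own static key, so every
 * call site hands a distinct address to table_init_key().
 */
#define table_init(t)						\
	do {							\
		static struct class_key_model site_key;		\
		table_init_key((t), &site_key);			\
	} while (0)
```

Two tables initialised from different call sites end up with different
keys. In this sketch, tables initialised from the same call site (for
example inside a loop) still share one.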

Cleanups

- ipc/sem.c: use unsigned int for nsops to match the declaration in
  syscalls.h

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/kernel-7.2-rc1.misc

for you to fetch changes up to ee8ab98f831226d69d43ccd93f53c50e6f19b389:

  Merge patch series "selftests/clone3: fix cap_checkpoint_restore test" (2026-05-27 14:11:47 +0200)

----------------------------------------------------------------
kernel-7.2-rc1.misc

Please consider pulling these changes from the signed kernel-7.2-rc1.misc tag.

Thanks!
Christian

----------------------------------------------------------------
Bjoern Doebel (1):
      selftests/pid_namespace: compute pid_max test limits dynamically

Christian Brauner (2):
      rhashtable: give each instance its own lockdep class
      Merge patch series "selftests/clone3: fix cap_checkpoint_restore test"

Eva Kurchatova (1):
      selftests/clone3: fix libcap interface usage

Florian Schmaus (1):
      selftests: Fix Makefile target for nsfs

Konstantin Khorenko (1):
      selftests/clone3: remove unused variables

Yi Xie (1):
      ipc/sem.c: use unsigned int for nsops

 include/linux/rhashtable-types.h                   |  22 ++-
 ipc/sem.c                                          |   6 +-
 lib/rhashtable.c                                   |  17 ++-
 tools/testing/selftests/Makefile                   |   2 +-
 .../clone3/clone3_cap_checkpoint_restore.c         |  24 +---
 tools/testing/selftests/pid_namespace/pid_max.c    | 156 ++++++++++++++++-----
 6 files changed, 161 insertions(+), 66 deletions(-)


* [GIT PULL 08/16 for v7.2] vfs openat2
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (6 preceding siblings ...)
  2026-06-12 15:13 ` [GIT PULL 07/16 for v7.2] kernel misc Christian Brauner
@ 2026-06-12 15:13 ` Christian Brauner
  2026-06-12 15:14 ` [GIT PULL 09/16 for v7.2] vfs super Christian Brauner
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This contains the openat2 changes for this cycle:

Features

- Add O_EMPTYPATH to openat(2)/openat2(2). To get an operable file
  descriptor from an O_PATH file descriptor it is possible to use
  openat(fd, ".", O_DIRECTORY) for directories, but other file types
  require going through open("/proc/<pid>/fd/<nr>") and thus depend on
  a functioning procfs. With O_EMPTYPATH an empty path string is
  accepted and LOOKUP_EMPTY is set at path resolution time, so the
  file behind the file descriptor can be reopened directly. Selftests
  are included.

- Add an OPENAT2_REGULAR flag for openat2(2) which refuses to open
  anything but regular files with the new EFTYPE error code. This
  implements the "ability to only open regular files" feature
  requested by userspace via uapi-group.org and protects services
  from being redirected to fifos, device nodes, and friends.

  All atomic_open implementations were audited for OPENAT2_REGULAR
  handling. Explicit checks were added to ceph, gfs2, nfs (v4), and
  cifs/smb - these are the filesystems whose atomic_open can encounter
  an existing non-regular file and would otherwise call finish_open()
  on it or return a misleading error code. The remaining
  implementations (9p, fuse, vboxsf, nfs v2/v3) only call
  finish_open() on freshly created files and use finish_no_open() for
  lookup hits, letting the VFS catch non-regular files via the
  do_open() safety net.
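
For illustration, here is how a userspace helper might reopen a file
descriptor with and without O_EMPTYPATH. It is a sketch under
assumptions: reopen_path_fd() is a made-up name, the O_EMPTYPATH branch
is only compiled when the installed headers define the flag, and the
exact calling convention should be checked against the final uapi. The
fallback is the existing /proc/self/fd detour.

```c
#include <fcntl.h>
#include <stdio.h>

/*
 * Hypothetical helper: returns a new file descriptor for the file behind
 * path_fd, opened with the given flags, or -1 with errno set.
 */
static int reopen_path_fd(int path_fd, int flags)
{
#ifdef O_EMPTYPATH
	/* Assumed usage: empty path plus O_EMPTYPATH reopens path_fd itself. */
	int fd = openat(path_fd, "", flags | O_EMPTYPATH);

	if (fd >= 0)
		return fd;
	/* Kernels without support fail here; fall back to procfs. */
#endif
	char proc_path[64];

	snprintf(proc_path, sizeof(proc_path), "/proc/self/fd/%d", path_fd);
	return open(proc_path, flags);
}
```

The typical caller holds an O_PATH descriptor and wants a readable or
writable one. Only the fallback branch needs procfs to be mounted.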

Cleanups

- Migrate the openat2 selftests to the kselftest harness and move
  them under selftests/filesystems/. The tests were written in the
  early days of selftests' TAP support and the modern kselftest
  harness is much easier to follow and maintain. The contents of the
  tests are unchanged and the new emptypath tests are ported on top.

- Make the LAST_XXX last-type constants private to fs/namei.c. The
  only user outside of fs/namei.c was ksmbd which only needs to know
  whether the last component is a regular one, so
  vfs_path_parent_lookup() now performs the LAST_NORM check
  internally. The ints are replaced with a dedicated enum last_type.
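
As a rough illustration of what the last-type classification means, the
sketch below classifies the final component of a path string. The
member names other than LAST_NORM are assumed from the pre-existing
LAST_XXX constants, classify_last() is a hypothetical userspace helper,
and the real logic in fs/namei.c works on the path walk state rather
than on a string.

```c
#include <string.h>

/* Member names assumed from the existing LAST_XXX constants. */
enum last_type { LAST_NORM, LAST_ROOT, LAST_DOT, LAST_DOTDOT };

/* Hypothetical helper: classify the last component of a path string. */
static enum last_type classify_last(const char *path)
{
	const char *end = path + strlen(path);
	const char *start;

	/* Ignore trailing slashes. */
	while (end > path && end[-1] == '/')
		end--;
	if (end == path)
		return LAST_ROOT;	/* nothing but slashes (or empty) */

	/* Walk back to the start of the final component. */
	start = end;
	while (start > path && start[-1] != '/')
		start--;

	if (end - start == 1 && start[0] == '.')
		return LAST_DOT;
	if (end - start == 2 && start[0] == '.' && start[1] == '.')
		return LAST_DOTDOT;
	return LAST_NORM;
}
```

A caller that only cares whether the last component is a regular name,
as ksmbd does according to the text, would compare the result against
LAST_NORM.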

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.openat2

for you to fetch changes up to 318643721de396012da102723f337f35ba7ec1e9:

  vfs: replace ints with enum last_type for LAST_XXX (2026-05-29 09:47:02 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.openat2

Please consider pulling these changes from the signed vfs-7.2-rc1.openat2 tag.

Thanks!
Christian

----------------------------------------------------------------
Aleksa Sarai (4):
      selftests: move openat2 tests to selftests/filesystems/
      selftests: openat2: move helpers to header
      selftests: openat2: switch from custom ARRAY_LEN to ARRAY_SIZE
      selftests: openat2: migrate to kselftest harness

Christian Brauner (4):
      Merge patch series "selftests: openat2: migrate to kselftest harness"
      Merge patch series "vfs: add O_EMPTYPATH to openat(2)/openat2(2)"
      Merge patch series "OPENAT2_REGULAR flag support for openat2"
      selftests: openat2: port emptypath_test to kselftest harness

Dorjoy Chowdhury (3):
      openat2: introduce EFTYPE error code
      openat2: new OPENAT2_REGULAR flag support
      kselftest/openat2: test for OPENAT2_REGULAR flag

Jori Koolstra (4):
      vfs: add O_EMPTYPATH to openat(2)/openat2(2)
      selftest: add tests for O_EMPTYPATH
      vfs: make LAST_XXX private to fs/namei.c
      vfs: replace ints with enum last_type for LAST_XXX

 arch/alpha/include/uapi/asm/errno.h                |   2 +
 arch/mips/include/uapi/asm/errno.h                 |   2 +
 arch/parisc/include/uapi/asm/errno.h               |   2 +
 arch/sparc/include/uapi/asm/errno.h                |   2 +
 fs/ceph/file.c                                     |   4 +
 fs/fcntl.c                                         |   4 +-
 fs/gfs2/inode.c                                    |   7 +
 fs/namei.c                                         |  48 ++-
 fs/nfs/dir.c                                       |   4 +
 fs/open.c                                          |  41 ++-
 fs/smb/client/dir.c                                |  18 +-
 fs/smb/server/vfs.c                                |  15 +-
 include/linux/fcntl.h                              |  20 +-
 include/linux/namei.h                              |   7 +-
 include/uapi/asm-generic/errno.h                   |   2 +
 include/uapi/asm-generic/fcntl.h                   |   4 +
 include/uapi/linux/openat2.h                       |   7 +
 tools/arch/alpha/include/uapi/asm/errno.h          |   2 +
 tools/arch/mips/include/uapi/asm/errno.h           |   2 +
 tools/arch/parisc/include/uapi/asm/errno.h         |   2 +
 tools/arch/sparc/include/uapi/asm/errno.h          |   2 +
 tools/include/uapi/asm-generic/errno.h             |   2 +
 tools/include/uapi/linux/openat2.h                 |  43 +++
 .../selftests/{ => filesystems}/openat2/.gitignore |   0
 .../selftests/{ => filesystems}/openat2/Makefile   |   9 +-
 .../selftests/filesystems/openat2/emptypath_test.c |  77 +++++
 .../selftests/filesystems/openat2/helpers.h        | 135 ++++++++
 .../{ => filesystems}/openat2/openat2_test.c       | 262 ++++++++-------
 .../filesystems/openat2/rename_attack_test.c       | 159 +++++++++
 .../{ => filesystems}/openat2/resolve_test.c       | 368 ++++++++++++---------
 tools/testing/selftests/openat2/helpers.c          | 109 ------
 tools/testing/selftests/openat2/helpers.h          | 108 ------
 .../testing/selftests/openat2/rename_attack_test.c | 160 ---------
 33 files changed, 920 insertions(+), 709 deletions(-)
 create mode 100644 tools/include/uapi/linux/openat2.h
 rename tools/testing/selftests/{ => filesystems}/openat2/.gitignore (100%)
 rename tools/testing/selftests/{ => filesystems}/openat2/Makefile (65%)
 create mode 100644 tools/testing/selftests/filesystems/openat2/emptypath_test.c
 create mode 100644 tools/testing/selftests/filesystems/openat2/helpers.h
 rename tools/testing/selftests/{ => filesystems}/openat2/openat2_test.c (63%)
 create mode 100644 tools/testing/selftests/filesystems/openat2/rename_attack_test.c
 rename tools/testing/selftests/{ => filesystems}/openat2/resolve_test.c (74%)
 delete mode 100644 tools/testing/selftests/openat2/helpers.c
 delete mode 100644 tools/testing/selftests/openat2/helpers.h
 delete mode 100644 tools/testing/selftests/openat2/rename_attack_test.c


* [GIT PULL 09/16 for v7.2] vfs super
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (7 preceding siblings ...)
  2026-06-12 15:13 ` [GIT PULL 08/16 for v7.2] vfs openat2 Christian Brauner
@ 2026-06-12 15:14 ` Christian Brauner
  2026-06-12 15:14 ` [GIT PULL 10/16 for v7.2] vfs writeback Christian Brauner
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This retires sget(). CIFS plus the two ext4 KUnit tests (extents-test,
mballoc-test) were the last in-tree callers, and all three convert
cleanly to sget_fc(). That lets sget() and its prototype come out,
taking ~60 lines that only existed to be kept in lockstep with
sget_fc() on every publish-path change.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.super

for you to fetch changes up to 2c6f0c248a6b49a6fc8c301c84d367860c56ccd8:

  Merge patch series "super: retire sget()" (2026-06-03 09:09:57 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.super

Please consider pulling these changes from the signed vfs-7.2-rc1.super tag.

Thanks!
Christian

----------------------------------------------------------------
Christian Brauner (5):
      ext4: convert extents KUnit test to sget_fc()
      ext4: convert mballoc KUnit test to sget_fc()
      smb: client: convert cifs_smb3_do_mount() to sget_fc()
      fs: retire sget()
      Merge patch series "super: retire sget()"

 fs/btrfs/super.c           |  2 +-
 fs/ext4/extents-test.c     | 22 +++++++++++---
 fs/ext4/mballoc-test.c     | 17 +++++++++--
 fs/smb/client/cifsfs.c     | 43 +++++++++++++++++-----------
 fs/smb/client/cifsfs.h     |  3 +-
 fs/smb/client/cifsproto.h  |  3 +-
 fs/smb/client/connect.c    |  5 ++--
 fs/smb/client/fs_context.c |  2 +-
 fs/super.c                 | 71 ++++------------------------------------------
 include/linux/fs.h         |  4 ---
 10 files changed, 73 insertions(+), 99 deletions(-)


* [GIT PULL 10/16 for v7.2] vfs writeback
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (8 preceding siblings ...)
  2026-06-12 15:14 ` [GIT PULL 09/16 for v7.2] vfs super Christian Brauner
@ 2026-06-12 15:14 ` Christian Brauner
  2026-06-12 15:14 ` [GIT PULL 11/16 for v7.2] vfs bh Christian Brauner
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This contains the writeback changes for this cycle:

* Fix a race between cgroup_writeback_umount() and inode_switch_wbs()

  When a container exits, a race between cgroup_writeback_umount() and
  inode_switch_wbs()/cleanup_offline_cgwb() can trigger "VFS: Busy
  inodes after unmount" followed by a use-after-free on percpu
  counters. There is a window between inode_prepare_wbs_switch()
  returning true (having passed the SB_ACTIVE check and grabbed the
  inode) and the subsequent wb_queue_isw() call: if
  cgroup_writeback_umount() observes the global isw_nr_in_flight
  counter as non-zero but flush_workqueue() finds nothing queued yet,
  it returns early - leaving a held inode reference that blocks
  evict_inodes() and a later iput() that hits freed percpu counters.

  The race is closed by covering the window from
  inode_prepare_wbs_switch() through wb_queue_isw() with an RCU
  read-side critical section and synchronizing in the umount path. On
  top of that the now-dead rcu_barrier() left over from the
  queue_rcu_work() era is removed, and the global
  synchronize_rcu()/flush_workqueue() pair is replaced with a per-sb
  in-flight counter plus pin/unpin/drain helpers so umount no longer
  serializes against switch activity on unrelated superblocks.

  Under cgroup writeback churn on a 16 vCPU guest this takes umount
  latency from ~92-138ms p50 down to ~5-8ms p50 and the cumulative
  cost of cgroup_writeback_umount() from ~62ms to ~4us per call. The
  initial race fix is kept separate and minimal so it backports
  cleanly to stable trees that still queue switches via
  queue_rcu_work().

* Improve write performance with RWF_DONTCACHE

  Dirty DONTCACHE pages are now tracked per bdi_writeback so that the
  writeback flusher can be kicked in a targeted fashion for
  IOCB_DONTCACHE writes instead of relying on global writeback, and
  the PG_dropbehind flag is preserved when a folio is split.
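
The per-sb pin/unpin/drain scheme from the race fix can be sketched
with C11 atomics. This is a simplified model under assumptions: struct
sb_model and the sb_isw_*() names are invented here, the active flag
stands in for the SB_ACTIVE check, and the real code relies on RCU and
workqueue machinery that is not modelled.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-superblock state; not the kernel layout. */
struct sb_model {
	atomic_bool active;       /* cleared once umount starts */
	atomic_int isw_inflight;  /* switches pinned against this sb only */
};

static void sb_model_init(struct sb_model *sb)
{
	atomic_init(&sb->active, true);
	atomic_init(&sb->isw_inflight, 0);
}

/* Pin before queueing a switch; fails once umount has begun. */
static bool sb_isw_pin(struct sb_model *sb)
{
	atomic_fetch_add(&sb->isw_inflight, 1);
	if (!atomic_load(&sb->active)) {
		atomic_fetch_sub(&sb->isw_inflight, 1);
		return false;
	}
	return true;
}

/* Called when the switch work has finished and dropped its inode. */
static void sb_isw_unpin(struct sb_model *sb)
{
	atomic_fetch_sub(&sb->isw_inflight, 1);
}

/* Umount side: stop new pins, then wait for this sb's switches only. */
static void sb_isw_drain(struct sb_model *sb)
{
	atomic_store(&sb->active, false);
	while (atomic_load(&sb->isw_inflight) != 0)
		sched_yield();
}
```

Because the counter lives in the superblock model, draining one
instance never waits for pins held against another, which is the
property the latency numbers above depend on.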

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.writeback

for you to fetch changes up to 0275dc184aa007b260374af6d46fb15741c062a8:

  Merge patch series "mm: improve write performance with RWF_DONTCACHE" (2026-06-04 10:18:25 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.writeback

Please consider pulling these changes from the signed vfs-7.2-rc1.writeback tag.

Thanks!
Christian

----------------------------------------------------------------
Baokun Li (3):
      writeback: fix race between cgroup_writeback_umount() and inode_switch_wbs()
      writeback: drop now-unnecessary rcu_barrier() in cgroup_writeback_umount()
      writeback: use a per-sb counter to drain inode wb switches at umount

Christian Brauner (2):
      Merge patch series "writeback: fix race between cgroup_writeback_umount() and inode_switch_wbs()"
      Merge patch series "mm: improve write performance with RWF_DONTCACHE"

Jeff Layton (3):
      mm: preserve PG_dropbehind flag during folio split
      mm: track DONTCACHE dirty pages per bdi_writeback
      mm: kick writeback flusher for IOCB_DONTCACHE with targeted dirty tracking

 fs/fs-writeback.c                | 138 +++++++++++++++++++++++++++++++--------
 include/linux/backing-dev-defs.h |   3 +
 include/linux/fs.h               |   6 +-
 include/linux/fs/super_types.h   |   8 +++
 include/trace/events/writeback.h |   3 +-
 mm/filemap.c                     |  15 ++++-
 mm/huge_memory.c                 |   1 +
 mm/page-writeback.c              |   6 ++
 8 files changed, 147 insertions(+), 33 deletions(-)


* [GIT PULL 11/16 for v7.2] vfs bh
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (9 preceding siblings ...)
  2026-06-12 15:14 ` [GIT PULL 10/16 for v7.2] vfs writeback Christian Brauner
@ 2026-06-12 15:14 ` Christian Brauner
  2026-06-12 15:15 ` [GIT PULL 12/16 for v7.2] vfs eventpoll Christian Brauner
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:14 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This removes b_end_io from struct buffer_head.

Instead of setting bio->bi_end_io to end_bio_bh_io_sync() which then
calls bh->b_end_io(), the new bh_submit() and __bh_submit() interfaces
set bio->bi_end_io to the appropriate completion handler directly,
replacing two indirect function calls in the completion path with one.
It is also one fewer function pointer in the middle of a writable data
structure that can be corrupted, it shrinks struct buffer_head from
104 to 96 bytes allowing roughly 7% more buffer_heads to be cached in
the same amount of memory, and it removes some atomic operations as
the buffer refcount is no longer incremented before calling the end_io
handler.
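
One way to arrive at a figure in that range is to count how many
objects fit in a page before and after the change. The sketch below
assumes 4096-byte pages and no per-object allocator overhead. Both are
assumptions made here, since the basis for the 7% figure is not stated
above.

```c
#include <stddef.h>

/* Whole objects that fit in one page, ignoring allocator overhead. */
static size_t objs_per_page(size_t page_size, size_t obj_size)
{
	return page_size / obj_size;
}

/* Percentage gain in objects per page when an object shrinks. */
static double gain_percent(size_t page_size, size_t old_size, size_t new_size)
{
	size_t before = objs_per_page(page_size, old_size);
	size_t after = objs_per_page(page_size, new_size);

	return 100.0 * (double)(after - before) / (double)before;
}
```

With those assumptions a 4096-byte page holds 39 objects of 104 bytes
and 42 objects of 96 bytes, a gain of a little under 8%.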

All in-tree users (fs/buffer.c itself, ext4, jbd2, ocfs2, gfs2,
nilfs2, and md-bitmap) are converted, and submit_bh(),
mark_buffer_async_write(), and end_buffer_write_sync() are removed.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.bh

for you to fetch changes up to f0d857543e4d37464759c338f46ad6c85a618a2e:

  Merge patch series "Remove b_end_io from struct buffer_head" (2026-06-04 10:28:17 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.bh

Please consider pulling these changes from the signed vfs-7.2-rc1.bh tag.

Thanks!
Christian

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "Remove b_end_io from struct buffer_head"

Matthew Wilcox (Oracle) (34):
      buffer: Remove forward declaration of submit_bh_wbc()
      buffer: Add bh_submit()
      buffer: Remove mark_buffer_async_write_endio()
      buffer: Add bh_end_read(), bh_end_write() and bh_end_async_write()
      buffer: Convert write_dirty_buffer to bh_submit()
      buffer: Convert __bread_slow to bh_submit()
      buffer: Convert __sync_dirty_buffer to bh_submit()
      buffer: Convert __bh_read to bh_submit()
      buffer: Convert __bh_read_batch to bh_submit()
      buffer: Convert block_read_full_folio to bh_submit()
      buffer: Convert __block_write_full_folio to __bh_submit()
      ext4; Convert __ext4_read_bh() to bh_submit()
      ext4: Convert ext4_fc_submit_bh() to bh_submit()
      ext4: Convert write_mmp_block_thawed() to bh_submit()
      ext4: Convert ext4_commit_super() to bh_submit()
      jbd2: Convert journal commit to bh_submit()
      jbd2: Convert jbd2_write_superblock() to bh_submit()
      ocfs2: Convert ocfs2_write_block to bh_submit()
      ocfs2: Convert ocfs2_read_block to bh_submit()
      ocfs2: Convert ocfs2_read_blocks to bh_submit()
      ocfs2: Convert ocfs2_write_super_or_backup to bh_submit()
      gfs2: Convert gfs2_metapath_ra to bh_submit()
      gfs2: Convert gfs2_dir_readahead to bh_submit()
      gfs2: Remove use of b_end_io in gfs2_meta_read_endio()
      gfs2: Convert gfs2_aspace_write_folio to bh_submit()
      buffer: Remove mark_buffer_async_write()
      nilfs2: Convert nilfs_btnode_submit_block to bh_submit()
      nilfs2: Convert nilfs_gccache_submit_read_data to bh_submit()
      nilfs2: Convert nilfs_mdt_submit_block to bh_submit()
      md-bitmap: Convert read_file_page and write_file_page to bh_submit()
      buffer: Remove submit_bh()
      buffer: Remove b_end_io
      buffer: Change calling convention for end_buffer_read_sync()
      buffer: Remove end_buffer_write_sync()

 Documentation/filesystems/locking.rst |  14 --
 Documentation/trace/ftrace.rst        |   4 +-
 drivers/md/md-bitmap.c                |  27 +--
 drivers/md/raid5.h                    |   6 +-
 fs/buffer.c                           | 385 ++++++++++++++++++----------------
 fs/ext4/ext4.h                        |  10 +-
 fs/ext4/fast_commit.c                 |   8 +-
 fs/ext4/ialloc.c                      |   6 +-
 fs/ext4/mmp.c                         |   5 +-
 fs/ext4/super.c                       |  18 +-
 fs/gfs2/bmap.c                        |  13 +-
 fs/gfs2/dir.c                         |  12 +-
 fs/gfs2/meta_io.c                     |  13 +-
 fs/jbd2/commit.c                      |  13 +-
 fs/jbd2/journal.c                     |   4 +-
 fs/nilfs2/btnode.c                    |   4 +-
 fs/nilfs2/gcinode.c                   |   4 +-
 fs/nilfs2/mdt.c                       |   4 +-
 fs/ocfs2/buffer_head_io.c             |  16 +-
 include/linux/buffer_head.h           |  16 +-
 mm/vmscan.c                           |   2 +-
 21 files changed, 288 insertions(+), 296 deletions(-)


* [GIT PULL 12/16 for v7.2] vfs eventpoll
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (10 preceding siblings ...)
  2026-06-12 15:14 ` [GIT PULL 11/16 for v7.2] vfs bh Christian Brauner
@ 2026-06-12 15:15 ` Christian Brauner
  2026-06-12 15:15 ` [GIT PULL 13/16 for v7.2] vfs iomap Christian Brauner
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This contains the eventpoll changes for this cycle:

* eventpoll clarity refactor

  The recent eventpoll UAF fixes (a6dc643c6931 and follow-ups) rode on
  invariants in fs/eventpoll.c that were nowhere documented and had to
  be reverse-engineered from the code: the lifetime relationships
  between struct eventpoll, struct epitem, and struct file, the three
  removal paths coordinating via epi_fget() pins and ep->mtx, the
  ovflist sentinel-encoded scan state machine, the POLLFREE
  release/acquire handshake, and the loop / path check globals
  serialized by epnested_mutex. The fixes were correct but the next
  person to touch this code would hit the same learning curve.

  This series codifies those invariants in source and tightens the
  surrounding structure. No functional changes intended:

  - Documentation: a top-of-file overview with field-protection
    tables for struct eventpoll and struct epitem, a section
    gathering the loop-check / path-check globals next to their
    declarations, labelled comments on the two sides of the POLLFREE
    handshake, refreshed comments on epi_fget() and ep_remove_file(),
    and a docblock on ep_clear_and_put() that names its two-pass
    structure as load-bearing.

  - Mechanical renames: ep_refcount_dec_and_test() -> ep_put() to
    pair with ep_get(), attach_epitem() -> ep_attach_file() for
    ep_remove_file() symmetry, the unused depth argument dropped from
    epoll_mutex_lock(), and the CONFIG_KCMP block relocated next to
    CONFIG_COMPAT so the hot-path code is contiguous.

  - Helper extraction: ep_insert() splits into ep_alloc_epitem() and
    ep_register_epitem(), ep_clear_and_put()'s two passes become
    ep_drain_pollwaits() and ep_drain_tree() so the ordering
    invariant is enforced by the call sequence rather than
    convention, the per-event delivery loop body becomes
    ep_deliver_event(), and the ep->mtx + epnested_mutex acquisition
    dance lifts out of do_epoll_ctl() into ep_ctl_lock() /
    ep_ctl_unlock().

  - Sentinel and predicate cleanup: the EP_UNACTIVE_PTR overload is
    hidden behind named helpers (ep_is_scanning, epi_on_ovflist,
    ...), epi->next is renamed to epi->ovflist_next, and the boolean
    predicates return bool.

  - The per-CTL_ADD scratch state (tfile_check_list, path_count[],
    inserting_into) moves from file-scope globals into a
    stack-allocated struct ep_ctl_ctx plumbed through the loop / path
    check chain.

  Two follow-up fixes are included: missing kernel-doc for the new
  @ctx parameters, and restoring the EP_UNACTIVE_PTR sentinel for
  ctx->tfile_check_list - replacing it with NULL termination broke
  ep_remove_file()'s "never listed" check for the list tail, causing
  a syzbot-reported use-after-free.

* io_uring related epoll cleanups

  One of the nastier things about epoll is how it allows nesting
  contexts inside each other, leading to the necessity of loop
  detection and the issues that have come with that. There is no
  reason to support nesting on the io_uring side, so contain the
  damage and disallow nested contexts from there: eventpoll gains a
  file-based control interface and struct epoll_filefd is renamed to
  epoll_key. The io_uring side proper goes on top of this through the
  block tree.

* Fix epoll_wait() reporting false negatives

  ep_events_available() checks ep->rdllist and ep_is_scanning()
  without a lock and can race with a concurrent scan such that
  neither check sees the events, causing epoll_wait() with a zero
  timeout to wrongly report no events even though events are
  available. A sequence lock closes the race and a reproducer is
  added to the eventpoll selftests.
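
Why NULL termination broke the "never listed" check for
tfile_check_list can be shown with a small model. This is a sketch
under assumptions: struct file_model and the helpers are hypothetical,
and the address of a dummy object plays the role that EP_UNACTIVE_PTR
plays in the text.

```c
#include <stdbool.h>
#include <stddef.h>

struct file_model {
	struct file_model *check_next;  /* NULL means "never listed" */
};

/* Dummy object whose address terminates the list instead of NULL. */
static struct file_model list_end;
#define LIST_END (&list_end)

struct check_list {
	struct file_model *head;
};

static void check_list_init(struct check_list *l)
{
	l->head = LIST_END;
}

/* A file is listed if its link is set, even when it is the tail. */
static bool file_is_listed(const struct file_model *f)
{
	return f->check_next != NULL;
}

static void check_list_add(struct check_list *l, struct file_model *f)
{
	if (file_is_listed(f))
		return;			/* already on the list */
	f->check_next = l->head;	/* tail gets LIST_END, never NULL */
	l->head = f;
}
```

With a NULL-terminated list the first file added would carry a NULL
link and look exactly like a file that was never listed. The sentinel
keeps the two states distinguishable.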

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.eventpoll

for you to fetch changes up to a1e9718b406bc4e6d0c63c7b999d06febbdc4091:

  eventpoll: restore EP_UNACTIVE_PTR sentinel for ctx->tfile_check_list (2026-06-04 13:53:50 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.eventpoll

Please consider pulling these changes from the signed vfs-7.2-rc1.eventpoll tag.

Thanks!
Christian

----------------------------------------------------------------
Christian Brauner (20):
      eventpoll: expand top-of-file overview / locking doc
      eventpoll: document loop-check / path-check globals
      eventpoll: clarify POLLFREE handshake comments
      eventpoll: refresh epi_fget() / ep_remove_file() comments
      eventpoll: document ep_clear_and_put() two-pass pattern
      eventpoll: rename ep_refcount_dec_and_test() to ep_put()
      eventpoll: drop unused depth argument from epoll_mutex_lock()
      eventpoll: rename attach_epitem() to ep_attach_file()
      eventpoll: relocate KCMP helpers near compat syscalls
      eventpoll: split ep_insert() into alloc + register stages
      eventpoll: split ep_clear_and_put() into drain helpers
      eventpoll: extract ep_deliver_event() from ep_send_events()
      eventpoll: extract lock dance from do_epoll_ctl() into ep_ctl_lock()
      eventpoll: wrap EP_UNACTIVE_PTR in typed sentinel helpers
      eventpoll: rename epi->next and txlist for clarity
      eventpoll: use bool for predicate helpers
      eventpoll: hoist CTL_ADD scratch state into struct ep_ctl_ctx
      Merge patch series "eventpoll: clarity refactor"
      Merge patch series "io_uring related epoll cleanups"
      Merge patch series "eventpoll: Fix epoll_wait() report false negative"

Jens Axboe (4):
      eventpoll: pass struct epoll_filefd through ep_find() and ep_insert()
      eventpoll: export is_file_epoll()
      eventpoll: add file based control interface
      eventpoll: rename struct epoll_filefd to epoll_key

Nam Cao (2):
      selftests/eventpoll: Add test for multiple waiters
      eventpoll: Fix epoll_wait() report false negative

Randy Dunlap (1):
      eventpoll: add missing kernel-doc for @ctx function parameters

Zhan Wei (1):
      eventpoll: restore EP_UNACTIVE_PTR sentinel for ctx->tfile_check_list

 fs/eventpoll.c                                     | 1275 +++++++++++++-------
 include/linux/eventpoll.h                          |    8 +
 .../filesystems/epoll/epoll_wakeup_test.c          |   45 +
 3 files changed, 886 insertions(+), 442 deletions(-)


* [GIT PULL 13/16 for v7.2] vfs iomap
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (11 preceding siblings ...)
  2026-06-12 15:15 ` [GIT PULL 12/16 for v7.2] vfs eventpoll Christian Brauner
@ 2026-06-12 15:15 ` Christian Brauner
  2026-06-12 15:15 ` [GIT PULL 14/16 for v7.2] vfs xattr Christian Brauner
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This contains the iomap changes for this cycle:

* Add the vfs infrastructure required to implement fs-verity support
  for XFS with a post-EOF merkle tree: fsverity generates and stores a
  zero-block hash, and iomap learns to verify data on buffered reads,
  to handle fsverity during writeback via the new IOMAP_F_FSVERITY
  flag, and to write fsverity metadata through iomap_fsverity_write().

* Skip the memset of the iomap in iomap_iter() once the iteration is
  done. In high-IOPS scenarios (4k randread NVMe polling via io_uring)
  the pointless memset wasted memory write bandwidth; skipping it
  improves IOPS by about 5% on ext4 and xfs.

* Add balance_dirty_pages_ratelimited() to iomap_zero_iter(), aligning
  it with iomap_write_iter(). This prepares for the exFAT iomap
  conversion where zeroing beyond valid_size can trigger large-scale
  zeroing operations that caused memory pressure without throttling.

* Remove the over-strict inline data boundary check. If a filesystem
  provides a valid inline_data pointer and length, there is no reason
  to forbid inline data from crossing a page boundary.

* Don't make REQ_POLLED imply REQ_NOWAIT, matching the earlier
  equivalent block layer fix: there are valid cases to poll for I/O
  completion without REQ_NOWAIT, and REQ_NOWAIT for file system writes
  is currently not supported as writes aren't idempotent.

* Introduce IOMAP_F_ZERO_TAIL for filesystems that maintain a separate
  valid data length (exFAT, NTFS). For a write starting at or beyond
  valid_size, __iomap_write_begin() now zeroes only the tail portion
  of the block while preserving valid data before it, instead of
  leaving stale data in the page cache. The flag is also added to the
  iomap trace event strings.
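
  As a rough illustration of the IOMAP_F_ZERO_TAIL idea, the sketch
  below computes which bytes of a single block would need zeroing for
  a write at or beyond valid_size. It is a userspace model under
  stated assumptions: zero_tail_range() and zero_tail() are
  hypothetical helpers, the "tail" is taken to run from valid_size to
  the end of the block, and the real __iomap_write_begin() logic may
  differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: compute which bytes of one block need zeroing
 * before a write that starts at or beyond valid_size. The offsets in
 * *from and *to are relative to the start of the block.
 */
static bool zero_tail_range(uint64_t block_start, size_t block_size,
			    uint64_t valid_size, size_t *from, size_t *to)
{
	uint64_t block_end = block_start + block_size;

	if (valid_size >= block_end)
		return false;	/* the whole block holds valid data */

	/* Keep everything below valid_size, zero the rest of the block. */
	*from = valid_size > block_start ?
		(size_t)(valid_size - block_start) : 0;
	*to = block_size;
	return true;
}

/* Apply the computed range to an in-memory copy of the block. */
static void zero_tail(unsigned char *block, uint64_t block_start,
		      size_t block_size, uint64_t valid_size)
{
	size_t from, to;

	if (!zero_tail_range(block_start, block_size, valid_size, &from, &to))
		return;
	assert(from <= to && to <= block_size);
	memset(block + from, 0, to - from);
}
```

  For a 4096-byte block starting at file offset 4096 and a valid_size
  of 5000, the first 904 bytes of the block are preserved and the
  remaining bytes are zeroed in this model.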

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The exfat tree carries the same "iomap: introduce IOMAP_F_ZERO_TAIL
flag" patch as this tree as a dependency for the exFAT iomap
conversion, so the merge gets a conflict in include/linux/iomap.h
together with the IOMAP_F_FSVERITY additions from this tree. Reported
in [1]. It can be resolved as follows:

[1]: https://lore.kernel.org/linux-next/aiKrepiU3-L6KRqJ@sirena.org.uk

diff --combined include/linux/iomap.h
index cea6bbc97b6ef,3582ed1fe2361..0000000000000
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@@ -91,6 -91,14 +91,14 @@@ struct vm_fault
  #endif /* CONFIG_BLK_DEV_INTEGRITY */
  #define IOMAP_F_ZERO_TAIL	(1U << 10)

+ /*
+  * Indicates reads and writes of fsverity metadata.
+  *
+  * Fsverity metadata is stored after the regular file data and thus beyond
+  * i_size.
+  */
+ #define IOMAP_F_FSVERITY	(1U << 11)
+
  /*
   * Flag reserved for file system specific usage
   */
@@@ -345,6 -353,9 +353,9 @@@ static inline bool iomap_want_unshare_i
  ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,
  		const struct iomap_ops *ops,
  		const struct iomap_write_ops *write_ops, void *private);
+ int iomap_fsverity_write(struct file *file, loff_t pos, size_t length,
+ 		const void *buf, const struct iomap_ops *ops,
+ 		const struct iomap_write_ops *write_ops);
  void iomap_read_folio(const struct iomap_ops *ops,
  		struct iomap_read_folio_ctx *ctx, void *private);
  void iomap_readahead(const struct iomap_ops *ops,
@@@ -421,6 -432,7 +432,7 @@@ struct iomap_ioend
  	loff_t			io_offset;	/* offset in the file */
  	sector_t		io_sector;	/* start sector of ioend */
  	void			*io_private;	/* file system private data */
+ 	struct fsverity_info	*io_vi;		/* fsverity info */
  	struct bio		io_bio;		/* MUST BE LAST! */
  };

@@@ -495,6 -507,7 +507,7 @@@ struct iomap_read_folio_ctx
  	struct readahead_control *rac;
  	void			*read_ctx;
  	loff_t			read_ctx_file_offset;
+ 	struct fsverity_info	*vi;
  };

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.iomap

for you to fetch changes up to b7bae6880e8de2a5f693c18d87ad5cc26f157eb2:

  iomap: Add IOMAP_F_ZERO_TAIL flag to trace event strings (2026-06-05 13:36:42 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.iomap

Please consider pulling these changes from the signed vfs-7.2-rc1.iomap tag.

Thanks!
Christian

----------------------------------------------------------------
Andrey Albershteyn (4):
      fsverity: generate and store zero-block hash
      iomap: introduce IOMAP_F_FSVERITY and teach writeback to handle fsverity
      iomap: teach iomap to read files with fsverity
      iomap: introduce iomap_fsverity_write() for writing fsverity metadata

Chi Zhiling (1):
      iomap: add dirty page control to iomap_zero_iter

Christian Brauner (1):
      Merge patch series "vfs infrastructure for fs-verity support for XFS with post EOF merkle tree"

Christoph Hellwig (1):
      iomap: don't make REQ_POLLED imply REQ_NOWAIT

Fengnan Chang (1):
      iomap: avoid memset iomap when iter is done

Namjae Jeon (3):
      iomap: remove over-strict inline data boundary check
      iomap: introduce IOMAP_F_ZERO_TAIL flag
      iomap: Add IOMAP_F_ZERO_TAIL flag to trace event strings

 fs/iomap/buffered-io.c       | 116 ++++++++++++++++++++++++++++++++++++++-----
 fs/iomap/direct-io.c         |   5 +-
 fs/iomap/ioend.c             |   1 +
 fs/iomap/iter.c              |  12 ++---
 fs/iomap/trace.h             |   4 +-
 fs/verity/fsverity_private.h |   3 ++
 fs/verity/measure.c          |   4 +-
 fs/verity/open.c             |   3 ++
 fs/verity/pagecache.c        |  22 ++++++++
 include/linux/bio.h          |  14 ------
 include/linux/fsverity.h     |   8 +++
 include/linux/iomap.h        |  27 ++++++----
 12 files changed, 170 insertions(+), 49 deletions(-)


* [GIT PULL 14/16 for v7.2] vfs xattr
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (12 preceding siblings ...)
  2026-06-12 15:15 ` [GIT PULL 13/16 for v7.2] vfs iomap Christian Brauner
@ 2026-06-12 15:15 ` Christian Brauner
  2026-06-12 15:16 ` [GIT PULL 15/16 for v7.2] vfs misc Christian Brauner
  2026-06-12 15:16 ` [GIT PULL 16/16 for v7.2] vfs procfs Christian Brauner
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This reworks the simple xattr api to make it more efficient and easier
to use for all consumers.

The simple_xattr hash table moves from the inode into a per-superblock
cache, removing the per-inode overhead for the common case of few or
no xattrs. The interface now passes struct simple_xattrs ** so lazy
allocation is handled internally instead of by every caller, kernfs
xattr operations on kernfs nodes shared between multiple superblocks
are properly serialized, and tmpfs constructs "security.foo" xattr
names with kasprintf() instead of kmalloc() plus two memcpy()s.
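
The two interface points above can be sketched in userspace C. This is
only an illustration under stated assumptions: struct xattr_store,
store_add() and friends are made-up stand-ins, not the simple_xattr
API, and snprintf() plus malloc() stands in for kasprintf(). The
callers hand in the address of their store pointer and the store is
allocated on first use; the name is built with one formatted
allocation instead of a sized allocation plus two copies.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct xattr_entry {
	char *name;
	struct xattr_entry *next;
};

/* Hypothetical stand-in for struct simple_xattrs. */
struct xattr_store {
	struct xattr_entry *head;
	size_t count;
};

/* Build "<prefix><suffix>" with a single formatted allocation. */
static char *make_name(const char *prefix, const char *suffix)
{
	int len = snprintf(NULL, 0, "%s%s", prefix, suffix);
	char *name;

	if (len < 0)
		return NULL;
	name = malloc((size_t)len + 1);
	if (name)
		snprintf(name, (size_t)len + 1, "%s%s", prefix, suffix);
	return name;
}

/* Callers pass &store; the store itself is allocated on first use. */
static int store_add(struct xattr_store **sp, const char *prefix,
		     const char *suffix)
{
	struct xattr_entry *e;

	assert(sp != NULL);
	if (!*sp) {
		*sp = calloc(1, sizeof(**sp));
		if (!*sp)
			return -1;
	}
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->name = make_name(prefix, suffix);
	if (!e->name) {
		free(e);
		return -1;
	}
	e->next = (*sp)->head;
	(*sp)->head = e;
	(*sp)->count++;
	return 0;
}

/* Lookup works on an unallocated store too and simply finds nothing. */
static int store_has(struct xattr_store *const *sp, const char *name)
{
	const struct xattr_entry *e;

	for (e = *sp ? (*sp)->head : NULL; e; e = e->next)
		if (strcmp(e->name, name) == 0)
			return 1;
	return 0;
}

static void store_free(struct xattr_store **sp)
{
	struct xattr_entry *e, *next;

	if (!*sp)
		return;
	for (e = (*sp)->head; e; e = next) {
		next = e->next;
		free(e->name);
		free(e);
	}
	free(*sp);
	*sp = NULL;
}
```

An object that never gets an xattr keeps a NULL pointer and pays
nothing beyond that pointer, which is the point of handling the lazy
allocation behind the interface.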

A follow-up fix links kernfs nodes to their parent before the LSM init
hook runs: with the per-sb cache kernfs_xattr_set() computes the cache
via kernfs_root(kn), which faulted on a freshly allocated node when
selinux_kernfs_init_security() called into it - reproducible as a NULL
pointer dereference on the first cgroup mkdir on SELinux-enabled
systems.

On top of this bpffs gains support for trusted.* and security.* xattrs
so that user space and BPF LSM programs can attach metadata - for
example a content hash or a security label - to pinned objects and
directories and inspect it uniformly like on other filesystems. The
store is in-memory and non-persistent, living only for the lifetime of
the mount like everything else in bpffs.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

This has merge conflicts with the bpf-next tree in kernel/bpf/inode.c
between commit 9722955b54307 ("bpf: Add simple xattr support to
bpffs") from this tree and commit b93c55b4932dd ("bpf: fix UAF by
restoring RCU-delayed inode freeing in bpffs") from the bpf-next tree,
and in include/linux/bpf.h. Reported in [1] and [2]; Daniel confirmed
the resolution in [3]. They can be resolved as follows:

[1]: https://lore.kernel.org/linux-next/aiF2rsdpUb5LuhmZ@sirena.org.uk
[2]: https://lore.kernel.org/linux-next/aiamrLm8DnCP6dbw@sirena.org.uk
[3]: https://lore.kernel.org/linux-next/8906796e-0542-46d2-bb92-9e49642d86dc@iogearbox.net

diff --cc kernel/bpf/inode.c
index c3f79b5a2f8c0,188c774a469ca..0000000000000
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@@ -842,9 -768,12 +842,13 @@@ static void bpf_destroy_inode(struct in

  	if (!bpf_inode_type(inode, &type))
  		bpf_any_put(inode->i_private, type);
 +	simple_xattrs_free(&opts->xa_cache, &bi->xattrs, NULL);
  }

+ /*
+  * Called after RCU grace period - safe to free inode and anything
+  *  that might be accessed by RCU pathwalk (inode fields, i_link).
+  */
  static void bpf_free_inode(struct inode *inode)
  {
  	if (S_ISLNK(inode->i_mode))

diff --cc include/linux/bpf.h
index 64efc3fdb7163,62bba7a4876f5..0000000000000
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@@ -31,7 -32,8 +32,9 @@@
  #include <linux/static_call.h>
  #include <linux/memcontrol.h>
  #include <linux/cfi.h>
+ #include <linux/key.h>
+ #include <linux/ftrace.h>
 +#include <linux/xattr.h>
  #include <asm/rqspinlock.h>

  struct bpf_verifier_env;

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.xattr

for you to fetch changes up to 9722955b54307e9070994f2382ec06af3d7405e0:

  bpf: Add simple xattr support to bpffs (2026-06-06 15:22:44 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.xattr

Please consider pulling these changes from the signed vfs-7.2-rc1.xattr tag.

Thanks!
Christian

----------------------------------------------------------------
Christian Brauner (2):
      Merge patch series "Rework simple xattrs"
      kernfs: link kn to its parent before the LSM init hook

Daniel Borkmann (1):
      bpf: Add simple xattr support to bpffs

Miklos Szeredi (4):
      kernfs: fix xattr race condition with multiple superblocks
      tmpfs: simplify constructing "security.foo" xattr names
      simple_xattr: change interface to pass struct simple_xattrs **
      simpe_xattr: use per-sb cache

 fs/kernfs/dir.c             |  22 ++--
 fs/kernfs/file.c            |  13 +--
 fs/kernfs/inode.c           |  36 +++---
 fs/kernfs/kernfs-internal.h |  24 +++-
 fs/kernfs/mount.c           |   2 +-
 fs/pidfs.c                  |  45 ++-----
 fs/xattr.c                  | 278 ++++++++++++++++++++++++++------------------
 include/linux/bpf.h         |   3 +
 include/linux/kernfs.h      |  11 +-
 include/linux/shmem_fs.h    |   3 +-
 include/linux/xattr.h       |  39 ++++---
 kernel/bpf/inode.c          | 256 +++++++++++++++++++++++++++++++++++++---
 mm/shmem.c                  |  50 +++-----
 net/socket.c                |  30 ++---
 14 files changed, 526 insertions(+), 286 deletions(-)


* [GIT PULL 15/16 for v7.2] vfs misc
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (13 preceding siblings ...)
  2026-06-12 15:15 ` [GIT PULL 14/16 for v7.2] vfs xattr Christian Brauner
@ 2026-06-12 15:16 ` Christian Brauner
  2026-06-12 15:16 ` [GIT PULL 16/16 for v7.2] vfs procfs Christian Brauner
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

Features

- Reduce pipe->mutex contention by pre-allocating pages outside the
  lock in anon_pipe_write().

  anon_pipe_write() called alloc_page() once per page while holding
  pipe->mutex. The allocation can sleep doing direct reclaim and runs
  memcg charging, which extends the critical section and stalls any
  concurrent reader on the same mutex. Now up to 8 pages are
  pre-allocated before the mutex is taken, leftovers are recycled into
  the per-pipe tmp_page[] cache before unlock, and any remainder is
  released after unlock, keeping the allocator out of the critical
  section on both sides. In a writers x readers sweep with 64 KB
  writes against a 1 MB pipe, throughput improves by 6-28% and average
  write latency drops by 5-22%; under memory pressure - when the cost
  of holding the mutex across reclaim is highest - throughput improves
  by 21-48% and latency drops by 17-33%. The pipe_bench microbenchmark
  is added to selftests.

- uaccess/sockptr: fix the ignored_trailing logic in
  copy_struct_to_user() to behave as documented and the usize check in
  copy_struct_from_sockptr() for user pointers, and add
  copy_struct_{from,to}_bounce_buffer() and copy_struct_to_sockptr()
  helpers for upcoming users (IPPROTO_SMBDIRECT, IPPROTO_QUIC).

- bpf: add a sleepable bpf_real_inode() kfunc that resolves the real
  inode backing a dentry via d_real_inode(). On overlayfs the inode
  attached to the dentry doesn't carry the underlying device
  information; this is used by the filesystem restriction BPF program
  that was merged into systemd.

- docs: add guidelines for submitting new filesystems, motivated by
  the maintenance burden that abandoned and untestable filesystems
  impose on VFS developers, which blocks infrastructure work such as
  folio conversions and iomap migration.
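
The anon_pipe_write() change described above follows a general
pattern: allocate before taking the lock, consume under the lock,
recycle leftovers into a small cache before unlocking, and free the
remainder after unlocking. The sketch below models that in userspace.
It is not the kernel code: struct pipe_model, the slot counts and all
helper names are assumptions made for illustration, and a boolean
stands in for pipe->mutex so that the lock discipline can be asserted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define PREALLOC_MAX	8	/* "up to 8 pages" from the text above */
#define CACHE_SLOTS	2	/* assumed size of the per-pipe cache */
#define PIPE_SLOTS	4	/* small on purpose so leftovers occur */
#define PAGE_SZ		4096

struct pipe_model {
	bool locked;			/* stands in for pipe->mutex */
	void *bufs[PIPE_SLOTS];		/* pages holding written data */
	size_t nbufs;
	void *cache[CACHE_SLOTS];	/* stands in for tmp_page[] */
};

static void pipe_lock(struct pipe_model *p)
{
	assert(!p->locked);
	p->locked = true;
}

static void pipe_unlock(struct pipe_model *p)
{
	assert(p->locked);
	p->locked = false;
}

/* Returns the number of pages actually written into the pipe. */
static size_t pipe_write_model(struct pipe_model *p, size_t npages)
{
	void *pre[PREALLOC_MAX];
	size_t want = npages < PREALLOC_MAX ? npages : PREALLOC_MAX;
	size_t have = 0, used = 0, next, i;

	/* 1. Allocate before taking the lock. */
	while (have < want) {
		void *page = malloc(PAGE_SZ);

		if (!page)
			break;
		pre[have++] = page;
	}

	pipe_lock(p);
	/* 2. Consume pre-allocated pages while the pipe has room. */
	while (used < have && p->nbufs < PIPE_SLOTS)
		p->bufs[p->nbufs++] = pre[used++];
	/* 3. Recycle leftovers into empty cache slots before unlock. */
	next = used;
	for (i = 0; i < CACHE_SLOTS && next < have; i++)
		if (!p->cache[i])
			p->cache[i] = pre[next++];
	pipe_unlock(p);

	/* 4. Free whatever is still left, outside the lock. */
	while (next < have)
		free(pre[next++]);

	return used;
}

/* Release everything the model still owns. */
static void pipe_drain(struct pipe_model *p)
{
	size_t i;

	for (i = 0; i < p->nbufs; i++)
		free(p->bufs[i]);
	p->nbufs = 0;
	for (i = 0; i < CACHE_SLOTS; i++) {
		free(p->cache[i]);
		p->cache[i] = NULL;
	}
}
```

Both malloc() and free() stay outside the locked region, so a slow
allocation or release no longer extends the time other users of the
lock have to wait.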

Fixes

- libfs: set SB_I_NOEXEC and SB_I_NODEV by default in init_pseudo()
  and drop the now-redundant assignments in callers. This began as a
  one-line dma-buf fix for a path_noexec() warning; a pseudo
  filesystem has no reason not to set SB_I_NOEXEC. All init_pseudo()
  callers were audited: the only visible effect is on dma-buf where
  SB_I_NOEXEC silences the warning.

- Handle set_blocksize() failures in legacy filesystems (bfs, hpfs,
  qnx4, jfs, befs, affs, isofs, minix, ntfs3, omfs). Mounting a device
  with a sector size > PAGE_SIZE crashed roughly half of them; the
  rest had the same missing error handling pattern. A follow-up
  releases the superblock buffer_head when setting the minix v3 block
  size fails.

- mount: honour SB_NOUSER in the new mount API.

- fs/fcntl: fix a SOFTIRQ-unsafe lock order in fasync signaling by
  switching the process-group paths of send_sigio() and send_sigurg()
  from read_lock(&tasklist_lock) to RCU, matching the single-PID path.

- vfs: add an FS_USERNS_DELEGATABLE flag and set it for NFS, fixing
  delegated NFS mounts (fsopen() in a container with the mount
  performed by a privileged daemon) that broke when non-init s_user_ns
  was tied to FS_USERNS_MOUNT.

- selftests/namespaces: fix a hang in nsid_test where an unreaped
  grandchild kept the TAP pipe write-end open, a waitpid(-1) race in
  listns_efault_test, and a false FAIL on kernels without listns()
  where the tests should SKIP.

- filelock: fix the break_lease() stub signature for
  CONFIG_FILE_LOCKING=n.

- init/initramfs_test: wait for the async initramfs unpacking before
  running; the test and do_populate_rootfs() share the parser state.

- fs/coredump: reduce redundant log noise in
  validate_coredump_safety().

- iomap: pass the correct length to fserror_report_io() in
  __iomap_write_begin().

- backing-file: fix the backing_file_open() kerneldoc.

Cleanups

- initramfs: refactor the cpio hex header parsing to use hex2bin()
  instead of the custom simple_strntoul() helper, whose addition is
  reverted, and extend the initramfs KUnit tests to cover header
  fields with 0x prefixes.

- Replace __get_free_pages() and friends with kmalloc()/kzalloc()
  across quota, proc, ocfs2/dlm, nilfs2, nfs, nfsd, libfs, jfs, jbd2,
  isofs, fuse, select, namespace, configfs, binfmt_misc, bfs, and the
  do_mounts init code - part of the larger work of replacing page
  allocator calls with kmalloc().

- Use clear_and_wake_up_bit() in unlock_buffer() and
  journal_end_buffer_io_sync() instead of open-coding the sequence.

- Drop unused VFS exports: unexport drop_super_exclusive(), remove
  start_removing_user_path_at(), and fold __start_removing_path() into
  start_removing_path().

- fs/read_write: narrow the __kernel_write() export with
  EXPORT_SYMBOL_FOR_MODULES().

- vfs: uapi: retire octal and hex constants in favor of (1 << n) for
  the O_ flags. Finding a free bit for a new flag across the
  architectures was needlessly hard with the mixed bases.

- dcache: add extra sanity checks of dead dentries in dentry_free()
  via a new DENTRY_WARN_ONCE() that also prints d_flags.

- iov_iter: use kmemdup_array() in dup_iter() to harden the allocation
  against multiplication overflow.

- fs/pipe: write to ->poll_usage only once.

- vfs: remove an always-taken if-branch in find_next_fd().

- dcache: use kmalloc_flex() for struct external_name in __d_alloc().

- namei: use QSTR() instead of QSTR_INIT() in path_pts().

- sync_file_range: delete dead S_ISLNK code.

- Comment fixes: retire a stale comment in fget_task_next() and fix
  assorted spelling mistakes.
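
To make the O_ flags cleanup above more concrete, the sketch below
shows why the (1 << n) notation helps. The DEMO_* constants are local
copies of a few asm-generic values written in octal, used purely for
illustration, and free_bit_everywhere() is a hypothetical helper, not
something in the tree. With shifts the bit number can be read off
directly; with octal or hex it has to be worked out per constant and
per architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local illustrative copies of a few asm-generic O_ flag values. */
#define DEMO_O_CREAT	00000100
#define DEMO_O_EXCL	00000200
#define DEMO_O_NOCTTY	00000400
#define DEMO_O_TRUNC	00001000

/* The same values in shift notation: the bit number is visible. */
static_assert(DEMO_O_CREAT == (1 << 6), "O_CREAT is bit 6");
static_assert(DEMO_O_EXCL == (1 << 7), "O_EXCL is bit 7");
static_assert(DEMO_O_NOCTTY == (1 << 8), "O_NOCTTY is bit 8");
static_assert(DEMO_O_TRUNC == (1 << 9), "O_TRUNC is bit 9");

#define DEMO_USED \
	(DEMO_O_CREAT | DEMO_O_EXCL | DEMO_O_NOCTTY | DEMO_O_TRUNC)

/* Return the lowest bit not set in used, or -1 if all 32 are taken. */
static int lowest_free_bit(uint32_t used)
{
	int bit;

	for (bit = 0; bit < 32; bit++)
		if (!(used & (UINT32_C(1) << bit)))
			return bit;
	return -1;
}

/*
 * Hypothetical helper: a new flag needs a bit that is free in every
 * architecture's mask, so OR the masks together before searching.
 */
static int free_bit_everywhere(const uint32_t *used, size_t n)
{
	uint32_t all = 0;
	size_t i;

	for (i = 0; i < n; i++)
		all |= used[i];
	return lowest_free_bit(all);
}
```

Given one mask per architecture, free_bit_everywhere() returns the
lowest bit that is unused in all of them, which is the question that
was hard to answer by eye with mixed octal and hex constants.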

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

This has a merge conflict with the ext4 tree in fs/jbd2/journal.c
between commit bbe9015f23432b ("jbd2: remove special jbd2 slabs") from
the ext4 tree and commit 2f6702dc6fdcf0 ("jbd2: replace
__get_free_pages() with kmalloc()") from this tree. The change in this
tree is a subset of the ext4 tree's commit, so the conflict can be
resolved by taking the ext4 side. Reported in [1].

It was suggested in [2] to drop the patch from this tree. But the
patch is part of the merged "fs: replace __get_free_pages() call with
kmalloc()" series with a dozen commits on top of it, so dropping it
would have meant rewriting the whole branch after it had been exposed
in linux-next. Since the change is a strict subset of the ext4 commit,
taking the ext4 side during the merge yields the identical end result.

[1]: https://lore.kernel.org/linux-next/aiq8CByJNMlXo6Be@sirena.co.uk
[2]: https://lore.kernel.org/linux-next/airBGjtjTf3Yuy0X@casper.infradead.org

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.misc

for you to fetch changes up to aa5c4fe3ba0cb2af90bbcfa7a8ef4fefcd5c2370:

  backing-file: fix backing_file_open() kerneldoc parameter (2026-06-10 09:49:25 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.misc

Please consider pulling these changes from the signed vfs-7.2-rc1.misc tag.

Thanks!
Christian

----------------------------------------------------------------
Agatha Isabelle Moreira (2):
      fs: buffer: use clear_and_wake_up_bit() in unlock_buffer()
      fs: jbd2: use clear_and_wake_up_bit() in journal_end_buffer_io_sync()

Al Viro (1):
      mount: honour SB_NOUSER in the new mount API

Alexey Dobriyan (1):
      sync_file_range: delete dead S_ISLNK code

Amir Goldstein (1):
      docs: add guidelines for submitting new filesystems

Andy Shevchenko (5):
      initramfs: Sort headers alphabetically
      initramfs: Refactor to use hex2bin() instead of custom approach
      vsprintf: Revert "add simple_strntoul"
      kstrtox: Drop extern keyword in the simple_strtox() declarations
      fs/read_write: Do not export __kernel_write() to the entire world

Breno Leitao (2):
      fs/pipe: pre-allocate pages outside pipe->mutex in anon_pipe_write
      selftests/pipe: add pipe_bench microbenchmark

Christian Brauner (11):
      Merge patch series "uaccess/sockptr: copy_struct_ fixes and more helpers"
      Merge patch series "selftests/namespaces: Fix test hangs and false failures"
      Merge patch series "initramfs: test and improve cpio hex header validation"
      Merge patch series "drop unused VFS exports"
      Merge patch series "fix crashes when mounting legacy file system with sector size > PAGE_SIZE"
      Merge patch series "fs: refactor code to use clear_and_wake_up_bit()"
      Merge patch series "fs: replace __get_free_pages() call with kmalloc()"
      Merge patch series "fs/pipe: reduce pipe->mutex contention by pre-allocating outside the lock"
      Merge patch series "libfs: set SB_I_NOEXEC and SB_I_NODEV in init_pseudo()"
      bpf: add bpf_real_inode() kfunc
      filelock: fix break_lease() stub signature for CONFIG_FILE_LOCKING=n

Christoph Hellwig (15):
      fs: unexport drop_super_exclusive
      fs: remove start_removing_user_path_at
      fs: fold __start_removing_path into start_removing_path
      bfs: handle set_blocksize failures
      hpfs: handle set_blocksize failures
      qnx4: handle set_blocksize failures
      jfs: handle set_blocksize failures
      befs: handle set_blocksize failures
      affs: handle set_blocksize failures
      isofs: handle set_blocksize failures
      minix: handle set_blocksize failures
      ntfs3: handle set_blocksize failures
      omfs: handle set_blocksize failures
      minix: release the sb buffer_head when setting the v3 block size fails
      iomap: pass the correct len to fserror_report_io in __iomap_write_begin

David Disseldorp (2):
      initramfs_test: add fill_cpio() inject_ox parameter
      initramfs_test: test header fields with 0x hex prefix

Jeff Layton (2):
      dcache: add extra sanity checks of the dentry in dentry_free()
      vfs: add FS_USERNS_DELEGATABLE flag and set it for NFS

Jia He (1):
      init/initramfs_test: wait_for_initramfs() before running

John Hubbard (2):
      libfs: set SB_I_NOEXEC and SB_I_NODEV by default in init_pseudo()
      libfs: drop redundant SB_I_NOEXEC/SB_I_NODEV in init_pseudo() callers

Jori Koolstra (2):
      vfs: remove always taken if-branch in find_next_fd()
      vfs: uapi: retire octal and hex numbers in favor of (1 << n) for O_ flags

Li RongQing (1):
      fs/coredump: reduce redundant log noise in validate_coredump_safety

Li Wang (1):
      backing-file: fix backing_file_open() kerneldoc parameter

Mateusz Guzik (2):
      fs/pipe: write to ->poll_usage only once
      fs: retire stale comment in fget_task_next()

Mike Rapoport (Microsoft) (18):
      init: do_mounts: use kmalloc() for allocations of temporary buffers
      quota: allocate dquot_hash with kmalloc()
      proc: replace __get_free_page() with kmalloc()
      ocfs2/dlm: replace __get_free_page() with kmalloc()
      nilfs2: replace get_zeroed_page() with kzalloc()
      NFS: replace __get_free_page() with kmalloc() in nfs_show_devname()
      NFS: remove unused page and page2 in nfs4_replace_transport()
      NFSD: replace __get_free_page() with kmalloc() in nfsd_buffered_readdir()
      libfs: simple_transaction_get(): replace get_zeroed_page() with kzalloc()
      jfs: replace __get_free_page() with kmalloc()
      jbd2: replace __get_free_pages() with kmalloc()
      isofs: replace __get_free_page() with kmalloc()
      fuse: replace __get_free_page() with kmalloc()
      fs/select: replace __get_free_page() with kmalloc()
      fs/namespace: use __getname() to allocate mntpath buffer
      configfs: replace __get_free_pages() with kzalloc()
      binfmt_misc: replace __get_free_page() with kmalloc()
      bfs: replace get_zeroed_page() with kzalloc()

Mingyu Wang (1):
      fs/fcntl: fix SOFTIRQ-unsafe lock order in fasync signaling

Qingshuang Fu (1):
      fs: fix spelling mistakes in comment

Ricardo B. Marlière (3):
      selftests/namespaces: Kill grandchild in nsid fixture teardown
      selftests/namespaces: Fix waitpid race in listns_efault_test cleanup
      selftests/namespaces: Skip efault tests when listns() is not available

Stefan Metzmacher (5):
      uaccess: fix ignored_trailing logic in copy_struct_to_user()
      sockptr: fix usize check in copy_struct_from_sockptr() for user pointers
      uaccess: add copy_struct_{from,to}_bounce_buffer() helpers
      sockptr: let copy_struct_from_sockptr() use copy_struct_from_bounce_buffer()
      sockptr: introduce copy_struct_to_sockptr()

Thorsten Blum (2):
      dcache: use kmalloc_flex() in __d_alloc
      namei: use QSTR() instead of QSTR_INIT() in path_pts

Wang Haoran (1):
      iov_iter: use kmemdup_array for dup_iter to harden against overflow

 .../filesystems/adding-new-filesystems.rst         | 195 +++++++
 Documentation/filesystems/index.rst                |   1 +
 Documentation/filesystems/porting.rst              |   1 -
 arch/alpha/include/uapi/asm/fcntl.h                |  34 +-
 arch/arm/include/uapi/asm/fcntl.h                  |   8 +-
 arch/arm64/include/uapi/asm/fcntl.h                |   8 +-
 arch/m68k/include/uapi/asm/fcntl.h                 |   8 +-
 arch/mips/include/uapi/asm/fcntl.h                 |  22 +-
 arch/parisc/include/uapi/asm/fcntl.h               |  28 +-
 arch/powerpc/include/uapi/asm/fcntl.h              |   8 +-
 arch/sparc/include/uapi/asm/fcntl.h                |  34 +-
 fs/affs/affs.h                                     |   5 -
 fs/affs/super.c                                    |   6 +-
 fs/aio.c                                           |   1 -
 fs/anon_inodes.c                                   |   2 -
 fs/backing-file.c                                  |  13 +-
 fs/befs/linuxvfs.c                                 |   3 +-
 fs/bfs/inode.c                                     |   7 +-
 fs/binfmt_misc.c                                   |   4 +-
 fs/bpf_fs_kfuncs.c                                 |  16 +
 fs/buffer.c                                        |   4 +-
 fs/configfs/file.c                                 |   7 +-
 fs/coredump.c                                      |   3 +-
 fs/dcache.c                                        |  18 +-
 fs/exec.c                                          |   6 +-
 fs/fcntl.c                                         |   8 +-
 fs/file.c                                          |  18 +-
 fs/file_table.c                                    |   4 +-
 fs/fuse/ioctl.c                                    |   5 +-
 fs/hpfs/super.c                                    |   3 +-
 fs/iomap/buffered-io.c                             |   2 +-
 fs/isofs/dir.c                                     |   5 +-
 fs/isofs/inode.c                                   |   3 +-
 fs/jbd2/commit.c                                   |   4 +-
 fs/jbd2/journal.c                                  |   7 +-
 fs/jfs/jfs_dtree.c                                 |  16 +-
 fs/jfs/super.c                                     |   3 +-
 fs/libfs.c                                         |   7 +-
 fs/minix/inode.c                                   |   3 +-
 fs/namei.c                                         |  25 +-
 fs/namespace.c                                     |  11 +-
 fs/nfs/fs_context.c                                |   8 +-
 fs/nfs/nfs4namespace.c                             |  15 +-
 fs/nfs/super.c                                     |   4 +-
 fs/nfsd/vfs.c                                      |   4 +-
 fs/nilfs2/ioctl.c                                  |   4 +-
 fs/nsfs.c                                          |   1 -
 fs/ntfs3/super.c                                   |   8 +-
 fs/ocfs2/dlm/dlmdebug.c                            |  24 +-
 fs/ocfs2/dlm/dlmdomain.c                           |   8 +-
 fs/ocfs2/dlm/dlmmaster.c                           |   5 +-
 fs/ocfs2/dlm/dlmrecovery.c                         |   4 +-
 fs/omfs/inode.c                                    |   6 +-
 fs/pidfs.c                                         |   2 -
 fs/pipe.c                                          | 106 +++-
 fs/proc/base.c                                     |  16 +-
 fs/qnx4/inode.c                                    |   3 +-
 fs/quota/dquot.c                                   |  11 +-
 fs/read_write.c                                    |   5 +-
 fs/select.c                                        |   4 +-
 fs/super.c                                         |  12 +-
 fs/sync.c                                          |   3 +-
 include/linux/filelock.h                           |   2 +-
 include/linux/fs.h                                 |   1 +
 include/linux/kstrtox.h                            |   9 +-
 include/linux/namei.h                              |   1 -
 include/linux/sockptr.h                            |  28 +-
 include/linux/uaccess.h                            |  65 ++-
 include/uapi/asm-generic/fcntl.h                   |  50 +-
 init/do_mounts.c                                   |  21 +-
 init/initramfs.c                                   |  68 ++-
 init/initramfs_test.c                              |  97 +++-
 lib/iov_iter.c                                     |   8 +-
 lib/vsprintf.c                                     |   7 -
 mm/secretmem.c                                     |   2 -
 tools/testing/selftests/Makefile                   |   1 +
 .../selftests/namespaces/listns_efault_test.c      |  33 +-
 tools/testing/selftests/namespaces/nsid_test.c     |  14 +-
 tools/testing/selftests/pipe/.gitignore            |   1 +
 tools/testing/selftests/pipe/Makefile              |   9 +
 tools/testing/selftests/pipe/pipe_bench.c          | 616 +++++++++++++++++++++
 virt/kvm/guest_memfd.c                             |   2 -
 82 files changed, 1464 insertions(+), 390 deletions(-)
 create mode 100644 Documentation/filesystems/adding-new-filesystems.rst
 create mode 100644 tools/testing/selftests/pipe/.gitignore
 create mode 100644 tools/testing/selftests/pipe/Makefile
 create mode 100644 tools/testing/selftests/pipe/pipe_bench.c

* [GIT PULL 16/16 for v7.2] vfs procfs
  2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
                   ` (14 preceding siblings ...)
  2026-06-12 15:16 ` [GIT PULL 15/16 for v7.2] vfs misc Christian Brauner
@ 2026-06-12 15:16 ` Christian Brauner
  15 siblings, 0 replies; 17+ messages in thread
From: Christian Brauner @ 2026-06-12 15:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */

This contains the procfs changes for this cycle:

* Revamp fs/filesystems.c

  The file was a mess with a hand-rolled linked list in desperate need
  of a cleanup. The filesystems list is now RCU-ified, /proc files can
  be marked permanent from outside fs/proc/, and the string emitted
  when reading /proc/filesystems is pre-generated and cached instead
  of pointer-chasing and printfing entry by entry on every read.
  /proc/filesystems is read frequently because libselinux reads it and
  libselinux is linked into numerous frequently used programs (even
  ones you would not suspect, like sed!). Scalability also improves
  since reference maintenance on open/close is bypassed.

  open+read+close cycle single-threaded (ops/s):
  before: 442732
  after:  1063462 (+140%)

  open+read+close cycle with 20 processes (ops/s):
  before: 606177
  after:  3300576 (+444%)

  A follow-up patch adds missing unlocks in some corner cases and
  tidies things up.

* Relax the mount visibility check for subset=pid mounts

  When procfs is mounted with subset=pid, all static files become
  unavailable and only the dynamic pid information is accessible. In
  that case there is no point in imposing the full mount visibility
  restrictions on the mounter - everything that can be hidden in
  procfs is already inaccessible. These restrictions prevented procfs
  from being mounted inside rootless containers since almost all
  container implementations overmount parts of procfs to hide certain
  directories.

  As part of this, /proc/self/net is only shown in subset=pid mounts
  for CAP_NET_ADMIN, reconfiguring subset=pid is rejected, the
  SB_I_USERNS_VISIBLE superblock flag is replaced with an
  FS_USERNS_MOUNT_RESTRICTED filesystem flag, fully visible mounts are
  recorded in a list, and the mount restrictions are finally
  documented.
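
  For illustration, a pid-only procfs instance is requested from user
  space with the subset=pid mount option. The sketch below uses the
  plain mount(2) interface; mount_proc_subset_pid() is a hypothetical
  helper name, the mount flags are a common hardening choice rather
  than something this series requires, and whether the call succeeds
  inside a rootless container depends on the kernel carrying the
  relaxed visibility check.

```c
#include <errno.h>
#include <sys/mount.h>

/*
 * Mount a procfs instance at target that only exposes the dynamic
 * per-pid entries. Returns 0 on success or a negative errno value.
 */
static int mount_proc_subset_pid(const char *target)
{
	unsigned long flags = MS_NOSUID | MS_NODEV | MS_NOEXEC;

	if (mount("proc", target, "proc", flags, "subset=pid") < 0)
		return -errno;
	return 0;
}
```

  A caller would typically create a mount namespace first and treat a
  negative return value (for example -EPERM) as the signal that the
  visibility restrictions still apply.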

* Protect ptrace_may_access() with exec_update_lock in procfs

  Most uses of ptrace_may_access() in procfs should hold
  exec_update_lock to avoid TOCTOU issues with concurrent privileged
  execve() (like setuid binary execution). This fixes the easy cases -
  the owner and visibility checks and the FD link permission checks -
  with the gnarlier ones to follow later.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

This will have a merge conflict with the vfs misc pull request:
[1]: https://lore.kernel.org/20260612-vfs-misc-v72-13d57389d260@brauner

Both add a new fs_flags define at the same location in
include/linux/fs.h. The bit values don't overlap. It can be resolved
as follows:

diff --cc include/linux/fs.h
index 10d35a68f597,e7ff9f8b1485..dcd0575a3830
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@@ -2281,7 -2281,7 +2281,8 @@@ struct file_system_type
  #define FS_MGTIME		64	/* FS uses multigrain timestamps */
  #define FS_LBS			128	/* FS supports LBS */
  #define FS_POWER_FREEZE		256	/* Always freeze on suspend/hibernate */
+ #define FS_USERNS_MOUNT_RESTRICTED 512	/* Restrict mount in userns if not already visible */
 +#define FS_USERNS_DELEGATABLE	1024	/* Can be mounted inside userns from outside */
  #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move() during rename() internally. */
  	int (*init_fs_context)(struct fs_context *);
  	const struct fs_parameter_spec *parameters;
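
The claim that the bit values don't overlap can be checked mechanically.
The following stand-alone sketch copies the values from the resolved
hunk above into user space; is_single_bit() and flags_disjoint() are
hypothetical helpers and not part of the kernel.

```c
#include <stdbool.h>

/* Values copied from the resolved hunk in include/linux/fs.h above. */
#define FS_MGTIME			64
#define FS_LBS				128
#define FS_POWER_FREEZE			256
#define FS_USERNS_MOUNT_RESTRICTED	512
#define FS_USERNS_DELEGATABLE		1024
#define FS_RENAME_DOES_D_MOVE		32768

/* True if exactly one bit is set in v. */
static bool is_single_bit(unsigned int v)
{
	return v && !(v & (v - 1));
}

/* True if every flag is a single bit and no two flags share a bit. */
static bool flags_disjoint(const unsigned int *flags, int n)
{
	unsigned int seen = 0;

	for (int i = 0; i < n; i++) {
		if (!is_single_bit(flags[i]) || (seen & flags[i]))
			return false;
		seen |= flags[i];
	}
	return true;
}
```

Passing the six values to flags_disjoint() returns true, while a
resolution that assigned the same value to both new flags would be
rejected.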

The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:

  Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-7.2-rc1.procfs

for you to fetch changes up to cf30ceccfaec3d2549ff60f7c915625f12dd3a93:

  fs: fix ups and tidy ups to /proc/filesystems caching (2026-06-12 14:26:27 +0200)

----------------------------------------------------------------
vfs-7.2-rc1.procfs

Please consider pulling these changes from the signed vfs-7.2-rc1.procfs tag.

Thanks!
Christian

----------------------------------------------------------------
Alexey Dobriyan (1):
      proc: allow to mark /proc files permanent outside of fs/proc/

Alexey Gladkov (4):
      proc: subset=pid: Show /proc/self/net only for CAP_NET_ADMIN
      proc: prevent reconfiguring subset=pid
      proc: handle subset=pid separately in userns visibility checks
      docs: proc: add documentation about mount restrictions

Christian Brauner (7):
      namespace: record fully visible mounts in list
      fs: move SB_I_USERNS_VISIBLE to FS_USERNS_MOUNT_RESTRICTED
      fs: RCU-ify filesystems list
      sysfs: remove trivial sysfs_get_tree() wrapper
      Merge patch series "revamp fs/filesystems.c"
      Merge patch series "proc: subset=pid: Relax check of mount visibility"
      Merge patch series "proc: protect ptrace_may_access() with exec_update_lock"

Jann Horn (2):
      proc: protect ptrace_may_access() with exec_update_lock (part 1)
      proc: protect ptrace_may_access() with exec_update_lock (FD links)

Mateusz Guzik (2):
      fs: cache the string generated by reading /proc/filesystems
      fs: fix ups and tidy ups to /proc/filesystems caching

 Documentation/filesystems/proc.rst |  19 ++-
 fs/filesystems.c                   | 330 +++++++++++++++++++++++++------------
 fs/mount.h                         |   4 +
 fs/namespace.c                     |  34 +++-
 fs/ocfs2/super.c                   |   1 -
 fs/proc/array.c                    |   6 +
 fs/proc/base.c                     | 160 ++++++++----------
 fs/proc/fd.c                       |  27 ++-
 fs/proc/generic.c                  |  10 ++
 fs/proc/internal.h                 |   5 +-
 fs/proc/namespaces.c               |  12 ++
 fs/proc/proc_net.c                 |   8 +
 fs/proc/root.c                     |  24 ++-
 fs/sysfs/mount.c                   |  18 +-
 include/linux/fs.h                 |   3 +-
 include/linux/fs/super_types.h     |   2 +-
 include/linux/proc_fs.h            |  13 ++
 kernel/acct.c                      |   2 +-
 18 files changed, 429 insertions(+), 249 deletions(-)

Thread overview: 17+ messages
2026-06-12 15:10 [GIT PULL 00/16 for v7.2] v7.2 Christian Brauner
2026-06-12 15:11 ` [GIT PULL 01/16 for v7.2] vfs kfunc Christian Brauner
2026-06-12 15:11 ` [GIT PULL 02/16 for v7.2] vfs exportfs Christian Brauner
2026-06-12 15:12 ` [GIT PULL 03/16 for v7.2] vfs inode Christian Brauner
2026-06-12 15:12 ` [GIT PULL 04/16 for v7.2] vfs directory delegations Christian Brauner
2026-06-12 15:12 ` [GIT PULL 05/16 for v7.2] vfs casefold Christian Brauner
2026-06-12 15:13 ` [GIT PULL 06/16 for v7.2] kernel task_exec_state Christian Brauner
2026-06-12 15:13 ` [GIT PULL 07/16 for v7.2] kernel misc Christian Brauner
2026-06-12 15:13 ` [GIT PULL 08/16 for v7.2] vfs openat2 Christian Brauner
2026-06-12 15:14 ` [GIT PULL 09/16 for v7.2] vfs super Christian Brauner
2026-06-12 15:14 ` [GIT PULL 10/16 for v7.2] vfs writeback Christian Brauner
2026-06-12 15:14 ` [GIT PULL 11/16 for v7.2] vfs bh Christian Brauner
2026-06-12 15:15 ` [GIT PULL 12/16 for v7.2] vfs eventpoll Christian Brauner
2026-06-12 15:15 ` [GIT PULL 13/16 for v7.2] vfs iomap Christian Brauner
2026-06-12 15:15 ` [GIT PULL 14/16 for v7.2] vfs xattr Christian Brauner
2026-06-12 15:16 ` [GIT PULL 15/16 for v7.2] vfs misc Christian Brauner
2026-06-12 15:16 ` [GIT PULL 16/16 for v7.2] vfs procfs Christian Brauner