linux-fsdevel.vger.kernel.org archive mirror
* [GIT PULL 00/17 for v6.19] v6.19
@ 2025-11-28 16:48 Christian Brauner
  2025-11-28 16:48 ` [GIT PULL 01/17 for v6.19] vfs iomap Christian Brauner
                   ` (16 more replies)
  0 siblings, 17 replies; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

This is the batch of pull requests for the v6.19 merge window!

We have a couple of inter-dependencies between branches. Notably the
cred guard work and the directory locking work are a prerequisite for
the overlayfs work for this cycle.

We also have an external dependency on the kbuild tree's work to enable
-fms-extensions. So pulling the vfs work before the kbuild work will
bring that in. Which is fine. I'm just making sure you're aware of it.

This cycle was quite busy with a lot of infrastructure work and cleanups.

There is the new listns() system call that allows userspace to iterate
through namespaces in the system. Currently there's no direct way to
enumerate namespaces - applications must scan /proc/<pid>/ns/ across all
processes, which is inefficient, incomplete (misses namespaces kept
alive only by file descriptors or bind mounts), and requires broad /proc
access. The new system call supports pagination, filtering by namespace
type, and filtering by owning user namespace.
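
To make the pagination and filtering semantics concrete, here is a toy
userspace model. It is only a sketch under assumptions: toy_listns(),
struct toy_ns, and the TOY_NS_* bits are hypothetical names invented for
illustration, and nothing here reflects the real listns() ABI, which is
not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type bits, for illustration only. */
#define TOY_NS_MNT  (1u << 0)
#define TOY_NS_NET  (1u << 1)
#define TOY_NS_USER (1u << 2)

/* Toy namespace record; the field names are assumptions. */
struct toy_ns {
	uint64_t id;       /* unique identifier, ascending in the table */
	uint32_t type;     /* exactly one TOY_NS_* bit */
	uint64_t owner_id; /* id of the owning user namespace */
};

/*
 * Copy up to 'cap' ids greater than 'after_id' into out[].
 * type_mask == 0 matches any type, owner_id == 0 matches any owner.
 * table[] must be sorted by ascending id. Returns the number of ids
 * written; a caller asks for the next page by passing the last id it
 * received as 'after_id'.
 */
static size_t toy_listns(const struct toy_ns *table, size_t n,
			 uint64_t after_id, uint32_t type_mask,
			 uint64_t owner_id, uint64_t *out, size_t cap)
{
	size_t count = 0;

	assert(table && out);
	for (size_t i = 0; i < n && count < cap; i++) {
		const struct toy_ns *ns = &table[i];

		if (ns->id <= after_id)
			continue; /* already returned on an earlier page */
		if (type_mask && !(ns->type & type_mask))
			continue; /* filtered out by namespace type */
		if (owner_id && ns->owner_id != owner_id)
			continue; /* filtered out by owning user namespace */
		out[count++] = ns->id;
	}
	return count;
}
```

A caller would loop, feeding the last returned id back in as the cursor,
until the helper returns fewer entries than the buffer can hold.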

To support listns() and future namespace work, we've introduced an
active reference count that tracks namespace visibility to userspace. A
namespace is visible when it's in use by a task, persisted through a VFS
object, or is the parent of child namespaces. This prevents resurrection
of namespaces that are pinned only for internal kernel reasons.
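
The split between "kept alive" and "visible to userspace" can be
modelled with two counters. This is a minimal sketch derived only from
the description above; struct toy_ns and the toy_ns_*() helpers are
hypothetical and are not the kernel's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_ns {
	unsigned int refs;   /* keeps the object's memory alive */
	unsigned int active; /* counts userspace-visible users */
};

/* Internal pin: extends lifetime but not visibility. */
static void toy_ns_get(struct toy_ns *ns)
{
	ns->refs++;
}

/* A visible user (task, VFS object, child namespace) takes both. */
static void toy_ns_get_active(struct toy_ns *ns)
{
	ns->refs++;
	ns->active++;
}

static void toy_ns_put_active(struct toy_ns *ns)
{
	assert(ns->active > 0 && ns->refs > 0);
	ns->active--;
	ns->refs--;
}

/*
 * Lookup on behalf of userspace. Once the active count has dropped to
 * zero the namespace stays invisible, even if internal pins remain.
 */
static bool toy_ns_tryget_active(struct toy_ns *ns)
{
	if (ns->active == 0)
		return false;
	toy_ns_get_active(ns);
	return true;
}
```

The point of the second counter is that an internal pin alone can never
make toy_ns_tryget_active() succeed again.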

There's the credential guard infrastructure change for this cycle that
you triggered. :) We now have with_kernel_creds() and
scoped_with_kernel_creds() guards that allow using kernel credentials
without allocating and copying them. We also have scoped_with_creds()
for the common override_creds()/revert_creds() pattern, and prepare
credential guards for more complex cases. All of overlayfs has been
converted to use these guards.
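
For readers unfamiliar with scope-based guards, the override/revert
pattern can be sketched in userspace with the GCC/Clang cleanup
attribute. Everything prefixed toy_ below is a hypothetical stand-in;
this is not the definition of scoped_with_creds(), which is not shown in
this message.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cred { int uid; };

/* Stand-in for the current task's credentials. */
static const struct toy_cred *toy_current_cred;

/* Marker used only to terminate the single-iteration loop below. */
static const struct toy_cred toy_guard_done = { 0 };

static const struct toy_cred *toy_override_creds(const struct toy_cred *new)
{
	const struct toy_cred *old = toy_current_cred;

	toy_current_cred = new;
	return old;
}

static void toy_revert_creds(const struct toy_cred *old)
{
	toy_current_cred = old;
}

/* Cleanup handler: receives a pointer to the guarded variable. */
static void toy_cred_guard_exit(const struct toy_cred **saved)
{
	toy_revert_creds(*saved);
}

/* Run the following block with 'cred' installed; revert on any exit. */
#define toy_scoped_with_creds(cred)                                  \
	for (const struct toy_cred *_saved                           \
		     __attribute__((cleanup(toy_cred_guard_exit))) = \
		     toy_override_creds(cred),                       \
	     *_done = NULL;                                          \
	     !_done; _done = &toy_guard_done)

static int toy_do_work(const struct toy_cred *kernel_cred, int fail)
{
	toy_scoped_with_creds(kernel_cred) {
		if (fail)
			return -1; /* early return still reverts */
		assert(toy_current_cred == kernel_cred);
	}
	return 0;
}
```

The benefit is that no exit path inside the block needs its own revert
call, which is where hand-written override/revert code tends to go
wrong.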

The inode state accessor work from last cycle continues. We now hide
inode->i_state behind accessors entirely, making plain access fail to
compile. This allows asserting correct usage - locking, flag
manipulation, detecting when code clears already-missing flags or sets
flags when illegal.

Directory operations are getting centralized locking helpers as part of
NeilBrown's effort to eventually allow multiple concurrent operations in
a directory by locking target dentries rather than whole parent
directories.

We now also have recall-only directory delegations for knfsd.

The iomap work includes FUSE support for buffered reads using iomap,
enabling granular uptodate tracking with large folios. There's also zero
range folio batch support to handle dirty folios over unwritten
mappings, and DIO write completions can now run from interrupt context
again for pure overwrites, reducing context switches for
high-performance workloads.

The FD_ADD() and FD_PREPARE() primitives simplify the ubiquitous pattern
of get_unused_fd_flags() + create file + fd_install() that currently
requires cumbersome cleanup paths. The series removes roughly double the
code it adds by eliminating convoluted cleanup logic across many
subsystems. This work came late in the cycle but is quite nice - an
alternative pull request with only trivial filesystem conversions is
available if preferred. The KVM conversions were reverted as the KVM
maintainers prefer to take those through their tree.
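
The reserve + create + install pattern and its unwinding can be
illustrated with a toy fd table. This is a sketch under assumptions: the
toy_*() names are hypothetical, and the code does not reproduce the real
FD_ADD()/FD_PREPARE() macros, whose definitions are not shown here. It
only shows why folding the three steps into one helper removes cleanup
paths from callers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_FDS 4

struct toy_file { int payload; };

static struct toy_file *toy_fd_table[TOY_MAX_FDS];
static unsigned char toy_fd_reserved[TOY_MAX_FDS];

/* Reserve the lowest free slot, or return -1 if the table is full. */
static int toy_get_unused_fd(void)
{
	for (int fd = 0; fd < TOY_MAX_FDS; fd++) {
		if (!toy_fd_reserved[fd]) {
			toy_fd_reserved[fd] = 1;
			return fd;
		}
	}
	return -1;
}

static void toy_put_unused_fd(int fd)
{
	assert(fd >= 0 && fd < TOY_MAX_FDS);
	toy_fd_reserved[fd] = 0;
}

static void toy_fd_install(int fd, struct toy_file *f)
{
	toy_fd_table[fd] = f;
}

/* Object creation that can fail; a negative payload simulates an error. */
static struct toy_file *toy_file_create(int payload)
{
	struct toy_file *f;

	if (payload < 0)
		return NULL;
	f = malloc(sizeof(*f));
	if (f)
		f->payload = payload;
	return f;
}

/* One helper owns the whole sequence, including the unwind on failure. */
static int toy_fd_add(int payload)
{
	int fd = toy_get_unused_fd();
	struct toy_file *f;

	if (fd < 0)
		return -1;
	f = toy_file_create(payload);
	if (!f) {
		toy_put_unused_fd(fd); /* release the reserved slot */
		return -1;
	}
	toy_fd_install(fd, f);
	return fd;
}
```

Callers of toy_fd_add() only check a single return value instead of
carrying their own put/free labels.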

Note that I provided two pull requests for FD_{ADD,PREPARE}():

(1) [GIT PULL 16/17 for v6.19] vfs fd prepare
    Message-Id: <20251128-vfs-fd-prepare-v619-e23be0b7a0c5@brauner>
    contains everything I sent out plus a few later fixes, with the kvm
    conversions reverted. The kvm maintainers apparently want to take
    those through their tree.

(2) [GIT PULL 17/17 for v6.19] vfs fd prepare minimal
    Message-Id: <20251128-vfs-fd-prepare-minimal-v619-41df48e056e7@brauner>
    contains a condensed version with anything that's complex removed.

I think (1) is fine but I understand wanting to be a bit more
conservative so I also provided (2).

There's the usual collection of cleanups: writeback interface
simplification removing low-level filemap_* interfaces, path lookup
optimizations with cheaper MAY_EXEC handling, step_into()/walk_component()
inlining, and the start of splitting up the monolithic fs.h header into
focused headers for superblock code.

Smaller items include folio_next_pos() helper fixing a 32-bit ocfs2 bug,
minix filesystem syzbot fixes, autofs fix for futile mount triggers in
private mount namespaces, and coredump/pidfd improvements exposing the
coredump signal.

Thanks!
Christian

* [GIT PULL 01/17 for v6.19] vfs iomap
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 22:08   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 02/17 for v6.19] vfs misc Christian Brauner
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains the iomap changes for this cycle:

* FUSE iomap Support for Buffered Reads

  This adds iomap support for FUSE buffered reads and readahead. This
  enables granular uptodate tracking with large folios so only
  non-uptodate portions need to be read. Also fixes a race condition
  with large folios + writeback cache that could cause data corruption
  on partial writes followed by reads.

  - Refactored iomap read/readahead bio logic into helpers
  - Added caller-provided callbacks for read operations
  - Moved buffered IO bio logic into new file
  - FUSE now uses iomap for read_folio and readahead
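
To illustrate what granular uptodate tracking buys, here is a toy model
in which one bit per block records whether that block of a large folio
already holds valid data, so only the gaps need to be read. The function
name and the single 64-bit bitmap are assumptions made for brevity; this
is not the iomap implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Find the next run of not-uptodate blocks at or after 'start'.
 * Bit i of 'uptodate' set means block i already holds valid data.
 * On success stores the run in *first and *count and returns true;
 * returns false when every remaining block is uptodate.
 */
static bool toy_next_read_range(uint64_t uptodate, unsigned int nr_blocks,
				unsigned int start,
				unsigned int *first, unsigned int *count)
{
	unsigned int i = start;

	assert(nr_blocks <= 64);
	/* Skip blocks that are already uptodate. */
	while (i < nr_blocks && (uptodate & (UINT64_C(1) << i)))
		i++;
	if (i >= nr_blocks)
		return false;
	*first = i;
	/* Extend the run while blocks still need reading. */
	while (i < nr_blocks && !(uptodate & (UINT64_C(1) << i)))
		i++;
	*count = i - *first;
	return true;
}
```

With blocks 0, 1, 4, and 5 already uptodate in an 8-block folio, the
helper yields two read ranges (blocks 2-3 and 6-7) instead of a read of
the whole folio.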

* Zero Range Folio Batch Support

  Adds folio batch support for iomap_zero_range() to handle dirty folios
  over unwritten mappings. Fixes raciness issues where dirty data could
  be lost during zero range operations.

  - filemap_get_folios_tag_range() helper for dirty folio lookup
  - Optional zero range dirty folio processing
  - XFS fills dirty folios on zero range of unwritten mappings
  - Removed old partial EOF zeroing optimization

* DIO Write Completions from Interrupt Context

  Restore pre-iomap behavior where pure overwrite completions run inline
  rather than being deferred to workqueue. Reduces context switches for
  high-performance workloads like ScyllaDB.

  - Removed unused IOCB_DIO_CALLER_COMP code
  - Error completions always run in user context (fixes zonefs)
  - Reworked REQ_FUA selection logic
  - Inverted IOMAP_DIO_INLINE_COMP to IOMAP_DIO_OFFLOAD_COMP

* Buffered IO Cleanups

  Some performance and code clarity improvements:

  - Replace manual bitmap scanning with find_next_bit()
  - Simplify read skip logic for writes
  - Optimize pending async writeback accounting
  - Better variable naming
  - Documentation for iomap_finish_folio_write() requirements
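
The gain from replacing a manual bit-by-bit scan can be seen in a
userspace analogue that inspects a whole word at a time. This is a
sketch only: toy_find_next_bit() is a hypothetical reimplementation
written for illustration, it relies on the GCC/Clang builtin
__builtin_ctzl(), and it is not the kernel's find_next_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first set bit at or after 'offset', or 'size'
 * if there is none below 'size'.
 */
static unsigned long toy_find_next_bit(const unsigned long *addr,
				       unsigned long size,
				       unsigned long offset)
{
	assert(addr != NULL);
	while (offset < size) {
		unsigned long idx = offset / TOY_BITS_PER_LONG;
		unsigned long word = addr[idx];

		/* Ignore bits below 'offset' within this word. */
		word &= ~0UL << (offset % TOY_BITS_PER_LONG);
		if (word) {
			unsigned long bit = idx * TOY_BITS_PER_LONG +
				(unsigned long)__builtin_ctzl(word);

			return bit < size ? bit : size;
		}
		/* Nothing here: jump straight to the next word. */
		offset = (idx + 1) * TOY_BITS_PER_LONG;
	}
	return size;
}
```

An all-zero word is skipped with one comparison rather than one test per
bit, which is the property the bitmap scanning cleanups rely on.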

* Misaligned Vectors for Zoned XFS

  Enables sub-block aligned vectors in XFS always-COW mode for zoned
  devices via new IOMAP_DIO_FSBLOCK_ALIGNED flag.

* Bug Fixes

  - Allocate s_dio_done_wq for async reads (fixes syzbot report after error completion changes)
  - Fix iomap_read_end() for already uptodate folios (regression fix)

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

[1]: https://lore.kernel.org/linux-next/20251117143259.05d36122@canb.auug.org.au

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.iomap

for you to fetch changes up to 7fd8720dff2d9c70cf5a1a13b7513af01952ec02:

  iomap: allocate s_dio_done_wq for async reads as well (2025-11-25 10:22:19 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.iomap tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.iomap

----------------------------------------------------------------
Brian Foster (7):
      filemap: add helper to look up dirty folios in a range
      iomap: remove pos+len BUG_ON() to after folio lookup
      iomap: optional zero range dirty folio processing
      xfs: always trim mapping to requested range for zero range
      xfs: fill dirty folios on zero range of unwritten mappings
      iomap: remove old partial eof zeroing optimization
      xfs: error tag to force zeroing on debug kernels

Christian Brauner (5):
      Merge patch series "fuse: use iomap for buffered reads + readahead"
      Merge patch series "iomap: zero range folio batch support"
      Merge patch series "alloc misaligned vectors for zoned XFS v2"
      Merge patch series "iomap: buffered io changes"
      Merge patch series "enable iomap dio write completions from interrupt context v2"

Christoph Hellwig (8):
      iomap: move buffered io bio logic into new file
      xfs: support sub-block aligned vectors in always COW mode
      fs, iomap: remove IOCB_DIO_CALLER_COMP
      iomap: always run error completions in user context
      iomap: rework REQ_FUA selection
      iomap: support write completions from interrupt context
      iomap: invert the polarity of IOMAP_DIO_INLINE_COMP
      iomap: allocate s_dio_done_wq for async reads as well

Joanne Koong (24):
      iomap: move bio read logic into helper function
      iomap: move read/readahead bio submission logic into helper function
      iomap: simplify iomap_iter_advance()
      iomap: store read/readahead bio generically
      iomap: adjust read range correctly for non-block-aligned positions
      iomap: iterate over folio mapping in iomap_readpage_iter()
      iomap: rename iomap_readpage_iter() to iomap_read_folio_iter()
      iomap: rename iomap_readpage_ctx struct to iomap_read_folio_ctx
      iomap: track pending read bytes more optimally
      iomap: set accurate iter->pos when reading folio ranges
      iomap: add caller-provided callbacks for read and readahead
      iomap: make iomap_read_folio() a void return
      fuse: use iomap for read_folio
      fuse: use iomap for readahead
      fuse: remove fc->blkbits workaround for partial writes
      iomap: rename bytes_pending/bytes_accounted to bytes_submitted/bytes_not_submitted
      iomap: account for unaligned end offsets when truncating read range
      docs: document iomap writeback's iomap_finish_folio_write() requirement
      iomap: optimize pending async writeback accounting
      iomap: simplify ->read_folio_range() error handling for reads
      iomap: simplify when reads can be skipped for writes
      iomap: use find_next_bit() for dirty bitmap scanning
      iomap: use find_next_bit() for uptodate bitmap scanning
      iomap: fix iomap_read_end() for already uptodate folios

Qu Wenruo (1):
      iomap: add IOMAP_DIO_FSBLOCK_ALIGNED flag

 Documentation/filesystems/iomap/operations.rst |  50 +-
 block/fops.c                                   |   5 +-
 fs/backing-file.c                              |   6 -
 fs/dax.c                                       |  30 +-
 fs/erofs/data.c                                |   5 +-
 fs/fuse/dir.c                                  |   2 +-
 fs/fuse/file.c                                 | 286 ++++++-----
 fs/fuse/fuse_i.h                               |   8 -
 fs/fuse/inode.c                                |  13 +-
 fs/gfs2/aops.c                                 |   6 +-
 fs/iomap/Makefile                              |   3 +-
 fs/iomap/bio.c                                 |  88 ++++
 fs/iomap/buffered-io.c                         | 636 +++++++++++++++----------
 fs/iomap/direct-io.c                           | 230 +++++----
 fs/iomap/internal.h                            |  12 +
 fs/iomap/ioend.c                               |   2 -
 fs/iomap/iter.c                                |  20 +-
 fs/iomap/seek.c                                |   8 +-
 fs/iomap/trace.h                               |   7 +-
 fs/xfs/libxfs/xfs_errortag.h                   |   6 +-
 fs/xfs/xfs_aops.c                              |   5 +-
 fs/xfs/xfs_file.c                              |  50 +-
 fs/xfs/xfs_iomap.c                             |  38 +-
 fs/zonefs/file.c                               |   5 +-
 include/linux/fs.h                             |  43 +-
 include/linux/iomap.h                          |  86 +++-
 include/linux/pagemap.h                        |   2 +
 io_uring/rw.c                                  |  16 +-
 mm/filemap.c                                   |  58 +++
 29 files changed, 1093 insertions(+), 633 deletions(-)
 create mode 100644 fs/iomap/bio.c

* [GIT PULL 02/17 for v6.19] vfs misc
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
  2025-11-28 16:48 ` [GIT PULL 01/17 for v6.19] vfs iomap Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 22:08   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 03/17 for v6.19] vfs inode Christian Brauner
                   ` (14 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains the usual miscellaneous vfs changes:

Note that this has the kbuild -fms-extensions support merged in as the
pipe changes depend on it.

Features

- Cheaper MAY_EXEC handling for path lookup. This elides MAY_WRITE
  permission checks during path lookup and adds the IOP_FASTPERM_MAY_EXEC
  flag so filesystems like btrfs can avoid expensive permission work.

- Hide dentry_cache behind runtime const machinery.

- Add German Maglione as virtiofs co-maintainer.

Cleanups

- Tidy up and inline step_into() and walk_component() for improved code
  generation.

- Re-enable IOCB_NOWAIT writes to files. This refactors file timestamp
  update logic, fixing a layering bypass in btrfs when updating timestamps
  on device files and improving FMODE_NOCMTIME handling in VFS now that
  nfsd started using it.

- Path lookup optimizations extracting slowpaths into dedicated routines
  and adding branch prediction hints for mntput_no_expire(), fd_install(),
  lookup_slow(), and various other hot paths.

- Enable clang's -fms-extensions flag, requiring a JFS rename to avoid
  conflicts.

- Remove spurious exports in fs/file_attr.c.

- Stop duplicating union pipe_index declaration. This depends on the
  shared kbuild branch that brings in -fms-extensions support which is
  merged into this branch.

- Use MD5 library instead of crypto_shash in ecryptfs.

- Use largest_zero_folio() in iomap_dio_zero().

- Replace simple_strtol/strtoul with kstrtoint/kstrtouint in init and
  initrd code.

- Various typo fixes.
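
The motivation for moving from simple_strtol()-style parsing to
kstrtoint()-style parsing is strictness: trailing garbage and overflow
become errors instead of being silently accepted. A userspace analogue
built on strtol() is sketched below. toy_strict_strtoint() is a
hypothetical helper, and the kernel helpers differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse 's' as an int in the given base. Returns 0 and stores the
 * value in *res on success, -EINVAL for empty input or trailing
 * characters, and -ERANGE if the value does not fit in an int.
 * *res is left untouched on failure.
 */
static int toy_strict_strtoint(const char *s, int base, int *res)
{
	char *end;
	long val;

	assert(s && res);
	errno = 0;
	val = strtol(s, &end, base);
	if (end == s || *end != '\0')
		return -EINVAL;
	if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
		return -ERANGE;
	*res = (int)val;
	return 0;
}
```

A lenient parse of "42abc" with plain strtol() yields 42 with no error
unless the caller inspects the end pointer; the strict helper rejects
the same input outright.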

Fixes

- Fix emergency sync for btrfs. Btrfs requires an explicit sync_fs() call
  with wait == 1 to commit super blocks. The emergency sync path never
  passed this, leaving btrfs data uncommitted during emergency sync.

- Use local kmap in watch_queue's post_one_notification().

- Add hint prints in sb_set_blocksize() for LBS dependency on THP.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa:

  Linux 6.18-rc3 (2025-10-26 15:59:49 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.misc

for you to fetch changes up to ebf8538979101ef879742dcfaf04b684f5461e12:

  MAINTAINERS: add German Maglione as virtiofs co-maintainer (2025-11-27 10:00:09 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.misc tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.misc

----------------------------------------------------------------
Askar Safin (2):
      fs/splice.c: trivial fix: pipes -> pipe's
      include/linux/fs.h: trivial fix: regualr -> regular

Baokun Li (1):
      bdev: add hint prints in sb_set_blocksize() for LBS dependency on THP

Christian Brauner (6):
      Merge patch series "fs: fully sync all fsese even for an emergency sync"
      Merge patch "kbuild: Add '-fms-extensions' to areas with dedicated CFLAGS"
      Merge branch 'kbuild-6.19.fms.extension'
      Merge patch series "cheaper MAY_EXEC handling for path lookup"
      Merge patch series "re-enable IOCB_NOWAIT writes to files v2"
      Merge patch series "fs: tidy up step_into() & friends before inlining"

Christoph Hellwig (7):
      fs: remove spurious exports in fs/file_attr.c
      fs: refactor file timestamp update logic
      fs: lift the FMODE_NOCMTIME check into file_update_time_flags
      fs: export vfs_utimes
      btrfs: use vfs_utimes to update file timestamps
      btrfs: fix the comment on btrfs_update_time
      orangefs: use inode_update_timestamps directly

Davidlohr Bueso (1):
      watch_queue: Use local kmap in post_one_notification()

Eric Biggers (1):
      ecryptfs: Use MD5 library instead of crypto_shash

Kaushlendra Kumar (1):
      init: Replace simple_strtoul() with kstrtouint() in root_delay_setup()

Mateusz Guzik (13):
      fs: touch up predicts in putname()
      fs: speed up path lookup with cheaper handling of MAY_EXEC
      btrfs: utilize IOP_FASTPERM_MAY_EXEC
      fs: retire now stale MAY_WRITE predicts in inode_permission()
      fs: touch predicts in do_dentry_open()
      fs: hide dentry_cache behind runtime const machinery
      fs: move fd_install() slowpath into a dedicated routine and provide commentary
      fs: touch up predicts in path lookup
      fs: move mntput_no_expire() slowpath into a dedicated routine
      fs: add predicts based on nd->depth
      fs: mark lookup_slow() as noinline
      fs: tidy up step_into() & friends before inlining
      fs: inline step_into() and walk_component()

Nathan Chancellor (2):
      jfs: Rename _inline to avoid conflict with clang's '-fms-extensions'
      kbuild: Add '-fms-extensions' to areas with dedicated CFLAGS

Pankaj Raghav (1):
      iomap: use largest_zero_folio() in iomap_dio_zero()

Qu Wenruo (2):
      fs: do not pass a parameter for sync_inodes_one_sb()
      fs: fully sync all fses even for an emergency sync

Rasmus Villemoes (2):
      Kbuild: enable -fms-extensions
      fs/pipe: stop duplicating union pipe_index declaration

Stefan Hajnoczi (1):
      MAINTAINERS: add German Maglione as virtiofs co-maintainer

Thorsten Blum (1):
      initrd: Replace simple_strtol with kstrtoint to improve ramdisk_start_setup

 MAINTAINERS                           |   1 +
 Makefile                              |   3 +
 arch/arm64/kernel/vdso32/Makefile     |   3 +-
 arch/loongarch/vdso/Makefile          |   2 +-
 arch/parisc/boot/compressed/Makefile  |   2 +-
 arch/powerpc/boot/Makefile            |   3 +-
 arch/s390/Makefile                    |   3 +-
 arch/s390/purgatory/Makefile          |   3 +-
 arch/x86/Makefile                     |   4 +-
 arch/x86/boot/compressed/Makefile     |   7 +-
 block/bdev.c                          |  19 ++++-
 drivers/firmware/efi/libstub/Makefile |   4 +-
 fs/btrfs/inode.c                      |  16 +++-
 fs/btrfs/volumes.c                    |  11 +--
 fs/dcache.c                           |   6 +-
 fs/ecryptfs/Kconfig                   |   2 +-
 fs/ecryptfs/crypto.c                  |  90 +++------------------
 fs/ecryptfs/ecryptfs_kernel.h         |  13 +---
 fs/ecryptfs/inode.c                   |   7 +-
 fs/ecryptfs/keystore.c                |  65 +++-------------
 fs/ecryptfs/main.c                    |   7 ++
 fs/ecryptfs/super.c                   |   5 +-
 fs/file.c                             |  35 +++++++--
 fs/file_attr.c                        |   4 -
 fs/inode.c                            |  58 +++++---------
 fs/iomap/direct-io.c                  |  38 ++++-----
 fs/jfs/jfs_incore.h                   |   6 +-
 fs/namei.c                            | 142 +++++++++++++++++++++++++---------
 fs/namespace.c                        |  38 +++++----
 fs/open.c                             |   6 +-
 fs/orangefs/inode.c                   |   4 +-
 fs/splice.c                           |   2 +-
 fs/sync.c                             |   7 +-
 fs/utimes.c                           |   1 +
 include/asm-generic/vmlinux.lds.h     |   3 +-
 include/linux/fs.h                    |  15 ++--
 include/linux/pipe_fs_i.h             |  23 ++----
 init/do_mounts.c                      |   3 +-
 init/do_mounts_rd.c                   |   3 +-
 kernel/watch_queue.c                  |   4 +-
 scripts/Makefile.extrawarn            |   4 +-
 41 files changed, 329 insertions(+), 343 deletions(-)

* [GIT PULL 03/17 for v6.19] vfs inode
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
  2025-11-28 16:48 ` [GIT PULL 01/17 for v6.19] vfs iomap Christian Brauner
  2025-11-28 16:48 ` [GIT PULL 02/17 for v6.19] vfs misc Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 22:08   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 04/17 for v6.19] vfs writeback Christian Brauner
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains inode specific changes for this cycle:

Features

- Hide inode->i_state behind accessors. Open-coded accesses prevent
  asserting they are done correctly. One obvious aspect is locking, but
  significantly more can be checked. For example it can be detected when
  the code is clearing flags which are already missing, or is setting
  flags when it is illegal (e.g., I_FREEING when ->i_count > 0).

- Provide accessors for ->i_state, convert all filesystems using coccinelle
  and manual conversions (btrfs, ceph, smb, f2fs, gfs2, overlayfs, nilfs2,
  xfs), and make plain ->i_state access fail to compile.

- Rework I_NEW handling to operate without fences, simplifying the code
  after the accessor infrastructure is in place.
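
The kind of checking that accessors make possible can be shown with a
small userspace model. All names below are hypothetical and the checks
simply return false where a checked kernel build might warn; the real
accessor names and assertions are not shown in this message. The
I_FREEING-with-references rule is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_I_NEW     (1u << 0)
#define TOY_I_DIRTY   (1u << 1)
#define TOY_I_FREEING (1u << 2)

struct toy_inode {
	bool locked;             /* stands in for holding ->i_lock */
	unsigned int count;      /* stands in for ->i_count */
	unsigned int state_priv; /* only touched via the accessors */
};

static unsigned int toy_state_read(const struct toy_inode *inode)
{
	return inode->state_priv;
}

/* Set flags; returns false on misuse instead of warning. */
static bool toy_state_set(struct toy_inode *inode, unsigned int flags)
{
	assert(flags != 0);
	if (!inode->locked)
		return false; /* caller does not hold the lock */
	if ((flags & TOY_I_FREEING) && inode->count > 0)
		return false; /* illegal while references remain */
	inode->state_priv |= flags;
	return true;
}

/* Clear flags; returns false on misuse instead of warning. */
static bool toy_state_clear(struct toy_inode *inode, unsigned int flags)
{
	assert(flags != 0);
	if (!inode->locked)
		return false;
	if ((inode->state_priv & flags) != flags)
		return false; /* clearing a flag that is not set */
	inode->state_priv &= ~flags;
	return true;
}
```

None of these checks can be enforced while callers read and write the
field directly, which is why plain access is made to fail at compile
time.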

Cleanups

- Move wait_on_inode() from writeback.h to fs.h.

- Spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb
  for clarity.

- Cosmetic fixes to LRU handling.

- Push list presence check into inode_io_list_del().

- Touch up predicts in __d_lookup_rcu().

- ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage.

- Assert on ->i_count in iput_final().

- Assert ->i_lock held in __iget().

Fixes

- Add missing fences to I_NEW handling.
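
Why a flag like I_NEW needs ordering at all can be sketched with C11
fences, which are only loosely analogous to the kernel's
smp_wmb()/smp_rmb(). The struct and function names below are
hypothetical; this is a model of the general publish/consume pattern,
not the code in fs/inode.c.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_I_NEW 1u

struct toy_inode {
	int payload;       /* initialised before publication */
	atomic_uint state; /* TOY_I_NEW set while under construction */
};

/* Writer: finish initialisation, then clear the "new" flag. */
static void toy_publish(struct toy_inode *inode, int payload)
{
	inode->payload = payload;
	/* Order the payload store before the flag update. */
	atomic_thread_fence(memory_order_release);
	atomic_fetch_and_explicit(&inode->state, ~TOY_I_NEW,
				  memory_order_relaxed);
}

/* Reader: only touch the payload once the flag is seen cleared. */
static bool toy_try_read(struct toy_inode *inode, int *out)
{
	assert(out != NULL);
	if (atomic_load_explicit(&inode->state, memory_order_relaxed) &
	    TOY_I_NEW)
		return false;
	/* Order the flag load before the payload load. */
	atomic_thread_fence(memory_order_acquire);
	*out = inode->payload;
	return true;
}
```

Without the two fences a reader on another CPU could observe the flag
cleared and still read a stale payload.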

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.inode

for you to fetch changes up to ca0d620b0afae20a7bcd5182606eba6860b2dbf2:

  dcache: touch up predicts in __d_lookup_rcu() (2025-11-28 10:31:45 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.inode tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.inode

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "hide ->i_state behind accessors"

Mateusz Guzik (22):
      fs: assert ->i_lock held in __iget()
      fs: assert on ->i_count in iput_final()
      ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage
      fs: add missing fences to I_NEW handling
      fs: move wait_on_inode() from writeback.h to fs.h
      fs: spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb
      fs: provide accessors for ->i_state
      Coccinelle-based conversion to use ->i_state accessors
      Manual conversion to use ->i_state accessors of all places not covered by coccinelle
      btrfs: use the new ->i_state accessors
      ceph: use the new ->i_state accessors
      smb: use the new ->i_state accessors
      f2fs: use the new ->i_state accessors
      gfs2: use the new ->i_state accessors
      overlayfs: use the new ->i_state accessors
      nilfs2: use the new ->i_state accessors
      xfs: use the new ->i_state accessors
      fs: make plain ->i_state access fail to compile
      fs: rework I_NEW handling to operate without fences
      fs: cosmetic fixes to lru handling
      fs: push list presence check into inode_io_list_del()
      dcache: touch up predicts in __d_lookup_rcu()

 Documentation/filesystems/porting.rst |   2 +-
 block/bdev.c                          |   4 +-
 drivers/dax/super.c                   |   2 +-
 fs/9p/vfs_inode.c                     |   2 +-
 fs/9p/vfs_inode_dotl.c                |   2 +-
 fs/affs/inode.c                       |   2 +-
 fs/afs/dir.c                          |   4 +-
 fs/afs/dynroot.c                      |   6 +-
 fs/afs/inode.c                        |   8 +-
 fs/befs/linuxvfs.c                    |   2 +-
 fs/bfs/inode.c                        |   2 +-
 fs/btrfs/inode.c                      |  10 +-
 fs/buffer.c                           |   4 +-
 fs/ceph/cache.c                       |   2 +-
 fs/ceph/crypto.c                      |   4 +-
 fs/ceph/file.c                        |   4 +-
 fs/ceph/inode.c                       |  28 ++--
 fs/coda/cnode.c                       |   4 +-
 fs/cramfs/inode.c                     |   2 +-
 fs/crypto/keyring.c                   |   2 +-
 fs/crypto/keysetup.c                  |   2 +-
 fs/dcache.c                           |  29 ++--
 fs/drop_caches.c                      |   2 +-
 fs/ecryptfs/inode.c                   |   6 +-
 fs/efs/inode.c                        |   2 +-
 fs/erofs/inode.c                      |   2 +-
 fs/ext2/inode.c                       |   2 +-
 fs/ext4/inode.c                       |  13 +-
 fs/ext4/orphan.c                      |   4 +-
 fs/f2fs/data.c                        |   2 +-
 fs/f2fs/inode.c                       |   2 +-
 fs/f2fs/namei.c                       |   4 +-
 fs/f2fs/super.c                       |   2 +-
 fs/freevxfs/vxfs_inode.c              |   2 +-
 fs/fs-writeback.c                     | 132 +++++++++---------
 fs/fuse/inode.c                       |   4 +-
 fs/gfs2/file.c                        |   2 +-
 fs/gfs2/glock.c                       |   2 +-
 fs/gfs2/glops.c                       |   2 +-
 fs/gfs2/inode.c                       |   4 +-
 fs/gfs2/ops_fstype.c                  |   2 +-
 fs/hfs/btree.c                        |   2 +-
 fs/hfs/inode.c                        |   2 +-
 fs/hfsplus/super.c                    |   2 +-
 fs/hostfs/hostfs_kern.c               |   2 +-
 fs/hpfs/dir.c                         |   2 +-
 fs/hpfs/inode.c                       |   2 +-
 fs/inode.c                            | 247 +++++++++++++++++++---------------
 fs/isofs/inode.c                      |   2 +-
 fs/jffs2/fs.c                         |   4 +-
 fs/jfs/file.c                         |   4 +-
 fs/jfs/inode.c                        |   2 +-
 fs/jfs/jfs_txnmgr.c                   |   2 +-
 fs/kernfs/inode.c                     |   2 +-
 fs/libfs.c                            |   6 +-
 fs/minix/inode.c                      |   2 +-
 fs/namei.c                            |   8 +-
 fs/netfs/misc.c                       |   8 +-
 fs/netfs/read_single.c                |   6 +-
 fs/nfs/inode.c                        |   2 +-
 fs/nfs/pnfs.c                         |   2 +-
 fs/nfsd/vfs.c                         |   2 +-
 fs/nilfs2/cpfile.c                    |   2 +-
 fs/nilfs2/dat.c                       |   2 +-
 fs/nilfs2/ifile.c                     |   2 +-
 fs/nilfs2/inode.c                     |  10 +-
 fs/nilfs2/sufile.c                    |   2 +-
 fs/notify/fsnotify.c                  |   2 +-
 fs/ntfs3/inode.c                      |   2 +-
 fs/ocfs2/dlmglue.c                    |   2 +-
 fs/ocfs2/inode.c                      |  27 +---
 fs/ocfs2/inode.h                      |   1 -
 fs/ocfs2/ocfs2_trace.h                |   2 -
 fs/ocfs2/super.c                      |   2 +-
 fs/omfs/inode.c                       |   2 +-
 fs/openpromfs/inode.c                 |   2 +-
 fs/orangefs/inode.c                   |   2 +-
 fs/orangefs/orangefs-utils.c          |   6 +-
 fs/overlayfs/dir.c                    |   2 +-
 fs/overlayfs/inode.c                  |   6 +-
 fs/overlayfs/util.c                   |  10 +-
 fs/pipe.c                             |   2 +-
 fs/qnx4/inode.c                       |   2 +-
 fs/qnx6/inode.c                       |   2 +-
 fs/quota/dquot.c                      |   2 +-
 fs/romfs/super.c                      |   2 +-
 fs/smb/client/cifsfs.c                |   2 +-
 fs/smb/client/inode.c                 |  14 +-
 fs/squashfs/inode.c                   |   2 +-
 fs/sync.c                             |   2 +-
 fs/ubifs/file.c                       |   2 +-
 fs/ubifs/super.c                      |   2 +-
 fs/udf/inode.c                        |   2 +-
 fs/ufs/inode.c                        |   2 +-
 fs/xfs/scrub/common.c                 |   2 +-
 fs/xfs/scrub/inode_repair.c           |   2 +-
 fs/xfs/scrub/parent.c                 |   2 +-
 fs/xfs/xfs_bmap_util.c                |   2 +-
 fs/xfs/xfs_health.c                   |   4 +-
 fs/xfs/xfs_icache.c                   |   6 +-
 fs/xfs/xfs_inode.c                    |   6 +-
 fs/xfs/xfs_inode_item.c               |   4 +-
 fs/xfs/xfs_iops.c                     |   2 +-
 fs/xfs/xfs_reflink.h                  |   2 +-
 fs/zonefs/super.c                     |   4 +-
 include/linux/backing-dev.h           |   5 +-
 include/linux/fs.h                    |  99 ++++++++++++--
 include/linux/writeback.h             |   9 +-
 include/trace/events/writeback.h      |   8 +-
 mm/backing-dev.c                      |   2 +-
 mm/filemap.c                          |   4 +-
 mm/truncate.c                         |   6 +-
 mm/vmscan.c                           |   2 +-
 mm/workingset.c                       |   2 +-
 security/landlock/fs.c                |   2 +-
 115 files changed, 514 insertions(+), 414 deletions(-)

* [GIT PULL 04/17 for v6.19] vfs writeback
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (2 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 03/17 for v6.19] vfs inode Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 22:08   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 05/17 for v6.19] namespaces Christian Brauner
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains writeback changes for this cycle:

Features

- Allow file systems to increase the minimum writeback chunk size. The
  relatively low minimum writeback size of 4MiB means that writeback
  switches between inodes frequently on rotational media. Besides
  introducing additional seeks, this can also lead to extreme file
  fragmentation on zoned devices when a lot of files are cached relative
  to the available writeback bandwidth. This adds a superblock field that
  allows the file system to override the default size, and sets it to the
  zone size for zoned XFS.

- Add logging for slow writeback when it exceeds sysctl_hung_task_timeout_secs.
  This helps identify tasks waiting for a long time and pinpoint potential
  issues. Recording the starting jiffies is also useful when debugging a
  crashed vmcore.

- Wake up waiting tasks when finishing the writeback of a chunk.
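
To make the minimum-chunk override from the first feature above concrete, here is a small userspace model in C. It is a sketch under stated assumptions, not the kernel implementation: struct sb_model, its min_writeback_pages field, the 4KiB page size, and the rounding rule are all illustrative stand-ins chosen to show how a per-filesystem minimum can replace the 4MiB default.

```c
#include <assert.h>

/* Model constants: with 4KiB pages the 4MiB default is 1024 pages. */
#define MODEL_PAGE_SIZE		4096UL
#define MODEL_DEFAULT_MIN_PAGES	((4UL << 20) / MODEL_PAGE_SIZE)

/* Hypothetical stand-in for the superblock field described above. */
struct sb_model {
	unsigned long min_writeback_pages;	/* 0 means "use the default" */
};

/* Effective minimum chunk, in pages, for a given file system. */
static inline unsigned long min_chunk_pages(const struct sb_model *sb)
{
	return sb->min_writeback_pages ? sb->min_writeback_pages
				       : MODEL_DEFAULT_MIN_PAGES;
}

/*
 * Assumed rounding rule for this sketch: add one minimum chunk to the
 * bandwidth-derived estimate and round down to a multiple of the minimum,
 * so the result is never smaller than the minimum.
 */
static inline unsigned long chunk_pages(const struct sb_model *sb,
					unsigned long estimate_pages)
{
	unsigned long min = min_chunk_pages(sb);

	return (estimate_pages + min) / min * min;
}
```

With a file system that sets the minimum to a (hypothetical) 256MiB zone size, even a tiny estimate yields one full zone per chunk, which is the behaviour the zoned XFS change is after.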

Cleanups

- filemap_* writeback interface cleanups. Adding filemap_fdatawrite_wbc
  ended up being a mistake, as all but the original btrfs caller should
  be using better high-level interfaces instead. This series removes all
  these low-level interfaces, switches btrfs to a more specific interface,
  and cleans up other interfaces that are too low-level. With this, the
  writeback_control that is passed to the writeback code is only
  initialized in three places.

- Remove __filemap_fdatawrite, __filemap_fdatawrite_range, and
  filemap_fdatawrite_wbc

- Add filemap_flush_nr helper for btrfs

- Push struct writeback_control into start_delalloc_inodes in btrfs

- Rename filemap_fdatawrite_range_kick to filemap_flush_range

- Stop opencoding filemap_fdatawrite_range in 9p, ocfs2, and mm

- Make wbc_to_tag() inline and use it in fs.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

diff --cc include/linux/writeback.h
index 102071ffedcb,2a81816f7507..000000000000
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@@ -189,6 -189,20 +189,13 @@@ void wakeup_flusher_threads_bdi(struct
  void inode_wait_for_writeback(struct inode *inode);
  void inode_io_list_del(struct inode *inode);

 -/* writeback.h requires fs.h; it, too, is not included from here. */
 -static inline void wait_on_inode(struct inode *inode)
 -{
 -      wait_var_event(inode_state_wait_address(inode, __I_NEW),
 -                     !(READ_ONCE(inode->i_state) & I_NEW));
 -}
 -
+ static inline xa_mark_t wbc_to_tag(struct writeback_control *wbc)
+ {
+       if (wbc->sync_mode == WB_SYNC_ALL || wbc->tagged_writepages)
+               return PAGECACHE_TAG_TOWRITE;
+       return PAGECACHE_TAG_DIRTY;
+ }
+
  #ifdef CONFIG_CGROUP_WRITEBACK

  #include <linux/cgroup.h>

Merge conflicts with other trees
================================

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.writeback

for you to fetch changes up to 4952f35f0545f3b53dab8d5fd727c4827c2a2778:

  fs: Make wbc_to_tag() inline and use it in fs. (2025-10-29 23:33:48 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.writeback tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.writeback

----------------------------------------------------------------
Christian Brauner (2):
      Merge patch series "filemap_* writeback interface cleanups v2"
      Merge patch series "allow file systems to increase the minimum writeback chunk size v2"

Christoph Hellwig (13):
      mm: don't opencode filemap_fdatawrite_range in filemap_invalidate_inode
      9p: don't opencode filemap_fdatawrite_range in v9fs_mmap_vm_close
      ocfs2: don't opencode filemap_fdatawrite_range in ocfs2_journal_submit_inode_data_buffers
      btrfs: use the local tmp_inode variable in start_delalloc_inodes
      btrfs: push struct writeback_control into start_delalloc_inodes
      mm,btrfs: add a filemap_flush_nr helper
      mm: remove __filemap_fdatawrite
      mm: remove filemap_fdatawrite_wbc
      mm: remove __filemap_fdatawrite_range
      mm: rename filemap_fdatawrite_range_kick to filemap_flush_range
      writeback: cleanup writeback_chunk_size
      writeback: allow the file system to override MIN_WRITEBACK_PAGES
      xfs: set s_min_writeback_pages for zoned file systems

Julian Sun (3):
      writeback: Wake up waiting tasks when finishing the writeback of a chunk.
      writeback: Add logging for slow writeback (exceeds sysctl_hung_task_timeout_secs)
      fs: Make wbc_to_tag() inline and use it in fs.

 fs/9p/vfs_file.c                 |  17 ++----
 fs/btrfs/extent_io.c             |   5 +-
 fs/btrfs/inode.c                 |  46 +++++------------
 fs/ceph/addr.c                   |   6 +--
 fs/ext4/inode.c                  |   5 +-
 fs/f2fs/data.c                   |   5 +-
 fs/fs-writeback.c                |  55 ++++++++++++--------
 fs/gfs2/aops.c                   |   5 +-
 fs/ocfs2/journal.c               |  11 +---
 fs/super.c                       |   1 +
 fs/sync.c                        |  10 ++--
 fs/xfs/xfs_zone_alloc.c          |  28 +++++++++-
 include/linux/backing-dev-defs.h |   2 +
 include/linux/fs.h               |   7 +--
 include/linux/pagemap.h          |   5 +-
 include/linux/writeback.h        |  12 +++++
 mm/fadvise.c                     |   3 +-
 mm/filemap.c                     | 109 ++++++++++++++++-----------------------
 mm/page-writeback.c              |   6 ---
 19 files changed, 154 insertions(+), 184 deletions(-)


* [GIT PULL 05/17 for v6.19] namespaces
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (3 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 04/17 for v6.19] vfs writeback Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 19:06   ` Eric W. Biederman
  2025-12-01 22:08   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 06/17 for v6.19] vfs coredump Christian Brauner
                   ` (11 subsequent siblings)
  16 siblings, 2 replies; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains substantial namespace infrastructure changes including a new
system call, active reference counting, and extensive header cleanups.
The branch depends on the shared kbuild branch for -fms-extensions support.

Features

- listns() System Call

  Add a new listns() system call that allows userspace to iterate through
  namespaces in the system. This provides a programmatic interface to
  discover and inspect namespaces, addressing longstanding limitations:

  Currently, there is no direct way for userspace to enumerate namespaces.
  Applications must resort to scanning /proc/<pid>/ns/ across all processes,
  which is:

  1. Inefficient - requires iterating over all processes
  2. Incomplete - misses namespaces not attached to any running process but
     kept alive by file descriptors, bind mounts, or parent references
  3. Permission-heavy - requires access to /proc for many processes
  4. No ordering or ownership information
  5. No filtering per namespace type

  The listns() system call solves these problems:

  ssize_t listns(const struct ns_id_req *req, u64 *ns_ids,
                 size_t nr_ns_ids, unsigned int flags);

  struct ns_id_req {
          __u32 size;
          __u32 spare;
          __u64 ns_id;
          struct /* listns */ {
                  __u32 ns_type;
                  __u32 spare2;
                  __u64 user_ns_id;
          };
  };

  Features include:

  - Pagination support for large namespace sets

  - Filtering by namespace type (MNT_NS, NET_NS, USER_NS, etc.)

  - Filtering by owning user namespace

  - Permission checks respecting namespace isolation
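
As a rough illustration of how userspace might drive the interface quoted above, here is a minimal C sketch. Several things in it are assumptions rather than facts taken from this mail: the syscall number 470 (check your own headers for __NR_listns), the use of ns_id as a pagination cursor set to the last ID returned, and the meaning of ns_type 0 as "all types". The struct is a local mirror of the quoted layout with fixed-width types, not the UAPI definition.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: number used by the generic syscall table. */
#ifndef __NR_listns
#define __NR_listns 470
#endif

/* Local mirror of the request layout quoted above. */
struct ns_id_req_model {
	uint32_t size;
	uint32_t spare;
	uint64_t ns_id;		/* assumed: resume after this namespace ID */
	uint32_t ns_type;	/* assumed: 0 for all types, else a type filter */
	uint32_t spare2;
	uint64_t user_ns_id;	/* filter by owning user namespace, 0 for none */
};

/* Zero the request and fill in the size, type filter and cursor. */
static inline void ns_req_init(struct ns_id_req_model *req, uint32_t ns_type,
			       uint64_t after_id)
{
	memset(req, 0, sizeof(*req));
	req->size = sizeof(*req);
	req->ns_type = ns_type;
	req->ns_id = after_id;
}

/* Thin wrapper around the raw system call. */
static inline ssize_t listns_raw(const struct ns_id_req_model *req,
				 uint64_t *ids, size_t nr, unsigned int flags)
{
	return syscall(__NR_listns, req, ids, nr, flags);
}

/* Count namespaces of one type by walking the result page by page. */
static inline ssize_t listns_count(uint32_t ns_type)
{
	struct ns_id_req_model req;
	uint64_t ids[64];
	ssize_t n, total = 0;

	ns_req_init(&req, ns_type, 0);
	while ((n = listns_raw(&req, ids, 64, 0)) > 0) {
		total += n;
		req.ns_id = ids[n - 1];	/* assumed cursor semantics */
	}
	return n < 0 ? n : total;
}
```

On a kernel without the system call, listns_count() returns -1 with errno set to ENOSYS, so callers can fall back to scanning /proc.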

- Active Reference Counting

  Introduce an active reference count that tracks namespace visibility to
  userspace. A namespace is visible in the following cases:

  1. The namespace is in use by a task
  2. The namespace is persisted through a VFS object (namespace file
     descriptor or bind-mount)
  3. The namespace is a hierarchical type and is the parent of child
     namespaces

  The active reference count does not regulate lifetime (that's still done
  by the normal reference count) - it only regulates visibility to namespace
  file handles and listns().

  This prevents resurrection of namespaces that are pinned only for internal
  kernel reasons (e.g., user namespaces held by file->f_cred, lazy TLB
  references on idle CPUs, etc.) which should not be accessible via (1)-(3).
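
The split between lifetime and visibility can be pictured with a toy model in C. This is a conceptual sketch only: struct ns_model and every helper name below are hypothetical, and the real kernel bookkeeping is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy namespace with the two counters described above. */
struct ns_model {
	int passive;	/* lifetime: the object may be freed at zero */
	int active;	/* visibility: reachable from userspace while > 0 */
};

static inline bool ns_alive(const struct ns_model *ns)
{
	return ns->passive > 0;
}

static inline bool ns_visible(const struct ns_model *ns)
{
	return ns->active > 0;
}

/* A task using the namespace holds both kinds of reference. */
static inline void task_attach(struct ns_model *ns)
{
	ns->passive++;
	ns->active++;
}

static inline void task_detach(struct ns_model *ns)
{
	ns->active--;
	ns->passive--;
}

/* An internal kernel pin keeps the object alive but not visible. */
static inline void kernel_pin(struct ns_model *ns)
{
	ns->passive++;
}

static inline void kernel_unpin(struct ns_model *ns)
{
	ns->passive--;
}
```

After the last task detaches, a namespace that is still pinned internally stays alive but is no longer visible, which is the state that must not be resurrected through file handles or listns().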

- Unified Namespace Tree

  Introduce a unified tree structure for all namespaces with:

  - Fixed IDs assigned to initial namespaces

  - Lookup based solely on inode number

  - Maintained list of owned namespaces per user namespace

  - Simplified rbtree comparison helpers

Cleanups

- Header Reorganization

  - Move namespace types into separate header (ns_common_types.h)

  - Decouple nstree from ns_common header

  - Move nstree types into separate header

  - Switch to new ns_tree_{node,root} structures with helper functions

  - Use guards for ns_tree_lock

- Initial Namespace Reference Count Optimization

  - Make all reference counts on initial namespaces a nop to avoid
    pointless cacheline ping-pong for namespaces that can never go away

  - Drop custom reference count initialization for initial namespaces

  - Add NS_COMMON_INIT() macro and use it for all namespaces

  - pid: rely on common reference count behavior

- Miscellaneous Cleanups

  - Rename exit_task_namespaces() to exit_nsproxy_namespaces()

  - Rename is_initial_namespace() and make argument const

  - Use boolean to indicate anonymous mount namespace

  - Simplify owner list iteration in nstree

  - nsfs: raise SB_I_NODEV, SB_I_NOEXEC, and DCACHE_DONTCACHE explicitly

  - nsfs: use inode_just_drop()

  - pidfs: raise DCACHE_DONTCACHE explicitly

  - pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls

  - libfs: allow to specify s_d_flags

  - cgroup: add cgroup namespace to tree after owner is set

  - nsproxy: fix free_nsproxy() and simplify create_new_namespaces()

Fixes

- setns(pidfd, ...) Race Condition

  Fix a subtle race when using pidfds with setns(). When the target task
  exits after prepare_nsset() but before commit_nsset(), the namespace's
  active reference count might have been dropped. If setns() then installs
  the namespaces, it would bump the active reference count from zero without
  taking the required reference on the owner namespace, leading to underflow
  when later decremented.

  The fix resurrects the ownership chain if necessary - if the caller
  succeeded in grabbing passive references, the setns() should succeed even
  if the target task exits or gets reaped.
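
The underflow described above can be reproduced in a few lines of C. The model below is a sketch with hypothetical names (struct ns_node, active_get(), active_put(), active_get_buggy()); it assumes, for illustration, that the first active reference on a namespace also takes one on its owner and that the last one releases it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy namespace that records its owning user namespace. */
struct ns_node {
	struct ns_node *owner;	/* NULL at the top of the chain */
	int active;
};

/* Take an active reference; a 0 -> 1 transition also activates the owner. */
static inline void active_get(struct ns_node *ns)
{
	for (; ns; ns = ns->owner)
		if (ns->active++ > 0)
			break;
}

/* Drop an active reference; a 1 -> 0 transition releases the owner. */
static inline void active_put(struct ns_node *ns)
{
	for (; ns; ns = ns->owner)
		if (--ns->active > 0)
			break;
}

/* Buggy variant: bumps the namespace from zero without touching the owner. */
static inline void active_get_buggy(struct ns_node *ns)
{
	ns->active++;
}
```

If the target task exits first, both counters are back at zero. Installing the namespace with active_get_buggy() and later calling active_put() drives the owner's count to -1, while active_get() walks the ownership chain again and keeps the counts balanced.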

- Return EFAULT on put_user() error instead of success

- Make sure references are dropped outside of the RCU lock (some
  namespaces, such as the mount namespace, sleep when putting the last
  reference)

- Don't skip active reference count initialization for network namespace

- Add asserts for active refcount underflow

- Add asserts for initial namespace reference counts (both passive and
  active)

- ipc: enable is_ns_init_id() assertions

- Fix kernel-doc comments for internal nstree functions

- Selftests

  - 15 active reference count tests

  - 9 listns() functionality tests

  - 7 listns() permission tests

  - 12 inactive namespace resurrection tests

  - 3 threaded active reference count tests

  - commit_creds() active reference tests

  - Pagination and stress tests

  - EFAULT handling test

  - nsid tests fixes

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

diff --cc fs/namespace.c
index a7fd9682bcf9,25289b869be1..000000000000
--- a/fs/namespace.c
+++ b/fs/namespace.c

Merge conflicts with other trees
================================

[1] https://lore.kernel.org/linux-next/20251118110822.72e36c15@canb.auug.org.au

The following changes since commit dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa:

  Linux 6.18-rc3 (2025-10-26 15:59:49 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/namespace-6.19-rc1

for you to fetch changes up to a71e4f103aed69e7a11ea913312726bb194c76ee:

  pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls (2025-11-17 16:23:13 +0100)

Please consider pulling these changes from the signed namespace-6.19-rc1 tag.

Thanks!
Christian

----------------------------------------------------------------
namespace-6.19-rc1

----------------------------------------------------------------
Christian Brauner (107):
      libfs: allow to specify s_d_flags
      nsfs: use inode_just_drop()
      nsfs: raise DCACHE_DONTCACHE explicitly
      pidfs: raise DCACHE_DONTCACHE explicitly
      nsfs: raise SB_I_NODEV and SB_I_NOEXEC
      cgroup: add cgroup namespace to tree after owner is set
      nstree: simplify return
      ns: add missing authorship
      ns: add NS_COMMON_INIT()
      ns: use NS_COMMON_INIT() for all namespaces
      ns: initialize ns_list_node for initial namespaces
      ns: add __ns_ref_read()
      ns: rename to exit_nsproxy_namespaces()
      ns: add active reference count
      ns: use anonymous struct to group list member
      nstree: introduce a unified tree
      nstree: allow lookup solely based on inode
      nstree: assign fixed ids to the initial namespaces
      nstree: maintain list of owned namespaces
      nstree: simplify rbtree comparison helpers
      nstree: add unified namespace list
      nstree: add listns()
      arch: hookup listns() system call
      nsfs: update tools header
      selftests/filesystems: remove CLONE_NEWPIDNS from setup_userns() helper
      selftests/namespaces: first active reference count tests
      selftests/namespaces: second active reference count tests
      selftests/namespaces: third active reference count tests
      selftests/namespaces: fourth active reference count tests
      selftests/namespaces: fifth active reference count tests
      selftests/namespaces: sixth active reference count tests
      selftests/namespaces: seventh active reference count tests
      selftests/namespaces: eigth active reference count tests
      selftests/namespaces: ninth active reference count tests
      selftests/namespaces: tenth active reference count tests
      selftests/namespaces: eleventh active reference count tests
      selftests/namespaces: twelth active reference count tests
      selftests/namespaces: thirteenth active reference count tests
      selftests/namespaces: fourteenth active reference count tests
      selftests/namespaces: fifteenth active reference count tests
      selftests/namespaces: add listns() wrapper
      selftests/namespaces: first listns() test
      selftests/namespaces: second listns() test
      selftests/namespaces: third listns() test
      selftests/namespaces: fourth listns() test
      selftests/namespaces: fifth listns() test
      selftests/namespaces: sixth listns() test
      selftests/namespaces: seventh listns() test
      selftests/namespaces: eigth listns() test
      selftests/namespaces: ninth listns() test
      selftests/namespaces: first listns() permission test
      selftests/namespaces: second listns() permission test
      selftests/namespaces: third listns() permission test
      selftests/namespaces: fourth listns() permission test
      selftests/namespaces: fifth listns() permission test
      selftests/namespaces: sixth listns() permission test
      selftests/namespaces: seventh listns() permission test
      selftests/namespaces: first inactive namespace resurrection test
      selftests/namespaces: second inactive namespace resurrection test
      selftests/namespaces: third inactive namespace resurrection test
      selftests/namespaces: fourth inactive namespace resurrection test
      selftests/namespaces: fifth inactive namespace resurrection test
      selftests/namespaces: sixth inactive namespace resurrection test
      selftests/namespaces: seventh inactive namespace resurrection test
      selftests/namespaces: eigth inactive namespace resurrection test
      selftests/namespaces: ninth inactive namespace resurrection test
      selftests/namespaces: tenth inactive namespace resurrection test
      selftests/namespaces: eleventh inactive namespace resurrection test
      selftests/namespaces: twelth inactive namespace resurrection test
      selftests/namespace: first threaded active reference count test
      selftests/namespace: second threaded active reference count test
      selftests/namespace: third threaded active reference count test
      selftests/namespace: commit_creds() active reference tests
      selftests/namespace: add stress test
      selftests/namespace: test listns() pagination
      Merge patch series "nstree: listns()"
      ns: don't skip active reference count initialization
      ns: don't increment or decrement initial namespaces
      ns: make sure reference are dropped outside of rcu lock
      ns: return EFAULT on put_user() error
      ns: handle setns(pidfd, ...) cleanly
      ns: add asserts for active refcount underflow
      selftests/namespaces: add active reference count regression test
      Merge patch "kbuild: Add '-fms-extensions' to areas with dedicated CFLAGS"
      selftests/namespaces: test for efault
      Merge patch series "ns: fixes for namespace iteration and active reference counting"
      Merge branch 'kbuild-6.19.fms.extension'
      ns: move namespace types into separate header
      nstree: decouple from ns_common header
      nstree: move nstree types into separate header
      nstree: add helper to operate on struct ns_tree_{node,root}
      nstree: switch to new structures
      nstree: simplify owner list iteration
      nstree: use guards for ns_tree_lock
      ns: make is_initial_namespace() argument const
      ns: rename is_initial_namespace()
      fs: use boolean to indicate anonymous mount namespace
      ipc: enable is_ns_init_id() assertions
      ns: make all reference counts on initial namespace a nop
      ns: add asserts for initial namespace reference counts
      ns: add asserts for initial namespace active reference counts
      pid: rely on common reference count behavior
      ns: drop custom reference count initialization for initial namespaces
      selftests/namespaces: fix nsid tests
      Merge patch series "ns: header cleanups and initial namespace reference count improvements"
      nsproxy: fix free_nsproxy() and simplify create_new_namespaces()
      pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls

Kriish Sharma (1):
      nstree: fix kernel-doc comments for internal functions

Nathan Chancellor (2):
      jfs: Rename _inline to avoid conflict with clang's '-fms-extensions'
      kbuild: Add '-fms-extensions' to areas with dedicated CFLAGS

Rasmus Villemoes (1):
      Kbuild: enable -fms-extensions

 Makefile                                           |    3 +
 arch/alpha/kernel/syscalls/syscall.tbl             |    1 +
 arch/arm/tools/syscall.tbl                         |    1 +
 arch/arm64/kernel/vdso32/Makefile                  |    3 +-
 arch/arm64/tools/syscall_32.tbl                    |    1 +
 arch/loongarch/vdso/Makefile                       |    2 +-
 arch/m68k/kernel/syscalls/syscall.tbl              |    1 +
 arch/microblaze/kernel/syscalls/syscall.tbl        |    1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl          |    1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl          |    1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl          |    1 +
 arch/parisc/boot/compressed/Makefile               |    2 +-
 arch/parisc/kernel/syscalls/syscall.tbl            |    1 +
 arch/powerpc/boot/Makefile                         |    3 +-
 arch/powerpc/kernel/syscalls/syscall.tbl           |    1 +
 arch/s390/Makefile                                 |    3 +-
 arch/s390/kernel/syscalls/syscall.tbl              |    1 +
 arch/s390/purgatory/Makefile                       |    3 +-
 arch/sh/kernel/syscalls/syscall.tbl                |    1 +
 arch/sparc/kernel/syscalls/syscall.tbl             |    1 +
 arch/x86/Makefile                                  |    4 +-
 arch/x86/boot/compressed/Makefile                  |    7 +-
 arch/x86/entry/syscalls/syscall_32.tbl             |    1 +
 arch/x86/entry/syscalls/syscall_64.tbl             |    1 +
 arch/xtensa/kernel/syscalls/syscall.tbl            |    1 +
 drivers/firmware/efi/libstub/Makefile              |    4 +-
 fs/jfs/jfs_incore.h                                |    6 +-
 fs/libfs.c                                         |    1 +
 fs/mount.h                                         |    3 +-
 fs/namespace.c                                     |   12 +-
 fs/nsfs.c                                          |  101 +-
 fs/pidfs.c                                         |   76 +-
 include/linux/ns/ns_common_types.h                 |  196 ++
 include/linux/ns/nstree_types.h                    |   55 +
 include/linux/ns_common.h                          |  233 +-
 include/linux/nsfs.h                               |    3 +
 include/linux/nsproxy.h                            |    9 +-
 include/linux/nstree.h                             |   52 +-
 include/linux/pid_namespace.h                      |    3 +-
 include/linux/pseudo_fs.h                          |    1 +
 include/linux/syscalls.h                           |    4 +
 include/linux/user_namespace.h                     |    4 +-
 include/uapi/asm-generic/unistd.h                  |    4 +-
 include/uapi/linux/nsfs.h                          |   58 +
 init/version-timestamp.c                           |    7 +-
 ipc/msgutil.c                                      |    7 +-
 ipc/namespace.c                                    |    3 +-
 kernel/cgroup/cgroup.c                             |   11 +-
 kernel/cgroup/namespace.c                          |    2 +-
 kernel/cred.c                                      |    6 +
 kernel/exit.c                                      |    3 +-
 kernel/fork.c                                      |    3 +-
 kernel/nscommon.c                                  |  246 +-
 kernel/nsproxy.c                                   |   57 +-
 kernel/nstree.c                                    |  782 +++++-
 kernel/pid.c                                       |   12 +-
 kernel/pid_namespace.c                             |    2 +-
 kernel/time/namespace.c                            |    5 +-
 kernel/user.c                                      |    7 +-
 net/core/net_namespace.c                           |    2 +-
 scripts/Makefile.extrawarn                         |    4 +-
 scripts/syscall.tbl                                |    1 +
 tools/include/uapi/linux/nsfs.h                    |   70 +
 tools/testing/selftests/filesystems/utils.c        |    2 +-
 tools/testing/selftests/namespaces/.gitignore      |    9 +
 tools/testing/selftests/namespaces/Makefile        |   24 +-
 .../selftests/namespaces/cred_change_test.c        |  814 ++++++
 .../selftests/namespaces/listns_efault_test.c      |  530 ++++
 .../selftests/namespaces/listns_pagination_bug.c   |  138 +
 .../selftests/namespaces/listns_permissions_test.c |  759 ++++++
 tools/testing/selftests/namespaces/listns_test.c   |  679 +++++
 .../selftests/namespaces/ns_active_ref_test.c      | 2672 ++++++++++++++++++++
 tools/testing/selftests/namespaces/nsid_test.c     |  107 +-
 .../namespaces/regression_pidfd_setns_test.c       |  113 +
 .../testing/selftests/namespaces/siocgskns_test.c  | 1824 +++++++++++++
 tools/testing/selftests/namespaces/stress_test.c   |  626 +++++
 tools/testing/selftests/namespaces/wrappers.h      |   35 +
 77 files changed, 9997 insertions(+), 436 deletions(-)
 create mode 100644 include/linux/ns/ns_common_types.h
 create mode 100644 include/linux/ns/nstree_types.h
 create mode 100644 tools/testing/selftests/namespaces/cred_change_test.c
 create mode 100644 tools/testing/selftests/namespaces/listns_efault_test.c
 create mode 100644 tools/testing/selftests/namespaces/listns_pagination_bug.c
 create mode 100644 tools/testing/selftests/namespaces/listns_permissions_test.c
 create mode 100644 tools/testing/selftests/namespaces/listns_test.c
 create mode 100644 tools/testing/selftests/namespaces/ns_active_ref_test.c
 create mode 100644 tools/testing/selftests/namespaces/regression_pidfd_setns_test.c
 create mode 100644 tools/testing/selftests/namespaces/siocgskns_test.c
 create mode 100644 tools/testing/selftests/namespaces/stress_test.c
 create mode 100644 tools/testing/selftests/namespaces/wrappers.h


* [GIT PULL 06/17 for v6.19] vfs coredump
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (4 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 05/17 for v6.19] namespaces Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 22:08   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 07/17 for v6.19] vfs folio Christian Brauner
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains the coredump-related pidfd changes for this cycle.

Features

- Expose Coredump Signal via pidfd

  Expose the signal that caused the coredump through the pidfd interface.
  systemd is in the process of adopting the recent rework of coredump
  handling that relies on unix sockets. The previous systemd coredump
  container interface requires the coredump file descriptor and basic
  information, including the signal number, to be sent to the container.
  This means the signal number needs to be available before sending the
  coredump to the container.

- Add supported_mask Field to pidfd

  Add a new supported_mask field to struct pidfd_info that indicates which
  information fields are supported by the running kernel. This allows
  userspace to detect feature availability without relying on error codes
  or kernel version checks.
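
A feature probe based on supported_mask could look roughly like the C sketch below. It deliberately uses a model struct and made-up bit values (the MODEL_ prefixed names are hypothetical); the real layout and flag values live in the pidfd UAPI header. It also assumes that the kernel reports in mask whether supported_mask itself was filled in.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define MODEL_INFO_COREDUMP_SIGNAL	(1ULL << 0)
#define MODEL_INFO_SUPPORTED_MASK	(1ULL << 1)

/* Cut-down model of the information struct returned by the ioctl. */
struct pidfd_info_model {
	uint64_t mask;			/* fields the kernel filled in */
	uint64_t supported_mask;	/* fields this kernel can report */
	uint32_t coredump_signal;
};

/* True if the running kernel advertises support for the given field. */
static inline bool field_supported(const struct pidfd_info_model *info,
				   uint64_t bit)
{
	/* An older kernel never sets the bit, so supported_mask is unknown. */
	if (!(info->mask & MODEL_INFO_SUPPORTED_MASK))
		return false;
	return (info->supported_mask & bit) != 0;
}
```

The point of the design is that a single query tells userspace what it can ask for, instead of issuing one request per field and interpreting the errors.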

Cleanups

- Drop struct pidfs_exit_info and prepare to drop exit_info pointer,
  simplifying the internal publication mechanism for exit and coredump
  information retrievable via the pidfd ioctl.

- Use guard() for task_lock in pidfs.

- Reduce wait_pidfd lock scope.

- Add missing PIDFD_INFO_SIZE_VER1 constant.

- Add missing BUILD_BUG_ON() assert on struct pidfd_info.

Fixes

- Fix PIDFD_INFO_COREDUMP handling.

Selftests

- Split out coredump socket tests and common helpers into separate files
  for better organization.

- Fix userspace coredump client detection issues.

- Handle edge-triggered epoll correctly.

- Ignore ENOSPC errors in tests.

- Add debug logging to coredump socket tests, socket protocol tests,
  and test helpers.

- Add tests for PIDFD_INFO_COREDUMP_SIGNAL.

- Add tests for supported_mask field.

- Update pidfd header for selftests.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.coredump

for you to fetch changes up to 390d967653e17205f0e519f691b7d6be0a05ff45:

  pidfs: reduce wait_pidfd lock scope (2025-11-05 00:09:06 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.coredump tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.coredump

----------------------------------------------------------------
Christian Brauner (24):
      pidfs: use guard() for task_lock
      pidfs: fix PIDFD_INFO_COREDUMP handling
      pidfs: add missing PIDFD_INFO_SIZE_VER1
      pidfs: add missing BUILD_BUG_ON() assert on struct pidfd_info
      pidfd: add a new supported_mask field
      pidfs: prepare to drop exit_info pointer
      pidfs: drop struct pidfs_exit_info
      pidfs: expose coredump signal
      selftests/pidfd: update pidfd header
      selftests/pidfd: add first supported_mask test
      selftests/pidfd: add second supported_mask test
      selftests/coredump: split out common helpers
      selftests/coredump: split out coredump socket tests
      selftests/coredump: fix userspace client detection
      selftests/coredump: fix userspace coredump client detection
      selftests/coredump: handle edge-triggered epoll correctly
      selftests/coredump: add debug logging to test helpers
      selftests/coredump: add debug logging to coredump socket tests
      selftests/coredump: add debug logging to coredump socket protocol tests
      selftests/coredump: ignore ENOSPC errors
      selftests/coredump: add first PIDFD_INFO_COREDUMP_SIGNAL test
      selftests/coredump: add second PIDFD_INFO_COREDUMP_SIGNAL test
      Merge patch series "coredump: cleanups & pidfd extension"
      pidfs: reduce wait_pidfd lock scope

 fs/pidfs.c                                         |  113 +-
 include/uapi/linux/pidfd.h                         |   11 +-
 tools/testing/selftests/coredump/.gitignore        |    4 +
 tools/testing/selftests/coredump/Makefile          |    8 +-
 .../coredump/coredump_socket_protocol_test.c       | 1568 ++++++++++++++++++
 .../selftests/coredump/coredump_socket_test.c      |  742 +++++++++
 tools/testing/selftests/coredump/coredump_test.h   |   59 +
 .../selftests/coredump/coredump_test_helpers.c     |  383 +++++
 tools/testing/selftests/coredump/stackdump_test.c  | 1662 +-------------------
 tools/testing/selftests/pidfd/pidfd.h              |   15 +-
 tools/testing/selftests/pidfd/pidfd_info_test.c    |   73 +
 11 files changed, 2927 insertions(+), 1711 deletions(-)
 create mode 100644 tools/testing/selftests/coredump/.gitignore
 create mode 100644 tools/testing/selftests/coredump/coredump_socket_protocol_test.c
 create mode 100644 tools/testing/selftests/coredump/coredump_socket_test.c
 create mode 100644 tools/testing/selftests/coredump/coredump_test.h
 create mode 100644 tools/testing/selftests/coredump/coredump_test_helpers.c


* [GIT PULL 07/17 for v6.19] vfs folio
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (5 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 06/17 for v6.19] vfs coredump Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 22:08   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 08/17 for v6.19] cred guards Christian Brauner
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
Add a new folio_next_pos() helper function that returns the file position
of the first byte after the current folio. This is a common operation in
filesystems when needing to know the end of the current folio.

The helper is lifted from btrfs which already had its own version, and
is now used across multiple filesystems and subsystems:

- btrfs
- buffer
- ext4
- f2fs
- gfs2
- iomap
- netfs
- xfs
- mm

This fixes a long-standing bug in ocfs2 on 32-bit systems with files
larger than 2GiB. Presumably this is not a common configuration, but the
fix is backported anyway. The other filesystems did not have bugs; they
were just mildly inefficient.

This also introduces uoff_t as the unsigned version of loff_t. A recent
commit inadvertently changed a comparison from being unsigned (on 64-bit
systems) to being signed (which it had always been on 32-bit systems),
leading to sporadic fstests failures.

Generally file sizes are restricted to being a signed integer, but in
places where -1 is passed to indicate "up to the end of the file", it is
convenient to have an unsigned type to ensure comparisons are always
unsigned regardless of architecture.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

[1]: https://lore.kernel.org/linux-next/20251103085832.5d7ff280@canb.auug.org.au

[2]: https://lore.kernel.org/linux-next/20251124100508.64a6974a@canb.auug.org.au

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.folio

for you to fetch changes up to 37d369fa97cc0774ea4eab726d16bcb5fbe3a104:

  fs: Add uoff_t (2025-11-25 10:07:42 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.folio tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.folio

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "Add and use folio_next_pos()"

Matthew Wilcox (Oracle) (11):
      filemap: Add folio_next_pos()
      btrfs: Use folio_next_pos()
      buffer: Use folio_next_pos()
      ext4: Use folio_next_pos()
      f2fs: Use folio_next_pos()
      gfs2: Use folio_next_pos()
      iomap: Use folio_next_pos()
      netfs: Use folio_next_pos()
      xfs: Use folio_next_pos()
      mm: Use folio_next_pos()
      fs: Add uoff_t

 fs/btrfs/compression.h                 |  4 ++--
 fs/btrfs/defrag.c                      |  7 ++++---
 fs/btrfs/extent_io.c                   | 16 ++++++++--------
 fs/btrfs/file.c                        |  9 +++++----
 fs/btrfs/inode.c                       | 11 ++++++-----
 fs/btrfs/misc.h                        |  5 -----
 fs/btrfs/ordered-data.c                |  2 +-
 fs/btrfs/subpage.c                     |  5 +++--
 fs/buffer.c                            |  2 +-
 fs/ext4/inode.c                        | 10 +++++-----
 fs/f2fs/compress.c                     |  2 +-
 fs/gfs2/aops.c                         |  3 +--
 fs/iomap/buffered-io.c                 | 10 ++++------
 fs/netfs/buffered_write.c              |  2 +-
 fs/netfs/misc.c                        |  2 +-
 fs/ocfs2/alloc.c                       |  2 +-
 fs/xfs/scrub/xfarray.c                 |  2 +-
 fs/xfs/xfs_aops.c                      |  2 +-
 include/linux/mm.h                     |  8 ++++----
 include/linux/pagemap.h                | 11 +++++++++++
 include/linux/shmem_fs.h               |  2 +-
 include/linux/types.h                  |  1 +
 include/uapi/asm-generic/posix_types.h |  1 +
 mm/shmem.c                             |  8 ++++----
 mm/truncate.c                          |  4 ++--
 25 files changed, 70 insertions(+), 61 deletions(-)

* [GIT PULL 08/17 for v6.19] cred guards
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (6 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 07/17 for v6.19] vfs folio Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 21:53   ` Linus Torvalds
  2025-12-01 22:08   ` [GIT PULL 08/17 for v6.19] cred guards pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 09/17 for v6.19] vfs headers Christian Brauner
                   ` (8 subsequent siblings)
  16 siblings, 2 replies; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains substantial credential infrastructure improvements adding
guard-based credential management that simplifies code and eliminates
manual reference counting in many subsystems.

Features

- Kernel Credential Guards

  Add with_kernel_creds() and scoped_with_kernel_creds() guards that allow
  using the kernel credentials without allocating and copying them. This
  was requested by Linus after seeing repeated prepare_kernel_cred() calls
  that duplicate the kernel credentials only to drop them again later.

  The new guards completely avoid the allocation and never expose the
  temporary variable to hold the kernel credentials anywhere in callers.

- Generic Credential Guards

  Add scoped_with_creds() guards for the common
  override_creds()/revert_creds() pattern. This builds on earlier work
  that made override_creds()/revert_creds() completely reference count
  free.

- Prepare Credential Guards

  Add prepare credential guards for the more complex pattern of
  preparing a new set of credentials and overriding the current
  credentials with them:
  (1) prepare_creds()
  (2) modify new creds
  (3) override_creds()
  (4) revert_creds()
  (5) put_cred()

Cleanups

- Make init_cred static since it should not be directly accessed.

- Add kernel_cred() helper to properly access the kernel credentials.

- Fix scoped_class() macro that was introduced two cycles ago.

- coredump: split out do_coredump() from vfs_coredump() for cleaner
  credential handling.

- coredump: move revert_cred() before coredump_cleanup().

- coredump: mark struct mm_struct as const.

- coredump: pass struct linux_binfmt as const.

- sev-dev: use guard for path.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

diff --cc fs/backing-file.c
index 2a86bb6fcd13,ea137be16331..000000000000
--- a/fs/backing-file.c
+++ b/fs/backing-file.c
@@@ -227,40 -267,14 +267,8 @@@ ssize_t backing_file_write_iter(struct 
  	    !(file->f_mode & FMODE_CAN_ODIRECT))
  		return -EINVAL;
  
- 	old_cred = override_creds(ctx->cred);
- 	if (is_sync_kiocb(iocb)) {
- 		rwf_t rwf = iocb_to_rw_flags(flags);
- 
- 		ret = vfs_iter_write(file, iter, &iocb->ki_pos, rwf);
- 		if (ctx->end_write)
- 			ctx->end_write(iocb, ret);
- 	} else {
- 		struct backing_aio *aio;
- 
- 		ret = backing_aio_init_wq(iocb);
- 		if (ret)
- 			goto out;
- 
- 		ret = -ENOMEM;
- 		aio = kmem_cache_zalloc(backing_aio_cachep, GFP_KERNEL);
- 		if (!aio)
- 			goto out;
- 
- 		aio->orig_iocb = iocb;
- 		aio->end_write = ctx->end_write;
- 		kiocb_clone(&aio->iocb, iocb, get_file(file));
- 		aio->iocb.ki_flags = flags;
- 		aio->iocb.ki_complete = backing_aio_queue_completion;
- 		refcount_set(&aio->ref, 2);
- 		ret = vfs_iocb_iter_write(file, &aio->iocb, iter);
- 		backing_aio_put(aio);
- 		if (ret != -EIOCBQUEUED)
- 			backing_aio_cleanup(aio, ret);
- 	}
- out:
- 	revert_creds(old_cred);
 -	/*
 -	 * Stacked filesystems don't support deferred completions, don't copy
 -	 * this property in case it is set by the issuer.
 -	 */
 -	flags &= ~IOCB_DIO_CALLER_COMP;
--
- 	return ret;
+ 	scoped_with_creds(ctx->cred)
+ 		return do_backing_file_write_iter(file, iter, iocb, flags, ctx->end_write);
  }
  EXPORT_SYMBOL_GPL(backing_file_write_iter);
  
diff --cc fs/nfs/localio.c
index 656976b4f42c,0c89a9d1e089..000000000000
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@@ -620,37 -595,30 +620,34 @@@ static void nfs_local_call_read(struct 
  	struct nfs_local_kiocb *iocb =
  		container_of(work, struct nfs_local_kiocb, work);
  	struct file *filp = iocb->kiocb.ki_filp;
- 	const struct cred *save_cred;
 +	bool force_done = false;
  	ssize_t status;
 +	int n_iters;
  
- 	save_cred = override_creds(filp->f_cred);
- 
- 	n_iters = atomic_read(&iocb->n_iters);
- 	for (int i = 0; i < n_iters ; i++) {
- 		if (iocb->iter_is_dio_aligned[i]) {
- 			iocb->kiocb.ki_flags |= IOCB_DIRECT;
- 			/* Only use AIO completion if DIO-aligned segment is last */
- 			if (i == iocb->end_iter_index) {
- 				iocb->kiocb.ki_complete = nfs_local_read_aio_complete;
- 				iocb->aio_complete_work = nfs_local_read_aio_complete_work;
- 			}
- 		} else
- 			iocb->kiocb.ki_flags &= ~IOCB_DIRECT;
- 
- 		status = filp->f_op->read_iter(&iocb->kiocb, &iocb->iters[i]);
- 		if (status != -EIOCBQUEUED) {
- 			if (unlikely(status >= 0 && status < iocb->iters[i].count))
- 				force_done = true; /* Partial read */
- 			if (nfs_local_pgio_done(iocb, status, force_done)) {
- 				nfs_local_read_iocb_done(iocb);
- 				break;
+ 	scoped_with_creds(filp->f_cred) {
 -		for (int i = 0; i < iocb->n_iters ; i++) {
++		n_iters = atomic_read(&iocb->n_iters);
++		for (int i = 0; i < n_iters ; i++) {
+ 			if (iocb->iter_is_dio_aligned[i]) {
+ 				iocb->kiocb.ki_flags |= IOCB_DIRECT;
 -				iocb->kiocb.ki_complete = nfs_local_read_aio_complete;
 -				iocb->aio_complete_work = nfs_local_read_aio_complete_work;
 -			}
++				/* Only use AIO completion if DIO-aligned segment is last */
++				if (i == iocb->end_iter_index) {
++					iocb->kiocb.ki_complete = nfs_local_read_aio_complete;
++					iocb->aio_complete_work = nfs_local_read_aio_complete_work;
++				}
++			} else
++				iocb->kiocb.ki_flags &= ~IOCB_DIRECT;
+ 
 -			iocb->kiocb.ki_pos = iocb->offset[i];
+ 			status = filp->f_op->read_iter(&iocb->kiocb, &iocb->iters[i]);
+ 			if (status != -EIOCBQUEUED) {
 -				nfs_local_pgio_done(iocb->hdr, status);
 -				if (iocb->hdr->task.tk_status)
++				if (unlikely(status >= 0 && status < iocb->iters[i].count))
++					force_done = true; /* Partial read */
++				if (nfs_local_pgio_done(iocb, status, force_done)) {
++					nfs_local_read_iocb_done(iocb);
+ 					break;
++				}
  			}
  		}
  	}
--
- 	revert_creds(save_cred);
 -	if (status != -EIOCBQUEUED) {
 -		nfs_local_read_done(iocb, status);
 -		nfs_local_pgio_release(iocb);
 -	}
  }
  
  static int
@@@ -826,41 -839,20 +823,40 @@@ static void nfs_local_call_write(struc
  		container_of(work, struct nfs_local_kiocb, work);
  	struct file *filp = iocb->kiocb.ki_filp;
  	unsigned long old_flags = current->flags;
- 	const struct cred *save_cred;
 +	bool force_done = false;
  	ssize_t status;
 +	int n_iters;
  
  	current->flags |= PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO;
- 	save_cred = override_creds(filp->f_cred);
  
 -	scoped_with_creds(filp->f_cred)
 -		status = do_nfs_local_call_write(iocb, filp);
 -
 -	current->flags = old_flags;
++	scoped_with_creds(filp->f_cred) {
 +	file_start_write(filp);
- 	n_iters = atomic_read(&iocb->n_iters);
- 	for (int i = 0; i < n_iters ; i++) {
- 		if (iocb->iter_is_dio_aligned[i]) {
- 			iocb->kiocb.ki_flags |= IOCB_DIRECT;
- 			/* Only use AIO completion if DIO-aligned segment is last */
- 			if (i == iocb->end_iter_index) {
- 				iocb->kiocb.ki_complete = nfs_local_write_aio_complete;
- 				iocb->aio_complete_work = nfs_local_write_aio_complete_work;
- 			}
- 		} else
- 			iocb->kiocb.ki_flags &= ~IOCB_DIRECT;
- 
- 		status = filp->f_op->write_iter(&iocb->kiocb, &iocb->iters[i]);
- 		if (status != -EIOCBQUEUED) {
- 			if (unlikely(status >= 0 && status < iocb->iters[i].count))
- 				force_done = true; /* Partial write */
- 			if (nfs_local_pgio_done(iocb, status, force_done)) {
- 				nfs_local_write_iocb_done(iocb);
- 				break;
++		n_iters = atomic_read(&iocb->n_iters);
++		for (int i = 0; i < n_iters ; i++) {
++			if (iocb->iter_is_dio_aligned[i]) {
++				iocb->kiocb.ki_flags |= IOCB_DIRECT;
++				/* Only use AIO completion if DIO-aligned segment is last */
++				if (i == iocb->end_iter_index) {
++					iocb->kiocb.ki_complete = nfs_local_write_aio_complete;
++					iocb->aio_complete_work = nfs_local_write_aio_complete_work;
++				}
++			} else
++				iocb->kiocb.ki_flags &= ~IOCB_DIRECT;
+ 
 -	if (status != -EIOCBQUEUED) {
 -		nfs_local_write_done(iocb, status);
 -		nfs_local_vfs_getattr(iocb);
 -		nfs_local_pgio_release(iocb);
++			status = filp->f_op->write_iter(&iocb->kiocb, &iocb->iters[i]);
++			if (status != -EIOCBQUEUED) {
++				if (unlikely(status >= 0 && status < iocb->iters[i].count))
++					force_done = true; /* Partial write */
++				if (nfs_local_pgio_done(iocb, status, force_done)) {
++					nfs_local_write_iocb_done(iocb);
++					break;
++				}
 +			}
 +		}
++		file_end_write(filp);
  	}
- 	file_end_write(filp);
 +
- 	revert_creds(save_cred);
 +	current->flags = old_flags;
  }
  
  static int

Merge conflicts with other trees
================================

The following changes since commit dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa:

  Linux 6.18-rc3 (2025-10-26 15:59:49 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/kernel-6.19-rc1.cred

for you to fetch changes up to c8e00cdc7425d5c60fd1ce6e7f71e5fb1b236991:

  Merge patch series "credential guards: credential preparation" (2025-11-05 23:11:52 +0100)

Please consider pulling these changes from the signed kernel-6.19-rc1.cred tag.

Thanks!
Christian

----------------------------------------------------------------
kernel-6.19-rc1.cred

----------------------------------------------------------------
Christian Brauner (39):
      cleanup: fix scoped_class()
      cred: add kernel_cred() helper
      cred: make init_cred static
      cred: add scoped_with_kernel_creds()
      firmware: don't copy kernel creds
      nbd: don't copy kernel creds
      target: don't copy kernel creds
      unix: don't copy creds
      Merge patch series "creds: add {scoped_}with_kernel_creds()"
      cred: add scoped_with_creds() guards
      aio: use credential guards
      backing-file: use credential guards for reads
      backing-file: use credential guards for writes
      backing-file: use credential guards for splice read
      backing-file: use credential guards for splice write
      backing-file: use credential guards for mmap
      binfmt_misc: use credential guards
      erofs: use credential guards
      nfs: use credential guards in nfs_local_call_read()
      nfs: use credential guards in nfs_local_call_write()
      nfs: use credential guards in nfs_idmap_get_key()
      smb: use credential guards in cifs_get_spnego_key()
      acct: use credential guards in acct_write_process()
      cgroup: use credential guards in cgroup_attach_permissions()
      net/dns_resolver: use credential guards in dns_query()
      Merge patch series "credentials guards: the easy cases"
      cred: add prepare credential guard
      sev-dev: use guard for path
      sev-dev: use prepare credential guard
      sev-dev: use override credential guards
      coredump: move revert_cred() before coredump_cleanup()
      coredump: pass struct linux_binfmt as const
      coredump: mark struct mm_struct as const
      coredump: split out do_coredump() from vfs_coredump()
      coredump: use prepare credential guard
      coredump: use override credential guard
      trace: use prepare credential guard
      trace: use override credential guard
      Merge patch series "credential guards: credential preparation"

 drivers/base/firmware_loader/main.c   |  59 ++++++--------
 drivers/block/nbd.c                   |  54 +++++--------
 drivers/crypto/ccp/sev-dev.c          |  17 ++--
 drivers/target/target_core_configfs.c |  14 +---
 fs/aio.c                              |   6 +-
 fs/backing-file.c                     | 147 +++++++++++++++++-----------------
 fs/binfmt_misc.c                      |   7 +-
 fs/coredump.c                         | 142 ++++++++++++++++----------------
 fs/erofs/fileio.c                     |   6 +-
 fs/nfs/localio.c                      |  59 +++++++-------
 fs/nfs/nfs4idmap.c                    |   7 +-
 fs/smb/client/cifs_spnego.c           |   6 +-
 include/linux/cleanup.h               |  15 ++--
 include/linux/cred.h                  |  22 +++++
 include/linux/init_task.h             |   1 -
 include/linux/sched/coredump.h        |   2 +-
 init/init_task.c                      |  27 +++++++
 kernel/acct.c                         |  29 +++----
 kernel/cgroup/cgroup.c                |  10 +--
 kernel/cred.c                         |  27 -------
 kernel/trace/trace_events_user.c      |  22 ++---
 net/dns_resolver/dns_query.c          |   6 +-
 net/unix/af_unix.c                    |  17 +---
 security/keys/process_keys.c          |   2 +-
 24 files changed, 330 insertions(+), 374 deletions(-)

* [GIT PULL 09/17 for v6.19] vfs headers
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (7 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 08/17 for v6.19] cred guards Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 23:22   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 10/17 for v6.19] vfs super guards Christian Brauner
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains initial work to start splitting up fs.h.

Begin the long-overdue work of splitting up the monolithic fs.h header.
The header has grown to over 3000 lines and includes types and functions
for many different subsystems, making it difficult to navigate and
causing excessive compilation dependencies.

This series introduces new focused headers for superblock-related code:

- Rename fs_types.h to fs_dirent.h to better reflect its actual content
  (directory entry types)

- Add fs/super_types.h containing superblock type definitions

- Add fs/super.h containing superblock function declarations

This is the first step in a longer effort to modularize the VFS headers.

Cleanups

- Inode Field Layout Optimization (Mateusz Guzik)

  Move inode fields used during fast path lookup closer together to improve
  cache locality during path resolution.

- current_umask() Optimization (Mateusz Guzik)

  Inline current_umask() and move it to fs_struct.h. This improves
  performance by avoiding function call overhead for this frequently-used
  function, and places it in a more appropriate header since it operates
  on fs_struct.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

diff --cc include/linux/fs.h
index 1011b82977fc,64dc2e2c281f..000000000000
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@@ -2880,11 -2335,8 +2395,9 @@@ static inline void super_set_sysfs_name
        va_end(args);
  }

- extern int current_umask(void);
-
  extern void ihold(struct inode * inode);
  extern void iput(struct inode *);
 +void iput_not_last(struct inode *);
  int inode_update_timestamps(struct inode *inode, int flags);
  int generic_update_time(struct inode *, int);

@@@ -3481,11 -2929,8 +2988,8 @@@ static inline void remove_inode_hash(st
  }

  extern void inode_sb_list_add(struct inode *inode);
 -extern void inode_add_lru(struct inode *inode);
 +extern void inode_lru_list_add(struct inode *inode);

- int sb_set_blocksize(struct super_block *sb, int size);
- int __must_check sb_min_blocksize(struct super_block *sb, int size);
-
  int generic_file_mmap(struct file *, struct vm_area_struct *);
  int generic_file_mmap_prepare(struct vm_area_desc *desc);
  int generic_file_readonly_mmap(struct file *, struct vm_area_struct *);
diff --git a/include/linux/fs/super.h b/include/linux/fs/super.h
index c0d22b12c1c9..69c11b28ed65 100644
--- a/include/linux/fs/super.h
+++ b/include/linux/fs/super.h
@@ -223,7 +223,7 @@ static inline bool sb_has_encoding(const struct super_block *sb)
 }

 int sb_set_blocksize(struct super_block *sb, int size);
-int sb_min_blocksize(struct super_block *sb, int size);
+int __must_check sb_min_blocksize(struct super_block *sb, int size);

 int freeze_super(struct super_block *super, enum freeze_holder who,
                 const void *freeze_owner);
diff --git a/include/linux/fs/super_types.h b/include/linux/fs/super_types.h
index 45cfd45b9fe0..6bd3009e09b3 100644
--- a/include/linux/fs/super_types.h
+++ b/include/linux/fs/super_types.h
@@ -267,6 +267,7 @@ struct super_block {

        spinlock_t                              s_inode_wblist_lock;
        struct list_head                        s_inodes_wb;    /* writeback inodes */
+       long                                    s_min_writeback_pages;
 } __randomize_layout;

 /*

Merge conflicts with other trees
================================

The following changes since commit dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa:

  Linux 6.18-rc3 (2025-10-26 15:59:49 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.fs_header

for you to fetch changes up to dca3aa666fbd71118905d88bb1c353881002b647:

  fs: move inode fields used during fast path lookup closer together (2025-11-11 10:49:54 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.fs_header tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.fs_header

----------------------------------------------------------------
Christian Brauner (4):
      fs: rename fs_types.h to fs_dirent.h
      fs: add fs/super_types.h header
      fs: add fs/super.h header
      Merge patch series "fs: start to split up fs.h"

Mateusz Guzik (2):
      fs: inline current_umask() and move it to fs_struct.h
      fs: move inode fields used during fast path lookup closer together

 fs/9p/acl.c                               |   1 +
 fs/Makefile                               |   2 +-
 fs/btrfs/inode.c                          |   1 +
 fs/f2fs/acl.c                             |   1 +
 fs/fat/inode.c                            |   1 +
 fs/{fs_types.c => fs_dirent.c}            |   2 +-
 fs/fs_struct.c                            |   6 -
 fs/hfsplus/options.c                      |   1 +
 fs/hpfs/super.c                           |   1 +
 fs/nilfs2/nilfs.h                         |   1 +
 fs/ntfs3/super.c                          |   1 +
 fs/ocfs2/acl.c                            |   1 +
 fs/omfs/inode.c                           |   1 +
 fs/smb/client/file.c                      |   1 +
 fs/smb/client/inode.c                     |   1 +
 fs/smb/client/smb1ops.c                   |   1 +
 include/linux/fs.h                        | 533 +-----------------------------
 include/linux/fs/super.h                  | 233 +++++++++++++
 include/linux/fs/super_types.h            | 335 +++++++++++++++++++
 include/linux/{fs_types.h => fs_dirent.h} |  11 +-
 include/linux/fs_struct.h                 |   6 +
 include/linux/namei.h                     |   1 +
 22 files changed, 600 insertions(+), 542 deletions(-)
 rename fs/{fs_types.c => fs_dirent.c} (98%)
 create mode 100644 include/linux/fs/super.h
 create mode 100644 include/linux/fs/super_types.h
 rename include/linux/{fs_types.h => fs_dirent.h} (92%)

* [GIT PULL 10/17 for v6.19] vfs super guards
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (8 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 09/17 for v6.19] vfs headers Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 23:22   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 11/17 for v6.19] minix Christian Brauner
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This starts the work of introducing guards for superblock related locks.
Note that this branch includes the fs_header cleanups as a dependency.

Introduce super_write_guard for scoped superblock write protection. This
provides a guard-based alternative to the manual sb_start_write() and
sb_end_write() pattern, allowing the compiler to automatically handle
the cleanup.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa:

  Linux 6.18-rc3 (2025-10-26 15:59:49 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.guards

for you to fetch changes up to 73fd0dba0beb1d2d1695ee5452eac8dfabce3f9e:

  Merge patch series "fs: introduce super write guard" (2025-11-05 22:59:31 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.guards tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.guards

----------------------------------------------------------------
Christian Brauner (13):
      fs: rename fs_types.h to fs_dirent.h
      fs: add fs/super_types.h header
      fs: add fs/super.h header
      Merge patch series "fs: start to split up fs.h"
      fs: add super_write_guard
      btrfs: use super write guard in btrfs_reclaim_bgs_work()
      btrfs: use super write guard btrfs_run_defrag_inode()
      btrfs: use super write guard in sb_start_write()
      ext4: use super write guard in write_mmp_block()
      btrfs: use super write guard in relocating_repair_kthread()
      open: use super write guard in do_ftruncate()
      xfs: use super write guard in xfs_file_ioctl()
      Merge patch series "fs: introduce super write guard"

Mateusz Guzik (1):
      fs: inline current_umask() and move it to fs_struct.h

 fs/9p/acl.c                               |   1 +
 fs/Makefile                               |   2 +-
 fs/btrfs/block-group.c                    |  10 +-
 fs/btrfs/defrag.c                         |   7 +-
 fs/btrfs/inode.c                          |   1 +
 fs/btrfs/volumes.c                        |   9 +-
 fs/ext4/mmp.c                             |   8 +-
 fs/f2fs/acl.c                             |   1 +
 fs/fat/inode.c                            |   1 +
 fs/{fs_types.c => fs_dirent.c}            |   2 +-
 fs/fs_struct.c                            |   6 -
 fs/hfsplus/options.c                      |   1 +
 fs/hpfs/super.c                           |   1 +
 fs/nilfs2/nilfs.h                         |   1 +
 fs/ntfs3/super.c                          |   1 +
 fs/ocfs2/acl.c                            |   1 +
 fs/omfs/inode.c                           |   1 +
 fs/open.c                                 |   9 +-
 fs/smb/client/file.c                      |   1 +
 fs/smb/client/inode.c                     |   1 +
 fs/smb/client/smb1ops.c                   |   1 +
 fs/xfs/xfs_ioctl.c                        |   6 +-
 include/linux/fs.h                        | 528 +-----------------------------
 include/linux/fs/super.h                  | 238 ++++++++++++++
 include/linux/fs/super_types.h            | 335 +++++++++++++++++++
 include/linux/{fs_types.h => fs_dirent.h} |  11 +-
 include/linux/fs_struct.h                 |   6 +
 include/linux/namei.h                     |   1 +
 28 files changed, 620 insertions(+), 571 deletions(-)
 rename fs/{fs_types.c => fs_dirent.c} (98%)
 create mode 100644 include/linux/fs/super.h
 create mode 100644 include/linux/fs/super_types.h
 rename include/linux/{fs_types.h => fs_dirent.h} (92%)

* [GIT PULL 11/17 for v6.19] minix
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (9 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 10/17 for v6.19] vfs super guards Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 23:22   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 12/17 for v6.19] vfs directory delegations Christian Brauner
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
Fix two syzbot corruption bugs in the minix filesystem. Syzbot fuzzes
filesystems by trying to mount and manipulate deliberately corrupted
images. This should not lead to BUG_ONs and WARN_ONs for easy to detect
corruptions.

- Add error handling to minix filesystem for inode corruption detection,
  enabling the filesystem to report such corruptions cleanly.

- Fix a drop_nlink warning in minix_rmdir() triggered by corrupted
  directory link counts.

- Fix a drop_nlink warning in minix_rename() triggered by corrupted
  inode link counts.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.minix

for you to fetch changes up to 0d534518ce87317e884dbd1485111b0f1606a194:

  Merge patch series "Fix two syzbot corruption bugs in minix filesystem" (2025-11-05 13:45:26 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.minix tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.minix

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "Fix two syzbot corruption bugs in minix filesystem"

Jori Koolstra (3):
      Add error handling to minix filesystem for inode corruption detection
      Fix a drop_nlink warning in minix_rmdir
      Fix a drop_nlink warning in minix_rename

 fs/minix/inode.c | 16 ++++++++++++++++
 fs/minix/minix.h |  9 +++++++++
 fs/minix/namei.c | 39 ++++++++++++++++++++++++++++++++-------
 3 files changed, 57 insertions(+), 7 deletions(-)

* [GIT PULL 12/17 for v6.19] vfs directory delegations
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (10 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 11/17 for v6.19] minix Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-02  3:19   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 13/17 for v6.19] vfs directory locking Christian Brauner
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains the work for recall-only directory delegations for knfsd.

Add support for simple, recallable-only directory delegations. This was
decided at the fall NFS Bakeathon where the NFS client and server
maintainers discussed how to merge directory delegation support.

The approach starts with recallable-only delegations for several reasons:

1. RFC8881 has gaps that are being addressed in RFC8881bis. In particular,
  it requires directory position information for CB_NOTIFY callbacks,
  which is difficult to implement properly under Linux. The spec is being
  extended to allow that information to be omitted.

2. Client-side support for CB_NOTIFY still lags. The client side involves
  heuristics about when to request a delegation.

3. Early indications are that simple, recallable-only delegations can help
  performance. Anna Schumaker mentioned seeing a multi-minute speedup in
  xfstests runs with them enabled.

With these changes, userspace can also request a read lease on a
directory that will be recalled on conflicting accesses. This may be
useful for applications like Samba. Users can disable leases altogether
via the fs.leases-enable sysctl if needed.
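
As a rough userspace illustration of the paragraph above, the sketch below
probes whether a read lease can be taken on a directory. It assumes the
long-standing fcntl(F_SETLEASE, F_RDLCK) call is the entry point; this mail
does not say whether that call or the new commands added to
include/uapi/linux/fcntl.h should be used, so treat the choice as an
assumption. The helper name try_dir_read_lease() is made up for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Probe for a read lease on a directory (hypothetical helper).
 * Returns 0 if the lease was granted, or a negative errno value.
 * Assumption: F_SETLEASE is the interface; kernels or filesystems
 * without directory lease support are expected to refuse the request.
 */
static int try_dir_read_lease(const char *path)
{
	int fd, ret = 0;

	assert(path != NULL);

	fd = open(path, O_RDONLY | O_DIRECTORY);
	if (fd < 0)
		return -errno;

	if (fcntl(fd, F_SETLEASE, F_RDLCK) < 0)
		ret = -errno;			/* report why it was refused */
	else
		fcntl(fd, F_SETLEASE, F_UNLCK);	/* release the lease again */

	close(fd);
	return ret;
}
```

A real consumer would keep the descriptor open and handle the lease-break
signal instead of releasing the lease immediately; the probe only reports
whether the request is accepted.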

VFS Changes

- Dedicated Type for Delegations

  Introduce struct delegated_inode to track inodes that may have delegations
  that need to be broken. This replaces the previous approach of passing
  raw inode pointers through the delegation breaking code paths, providing
  better type safety and clearer semantics for the delegation machinery.

- Break parent directory delegations in open(..., O_CREAT) codepath

- Allow mkdir to wait for delegation break on parent

- Allow rmdir to wait for delegation break on parent

- Add try_break_deleg calls for parents to vfs_link(), vfs_rename(),
  and vfs_unlink()

- Make vfs_create(), vfs_mknod(), and vfs_symlink() break delegations
  on parent directory

- Clean up argument list for vfs_create()

- Expose delegation support to userland
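
The dedicated type and the retry-after-break pattern can be modelled in
userspace roughly as follows. This is only a sketch: the field name
di_inode, the pending_delegations counter, and try_op() are assumptions made
for the model, and the is_delegated() and break_deleg_wait() bodies here are
stand-ins, not the kernel implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's struct inode; only what the model needs. */
struct inode {
	int pending_delegations;	/* hypothetical counter for the model */
};

/* Assumed layout: a wrapper naming the inode whose delegation must break. */
struct delegated_inode {
	struct inode *di_inode;
};

static bool is_delegated(const struct delegated_inode *di)
{
	return di->di_inode != NULL;
}

/* Model of waiting for a break: recall one delegation, reset the wrapper. */
static int break_deleg_wait(struct delegated_inode *di)
{
	di->di_inode->pending_delegations--;
	di->di_inode = NULL;
	return 0;
}

/* Model of a directory operation that cannot proceed while delegated. */
static int try_op(struct inode *parent, struct delegated_inode *di)
{
	if (parent->pending_delegations > 0) {
		di->di_inode = parent;	/* tell the caller what to wait for */
		return -1;		/* stands in for a would-block error */
	}
	return 0;
}

/* Caller-side retry loop; returns the number of attempts on success. */
static int do_op_with_retry(struct inode *parent)
{
	struct delegated_inode di = { NULL };
	int attempts = 0;
	int error;

retry_deleg:
	attempts++;
	error = try_op(parent, &di);
	if (is_delegated(&di)) {
		error = break_deleg_wait(&di);
		if (!error)
			goto retry_deleg;
	}
	return error ? error : attempts;
}
```

Passing a distinct wrapper type instead of a raw struct inode pointer lets
the compiler reject callers that hand in an unrelated inode pointer by
mistake.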

Filelock Changes

- Make lease_alloc() take a flags argument

- Rework the __break_lease API to use flags

- Add struct delegated_inode

- Push the S_ISREG check down to ->setlease handlers

- Lift the ban on directory leases in generic_setlease

NFSD Changes

- Allow filecache to hold S_IFDIR files

- Allow DELEGRETURN on directories

- Wire up GET_DIR_DELEGATION handling

Fixes

- Fix kernel-doc warnings in __fcntl_getlease
- Add needed headers for new struct delegation definition

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

[1] https://lore.kernel.org/linux-next/20251117073452.2c9b0190@canb.auug.org.au

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.directory.delegations

for you to fetch changes up to 4be9e04ebf75a5c4478c1c6295e2122e5dc98f5f:

  vfs: add needed headers for new struct delegation definition (2025-11-28 10:55:34 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.directory.delegations tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.directory.delegations

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "vfs: recall-only directory delegations for knfsd"

Jeff Layton (18):
      filelock: make lease_alloc() take a flags argument
      filelock: rework the __break_lease API to use flags
      filelock: add struct delegated_inode
      filelock: push the S_ISREG check down to ->setlease handlers
      vfs: add try_break_deleg calls for parents to vfs_{link,rename,unlink}
      vfs: allow mkdir to wait for delegation break on parent
      vfs: allow rmdir to wait for delegation break on parent
      vfs: break parent dir delegations in open(..., O_CREAT) codepath
      vfs: clean up argument list for vfs_create()
      vfs: make vfs_create break delegations on parent directory
      vfs: make vfs_mknod break delegations on parent directory
      vfs: make vfs_symlink break delegations on parent dir
      filelock: lift the ban on directory leases in generic_setlease
      nfsd: allow filecache to hold S_IFDIR files
      nfsd: allow DELEGRETURN on directories
      nfsd: wire up GET_DIR_DELEGATION handling
      vfs: expose delegation support to userland
      vfs: add needed headers for new struct delegation definition

Randy Dunlap (1):
      filelock: __fcntl_getlease: fix kernel-doc warnings

 drivers/base/devtmpfs.c    |   6 +-
 fs/attr.c                  |   2 +-
 fs/cachefiles/namei.c      |   2 +-
 fs/ecryptfs/inode.c        |  11 ++-
 fs/fcntl.c                 |  13 ++++
 fs/fuse/dir.c              |   1 +
 fs/init.c                  |   6 +-
 fs/locks.c                 | 103 ++++++++++++++++++++--------
 fs/namei.c                 | 162 +++++++++++++++++++++++++++++++++------------
 fs/nfs/nfs4file.c          |   2 +
 fs/nfsd/filecache.c        |  57 ++++++++++++----
 fs/nfsd/filecache.h        |   2 +
 fs/nfsd/nfs3proc.c         |   2 +-
 fs/nfsd/nfs4proc.c         |  22 +++++-
 fs/nfsd/nfs4recover.c      |   6 +-
 fs/nfsd/nfs4state.c        | 103 +++++++++++++++++++++++++++-
 fs/nfsd/state.h            |   5 ++
 fs/nfsd/vfs.c              |  16 ++---
 fs/nfsd/vfs.h              |   2 +-
 fs/open.c                  |  12 ++--
 fs/overlayfs/overlayfs.h   |  10 +--
 fs/posix_acl.c             |   8 +--
 fs/smb/client/cifsfs.c     |   3 +
 fs/smb/server/vfs.c        |   9 ++-
 fs/utimes.c                |   4 +-
 fs/xattr.c                 |  12 ++--
 fs/xfs/scrub/orphanage.c   |   2 +-
 include/linux/filelock.h   |  98 +++++++++++++++++++++------
 include/linux/fs.h         |  24 ++++---
 include/linux/xattr.h      |   4 +-
 include/uapi/linux/fcntl.h |  16 +++++
 net/unix/af_unix.c         |   2 +-
 32 files changed, 550 insertions(+), 177 deletions(-)


* [GIT PULL 13/17 for v6.19] vfs directory locking
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (11 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 12/17 for v6.19] vfs directory delegations Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-02  3:19   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 14/17 for v6.19] overlayfs cred guards Christian Brauner
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains the work to add centralized APIs for directory locking
operations.

This series is part of a larger effort to change directory operation
locking to allow multiple concurrent operations in a directory. The
ultimate goal is to lock the target dentry(s) rather than the whole
parent directory.

To help with changing the locking protocol, this series centralizes
locking and lookup in new helper functions. The helpers establish a
pattern where it is the dentry that is being locked and unlocked
(currently the lock is held on dentry->d_parent->d_inode, but that can
change in the future).

This also changes vfs_mkdir() to unlock the parent on failure, as well
as dput()ing the dentry. This allows end_creating() to only require the
target dentry (which may be IS_ERR() after vfs_mkdir()), not the parent.
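
The calling convention described above can be illustrated with a small
userspace model. Everything with a _model suffix, plus struct dir and the
lock_depth/refcount counters, is invented for the illustration; the
ERR_PTR() helpers mimic the kernel's error-pointer idiom but are
re-implemented here. The point is only that the "end" helper needs nothing
but the (possibly error-valued) dentry.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-implementation of the error-pointer idiom. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct dir { int lock_depth; };				/* models the parent lock */
struct dentry { struct dir *parent; int refcount; };

/* Lock the parent and take a reference on the child. */
static struct dentry *start_creating_model(struct dir *parent,
					   struct dentry *child)
{
	parent->lock_depth++;
	child->parent = parent;
	child->refcount++;
	return child;
}

/* mkdir model: on failure it unlocks and drops the reference itself. */
static struct dentry *mkdir_model(struct dentry *child, bool fail)
{
	if (fail) {
		child->parent->lock_depth--;
		child->refcount--;
		return ERR_PTR(-EIO);
	}
	return child;
}

/* Needs only the dentry; an error pointer means cleanup already happened. */
static void end_creating_model(struct dentry *child)
{
	if (IS_ERR(child))
		return;
	child->parent->lock_depth--;
	child->refcount--;
}

/* One create sequence; returns 0 or a negative errno value. */
static long create_dir_model(struct dir *parent, struct dentry *child,
			     bool fail)
{
	struct dentry *d = start_creating_model(parent, child);
	long err = 0;

	d = mkdir_model(d, fail);
	if (IS_ERR(d))
		err = PTR_ERR(d);
	end_creating_model(d);	/* safe on both the success and error path */
	return err;
}
```

Because the caller never touches the parent directly, the model could later
move the lock from the parent to the dentry without changing
create_dir_model().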

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline or other vfs branches
===================================================

[1] This contains a merge conflict with the directory delegation changes:

diff --cc fs/cachefiles/namei.c
index 50c0f9c76d1f,ef22ac19545b..000000000000
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@@ -129,10 -128,12 +128,12 @@@ retry
  		if (ret < 0)
  			goto mkdir_error;
  		ret = cachefiles_inject_write_error();
- 		if (ret == 0)
+ 		if (ret == 0) {
 -			subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700);
 +			subdir = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), subdir, 0700, NULL);
- 		else
+ 		} else {
+ 			end_creating(subdir);
  			subdir = ERR_PTR(ret);
+ 		}
  		if (IS_ERR(subdir)) {
  			trace_cachefiles_vfs_error(NULL, d_inode(dir), ret,
  						   cachefiles_trace_mkdir_error);
diff --cc fs/ecryptfs/inode.c
index dc3ee0cbd77a,2ad1db2cd2ec..000000000000
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@@ -186,9 -190,12 +190,11 @@@ ecryptfs_do_create(struct inode *direct
  	struct inode *lower_dir;
  	struct inode *inode;
  
- 	rc = lock_parent(ecryptfs_dentry, &lower_dentry, &lower_dir);
- 	if (!rc)
- 		rc = vfs_create(&nop_mnt_idmap, lower_dentry, mode, NULL);
+ 	lower_dentry = ecryptfs_start_creating_dentry(ecryptfs_dentry);
+ 	if (IS_ERR(lower_dentry))
+ 		return ERR_CAST(lower_dentry);
+ 	lower_dir = lower_dentry->d_parent->d_inode;
 -	rc = vfs_create(&nop_mnt_idmap, lower_dir,
 -			lower_dentry, mode, true);
++	rc = vfs_create(&nop_mnt_idmap, lower_dentry, mode, NULL);
  	if (rc) {
  		printk(KERN_ERR "%s: Failure to create dentry in lower fs; "
  		       "rc = [%d]\n", __func__, rc);
@@@ -500,14 -511,16 +510,16 @@@ static struct dentry *ecryptfs_mkdir(st
  {
  	int rc;
  	struct dentry *lower_dentry;
+ 	struct dentry *lower_dir_dentry;
  	struct inode *lower_dir;
  
- 	rc = lock_parent(dentry, &lower_dentry, &lower_dir);
- 	if (rc)
- 		goto out;
- 
+ 	lower_dentry = ecryptfs_start_creating_dentry(dentry);
+ 	if (IS_ERR(lower_dentry))
+ 		return lower_dentry;
+ 	lower_dir_dentry = dget(lower_dentry->d_parent);
+ 	lower_dir = lower_dir_dentry->d_inode;
  	lower_dentry = vfs_mkdir(&nop_mnt_idmap, lower_dir,
 -				 lower_dentry, mode);
 +				 lower_dentry, mode, NULL);
  	rc = PTR_ERR(lower_dentry);
  	if (IS_ERR(lower_dentry))
  		goto out;
@@@ -533,14 -546,12 +545,12 @@@ static int ecryptfs_rmdir(struct inode 
  	struct inode *lower_dir;
  	int rc;
  
- 	rc = lock_parent(dentry, &lower_dentry, &lower_dir);
- 	dget(lower_dentry);	// don't even try to make the lower negative
- 	if (!rc) {
- 		if (d_unhashed(lower_dentry))
- 			rc = -EINVAL;
- 		else
- 			rc = vfs_rmdir(&nop_mnt_idmap, lower_dir, lower_dentry, NULL);
- 	}
+ 	lower_dentry = ecryptfs_start_removing_dentry(dentry);
+ 	if (IS_ERR(lower_dentry))
+ 		return PTR_ERR(lower_dentry);
+ 	lower_dir = lower_dentry->d_parent->d_inode;
+ 
 -	rc = vfs_rmdir(&nop_mnt_idmap, lower_dir, lower_dentry);
++	rc = vfs_rmdir(&nop_mnt_idmap, lower_dir, lower_dentry, NULL);
  	if (!rc) {
  		clear_nlink(d_inode(dentry));
  		fsstack_copy_attr_times(dir, lower_dir);
@@@ -561,10 -571,12 +570,12 @@@ ecryptfs_mknod(struct mnt_idmap *idmap
  	struct dentry *lower_dentry;
  	struct inode *lower_dir;
  
- 	rc = lock_parent(dentry, &lower_dentry, &lower_dir);
- 	if (!rc)
- 		rc = vfs_mknod(&nop_mnt_idmap, lower_dir,
- 			       lower_dentry, mode, dev, NULL);
+ 	lower_dentry = ecryptfs_start_creating_dentry(dentry);
+ 	if (IS_ERR(lower_dentry))
+ 		return PTR_ERR(lower_dentry);
+ 	lower_dir = lower_dentry->d_parent->d_inode;
+ 
 -	rc = vfs_mknod(&nop_mnt_idmap, lower_dir, lower_dentry, mode, dev);
++	rc = vfs_mknod(&nop_mnt_idmap, lower_dir, lower_dentry, mode, dev, NULL);
  	if (rc || d_really_is_negative(lower_dentry))
  		goto out;
  	rc = ecryptfs_interpose(lower_dentry, dentry, dir->i_sb);
diff --cc fs/namei.c
index 13041756d941,d284ebae41bf..000000000000
--- a/fs/namei.c
+++ b/fs/namei.c
@@@ -4717,12 -5171,10 +5288,11 @@@ retry
  	error = security_path_rmdir(&path, dentry);
  	if (error)
  		goto exit4;
 -	error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode, dentry);
 +	error = vfs_rmdir(mnt_idmap(path.mnt), path.dentry->d_inode,
 +			  dentry, &delegated_inode);
  exit4:
- 	dput(dentry);
+ 	end_dirop(dentry);
  exit3:
- 	inode_unlock(path.dentry->d_inode);
  	mnt_drop_write(path.mnt);
  exit2:
  	path_put(&path);
@@@ -4845,31 -5289,33 +5415,33 @@@ retry
  
  	error = mnt_want_write(path.mnt);
  	if (error)
- 		goto exit2;
+ 		goto exit_path_put;
  retry_deleg:
- 	inode_lock_nested(path.dentry->d_inode, I_MUTEX_PARENT);
- 	dentry = lookup_one_qstr_excl(&last, path.dentry, lookup_flags);
+ 	dentry = start_dirop(path.dentry, &last, lookup_flags);
  	error = PTR_ERR(dentry);
- 	if (!IS_ERR(dentry)) {
+ 	if (IS_ERR(dentry))
+ 		goto exit_drop_write;
  
- 		/* Why not before? Because we want correct error value */
- 		if (last.name[last.len])
- 			goto slashes;
- 		inode = dentry->d_inode;
- 		ihold(inode);
- 		error = security_path_unlink(&path, dentry);
- 		if (error)
- 			goto exit3;
- 		error = vfs_unlink(mnt_idmap(path.mnt), path.dentry->d_inode,
- 				   dentry, &delegated_inode);
- exit3:
- 		dput(dentry);
+ 	/* Why not before? Because we want correct error value */
+ 	if (unlikely(last.name[last.len])) {
+ 		if (d_is_dir(dentry))
+ 			error = -EISDIR;
+ 		else
+ 			error = -ENOTDIR;
+ 		end_dirop(dentry);
+ 		goto exit_drop_write;
  	}
- 	inode_unlock(path.dentry->d_inode);
- 	if (inode)
- 		iput(inode);	/* truncate the inode here */
- 	inode = NULL;
+ 	inode = dentry->d_inode;
+ 	ihold(inode);
+ 	error = security_path_unlink(&path, dentry);
+ 	if (error)
+ 		goto exit_end_dirop;
+ 	error = vfs_unlink(mnt_idmap(path.mnt), path.dentry->d_inode,
+ 			   dentry, &delegated_inode);
+ exit_end_dirop:
+ 	end_dirop(dentry);
+ 	iput(inode);	/* truncate the inode here */
 -	if (delegated_inode) {
 +	if (is_delegated(&delegated_inode)) {
  		error = break_deleg_wait(&delegated_inode);
  		if (!error)
  			goto retry_deleg;
@@@ -5407,11 -5824,8 +5972,8 @@@ int do_renameat2(int olddfd, struct fil
  	struct path old_path, new_path;
  	struct qstr old_last, new_last;
  	int old_type, new_type;
 -	struct inode *delegated_inode = NULL;
 +	struct delegated_inode delegated_inode = { };
- 	unsigned int lookup_flags = 0, target_flags =
- 		LOOKUP_RENAME_TARGET | LOOKUP_CREATE;
+ 	unsigned int lookup_flags = 0;
  	bool should_retry = false;
  	int error = -EINVAL;
  
@@@ -5480,44 -5883,24 +6031,24 @@@ retry_deleg
  		}
  	}
  	/* unless the source is a directory trailing slashes give -ENOTDIR */
- 	if (!d_is_dir(old_dentry)) {
+ 	if (!d_is_dir(rd.old_dentry)) {
  		error = -ENOTDIR;
  		if (old_last.name[old_last.len])
- 			goto exit5;
+ 			goto exit_unlock;
  		if (!(flags & RENAME_EXCHANGE) && new_last.name[new_last.len])
- 			goto exit5;
+ 			goto exit_unlock;
  	}
- 	/* source should not be ancestor of target */
- 	error = -EINVAL;
- 	if (old_dentry == trap)
- 		goto exit5;
- 	/* target should not be an ancestor of source */
- 	if (!(flags & RENAME_EXCHANGE))
- 		error = -ENOTEMPTY;
- 	if (new_dentry == trap)
- 		goto exit5;
  
- 	error = security_path_rename(&old_path, old_dentry,
- 				     &new_path, new_dentry, flags);
+ 	error = security_path_rename(&old_path, rd.old_dentry,
+ 				     &new_path, rd.new_dentry, flags);
  	if (error)
- 		goto exit5;
+ 		goto exit_unlock;
  
- 	rd.old_parent	   = old_path.dentry;
- 	rd.old_dentry	   = old_dentry;
- 	rd.mnt_idmap	   = mnt_idmap(old_path.mnt);
- 	rd.new_parent	   = new_path.dentry;
- 	rd.new_dentry	   = new_dentry;
- 	rd.delegated_inode = &delegated_inode;
- 	rd.flags	   = flags;
  	error = vfs_rename(&rd);
- exit5:
- 	dput(new_dentry);
- exit4:
- 	dput(old_dentry);
- exit3:
- 	unlock_rename(new_path.dentry, old_path.dentry);
+ exit_unlock:
+ 	end_renaming(&rd);
  exit_lock_rename:
 -	if (delegated_inode) {
 +	if (is_delegated(&delegated_inode)) {
  		error = break_deleg_wait(&delegated_inode);
  		if (!error)
  			goto retry_deleg;
diff --cc fs/nfsd/nfs4recover.c
index 30bae93931d9,18c08395b273..000000000000
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@@ -212,15 -210,13 +210,13 @@@ nfsd4_create_clid_dir(struct nfs4_clien
  		 * In the 4.0 case, we should never get here; but we may
  		 * as well be forgiving and just succeed silently.
  		 */
- 		goto out_put;
+ 		goto out_end;
 -	dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, S_IRWXU);
 +	dentry = vfs_mkdir(&nop_mnt_idmap, d_inode(dir), dentry, 0700, NULL);
  	if (IS_ERR(dentry))
  		status = PTR_ERR(dentry);
- out_put:
- 	if (!status)
- 		dput(dentry);
- out_unlock:
- 	inode_unlock(d_inode(dir));
+ out_end:
+ 	end_creating(dentry);
+ out:
  	if (status == 0) {
  		if (nn->in_grace)
  			__nfsd4_create_reclaim_record_grace(clp, dname,
@@@ -328,20 -324,12 +324,12 @@@ nfsd4_unlink_clid_dir(char *name, struc
  	dprintk("NFSD: nfsd4_unlink_clid_dir. name %s\n", name);
  
  	dir = nn->rec_file->f_path.dentry;
- 	inode_lock_nested(d_inode(dir), I_MUTEX_PARENT);
- 	dentry = lookup_one(&nop_mnt_idmap, &QSTR(name), dir);
- 	if (IS_ERR(dentry)) {
- 		status = PTR_ERR(dentry);
- 		goto out_unlock;
- 	}
- 	status = -ENOENT;
- 	if (d_really_is_negative(dentry))
- 		goto out;
+ 	dentry = start_removing(&nop_mnt_idmap, dir, &QSTR(name));
+ 	if (IS_ERR(dentry))
+ 		return PTR_ERR(dentry);
+ 
 -	status = vfs_rmdir(&nop_mnt_idmap, d_inode(dir), dentry);
 +	status = vfs_rmdir(&nop_mnt_idmap, d_inode(dir), dentry, NULL);
- out:
- 	dput(dentry);
- out_unlock:
- 	inode_unlock(d_inode(dir));
+ 	end_removing(dentry);
  	return status;
  }
  

Merge conflicts with other trees
================================

[1]: https://lore.kernel.org/linux-next/20251121082731.0e39ee5d@canb.auug.org.au

[2]: https://lore.kernel.org/linux-next/20251121083333.48687f3e@canb.auug.org.au

[3]: https://lore.kernel.org/linux-next/20251121084211.7accff09@canb.auug.org.au

[4]: https://lore.kernel.org/linux-next/20251121084753.585ab636@canb.auug.org.au

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.directory.locking

for you to fetch changes up to eeec741ee0df36e79a847bb5423f9eef4ed96071:

  nfsd: fix end_creating() conversion (2025-11-28 09:51:16 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.directory.locking tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.directory.locking

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "Create and use APIs to centralise locking for directory ops."

Neil Brown (1):
      nfsd: fix end_creating() conversion

NeilBrown (15):
      debugfs: rename end_creating() to debugfs_end_creating()
      VFS: introduce start_dirop() and end_dirop()
      VFS: tidy up do_unlinkat()
      VFS/nfsd/cachefiles/ovl: add start_creating() and end_creating()
      VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()
      VFS: introduce start_creating_noperm() and start_removing_noperm()
      smb/server: use end_removing_noperm for for target of smb2_create_link()
      VFS: introduce start_removing_dentry()
      VFS: add start_creating_killable() and start_removing_killable()
      VFS/nfsd/ovl: introduce start_renaming() and end_renaming()
      VFS/ovl/smb: introduce start_renaming_dentry()
      Add start_renaming_two_dentries()
      ecryptfs: use new start_creating/start_removing APIs
      VFS: change vfs_mkdir() to unlock on failure.
      VFS: introduce end_creating_keep()

 Documentation/filesystems/porting.rst |  13 +
 fs/btrfs/ioctl.c                      |  41 +-
 fs/cachefiles/interface.c             |  11 +-
 fs/cachefiles/namei.c                 |  96 +++--
 fs/cachefiles/volume.c                |   9 +-
 fs/debugfs/inode.c                    |  74 ++--
 fs/ecryptfs/inode.c                   | 153 ++++---
 fs/fuse/dir.c                         |  19 +-
 fs/internal.h                         |   3 +
 fs/libfs.c                            |  36 +-
 fs/namei.c                            | 747 +++++++++++++++++++++++++++++-----
 fs/nfsd/nfs3proc.c                    |  14 +-
 fs/nfsd/nfs4proc.c                    |  14 +-
 fs/nfsd/nfs4recover.c                 |  34 +-
 fs/nfsd/nfsproc.c                     |  14 +-
 fs/nfsd/vfs.c                         | 157 +++----
 fs/overlayfs/copy_up.c                |  73 ++--
 fs/overlayfs/dir.c                    | 241 ++++++-----
 fs/overlayfs/overlayfs.h              |  47 ++-
 fs/overlayfs/readdir.c                |   8 +-
 fs/overlayfs/super.c                  |  49 +--
 fs/overlayfs/util.c                   |  11 -
 fs/smb/server/smb2pdu.c               |   6 +-
 fs/smb/server/vfs.c                   | 114 ++----
 fs/smb/server/vfs.h                   |   8 +-
 fs/xfs/scrub/orphanage.c              |  11 +-
 include/linux/fs.h                    |   2 +
 include/linux/namei.h                 |  82 ++++
 ipc/mqueue.c                          |  32 +-
 security/apparmor/apparmorfs.c        |   8 +-
 security/selinux/selinuxfs.c          |  15 +-
 31 files changed, 1303 insertions(+), 839 deletions(-)


* [GIT PULL 14/17 for v6.19] overlayfs cred guards
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (12 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 13/17 for v6.19] vfs directory locking Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-02  3:19   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 15/17 for v6.19] autofs Christian Brauner
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This converts all of overlayfs to use credential guards, eliminating
manual credential management throughout the filesystem. It depends on
the directory locking changes, the kbuild -fms-extensions support, and
the credential guard infrastructure.

Complete Credential Guard Conversion

- Convert all of overlayfs to use credential guards, replacing the manual
  ovl_override_creds()/ovl_revert_creds() pattern with scoped guards. This
  makes credential handling visually explicit and eliminates a class of
  potential bugs from mismatched override/revert calls.

  (1) Basic credential guard (with_ovl_creds)
  (2) Creator credential guard (ovl_override_creator_creds):

      Introduced a specialized guard for file creation operations that handles
      the two-phase credential override (mounter credentials, then fs{g,u}id
      override). The new pattern is much clearer:

      with_ovl_creds(dentry->d_sb) {
              scoped_class(prepare_creds_ovl, cred, dentry, inode, mode) {
                      if (IS_ERR(cred))
                              return PTR_ERR(cred);
                      /* creation operations */
              }
      }

  (3) Copy-up credential guard (ovl_cu_creds):

      Introduced a specialized guard for copy-up operations, simplifying the
      previous struct ovl_cu_creds helper and associated functions.

      Ported ovl_copy_up_workdir() and ovl_copy_up_tmpfile() to this pattern.
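
For readers unfamiliar with scope-based guards, here is a userspace sketch of
the underlying mechanism, using the GCC/Clang cleanup attribute. It is not
the overlayfs code: struct cred is reduced to one field, and GUARD_CREDS(),
override_creds_model(), revert_creds_model() and the three credential
objects are names invented for this model.

```c
#include <assert.h>
#include <stddef.h>

struct cred { int fsuid; };

static const struct cred task_cred    = { .fsuid = 1000 };
static const struct cred mounter_cred = { .fsuid = 0 };
static const struct cred creator_cred = { .fsuid = 1234 };

/* Models the credentials the current task is acting with. */
static const struct cred *current_cred_model = &task_cred;

static const struct cred *override_creds_model(const struct cred *new_cred)
{
	const struct cred *old = current_cred_model;

	current_cred_model = new_cred;
	return old;
}

/* Cleanup handler: called with a pointer to the guard variable. */
static void revert_creds_model(const struct cred **old)
{
	current_cred_model = *old;
}

/* Override now, revert automatically when the enclosing scope ends. */
#define GUARD_CREDS(new_cred)						\
	const struct cred *saved_cred_					\
		__attribute__((cleanup(revert_creds_model), unused)) =	\
		override_creds_model(new_cred)

/* Two-phase override: mounter credentials, then creator credentials. */
static int create_two_phase(int fail, int *fsuid_during_create)
{
	GUARD_CREDS(&mounter_cred);
	{
		GUARD_CREDS(&creator_cred);
		if (fail)
			return -1;	/* both guards still revert */
		*fsuid_during_create = current_cred_model->fsuid;
	}
	/* Inner guard has reverted: acting as the mounter again. */
	return current_cred_model->fsuid;
}
```

Since the revert is tied to scope exit, an early return cannot leave the
task running with the wrong credentials, which is the class of bug the
conversion aims to remove.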

Cleanups

- Remove ovl_revert_creds() after all callers converted to guards

- Remove struct ovl_cu_creds and associated functions

- Drop ovl_setup_cred_for_create() after conversion

- Refactor ovl_fill_super(), ovl_lookup(), ovl_iterate(), ovl_rename()
  for cleaner credential guard scope

- Introduce struct ovl_renamedata to simplify rename handling

- Don't override credentials for ovl_check_whiteouts() (unnecessary)

- Remove unneeded semicolon

Dependencies

- Directory locking changes

- Kbuild -fms-extensions support

- Kernel credential guard infrastructure

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

The following changes since commit dcb6fa37fd7bc9c3d2b066329b0d27dedf8becaa:

  Linux 6.18-rc3 (2025-10-26 15:59:49 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.ovl

for you to fetch changes up to 2579e21be532457742d4100bbda1c2a5b81cbdef:

  ovl: remove unneeded semicolon (2025-11-28 11:05:52 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.ovl tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.ovl

----------------------------------------------------------------
Chen Ni (1):
      ovl: remove unneeded semicolon

Christian Brauner (99):
      cleanup: fix scoped_class()
      cred: add kernel_cred() helper
      cred: make init_cred static
      cred: add scoped_with_kernel_creds()
      firmware: don't copy kernel creds
      nbd: don't copy kernel creds
      target: don't copy kernel creds
      unix: don't copy creds
      Merge patch series "creds: add {scoped_}with_kernel_creds()"
      cred: add scoped_with_creds() guards
      aio: use credential guards
      backing-file: use credential guards for reads
      backing-file: use credential guards for writes
      backing-file: use credential guards for splice read
      backing-file: use credential guards for splice write
      backing-file: use credential guards for mmap
      binfmt_misc: use credential guards
      erofs: use credential guards
      nfs: use credential guards in nfs_local_call_read()
      nfs: use credential guards in nfs_local_call_write()
      nfs: use credential guards in nfs_idmap_get_key()
      smb: use credential guards in cifs_get_spnego_key()
      act: use credential guards in acct_write_process()
      cgroup: use credential guards in cgroup_attach_permissions()
      net/dns_resolver: use credential guards in dns_query()
      Merge patch series "credentials guards: the easy cases"
      cred: add prepare credential guard
      sev-dev: use guard for path
      sev-dev: use prepare credential guard
      sev-dev: use override credential guards
      coredump: move revert_cred() before coredump_cleanup()
      coredump: pass struct linux_binfmt as const
      coredump: mark struct mm_struct as const
      coredump: split out do_coredump() from vfs_coredump()
      coredump: use prepare credential guard
      coredump: use override credential guard
      trace: use prepare credential guard
      trace: use override credential guard
      Merge patch series "credential guards: credential preparation"
      Merge patch "kbuild: Add '-fms-extensions' to areas with dedicated CFLAGS"
      Merge patch series "Create and use APIs to centralise locking for directory ops."
      Merge branch 'kbuild-6.19.fms.extension'
      Merge branch 'vfs-6.19.directory.locking' into base.vfs-6.19.ovl
      ovl: add override_creds cleanup guard extension for overlayfs
      ovl: port ovl_copy_up_flags() to cred guards
      ovl: port ovl_create_or_link() to cred guard
      ovl: port ovl_set_link_redirect() to cred guard
      ovl: port ovl_do_remove() to cred guard
      ovl: port ovl_create_tmpfile() to cred guard
      ovl: port ovl_open_realfile() to cred guard
      ovl: port ovl_llseek() to cred guard
      ovl: port ovl_fsync() to cred guard
      ovl: port ovl_fallocate() to cred guard
      ovl: port ovl_fadvise() to cred guard
      ovl: port ovl_flush() to cred guard
      ovl: port ovl_setattr() to cred guard
      ovl: port ovl_getattr() to cred guard
      ovl: port ovl_permission() to cred guard
      ovl: port ovl_get_link() to cred guard
      ovl: port do_ovl_get_acl() to cred guard
      ovl: port ovl_set_or_remove_acl() to cred guard
      ovl: port ovl_fiemap() to cred guard
      ovl: port ovl_fileattr_set() to cred guard
      ovl: port ovl_fileattr_get() to cred guard
      ovl: port ovl_maybe_validate_verity() to cred guard
      ovl: port ovl_maybe_lookup_lowerdata() to cred guard
      ovl: don't override credentials for ovl_check_whiteouts()
      ovl: refactor ovl_iterate() and port to cred guard
      ovl: port ovl_dir_llseek() to cred guard
      ovl: port ovl_check_empty_dir() to cred guard
      ovl: port ovl_nlink_start() to cred guard
      ovl: port ovl_nlink_end() to cred guard
      ovl: port ovl_xattr_set() to cred guard
      ovl: port ovl_xattr_get() to cred guard
      ovl: port ovl_listxattr() to cred guard
      ovl: introduce struct ovl_renamedata
      ovl: refactor ovl_rename()
      ovl: port ovl_rename() to cred guard
      ovl: port ovl_copyfile() to cred guard
      ovl: refactor ovl_lookup()
      ovl: port ovl_lookup() to cred guard
      ovl: port ovl_lower_positive() to cred guard
      ovl: refactor ovl_fill_super()
      ovl: port ovl_fill_super() to cred guard
      ovl: remove ovl_revert_creds()
      Merge patch series "ovl: convert to cred guard"
      ovl: add ovl_override_creator_creds cred guard
      ovl: port ovl_create_tmpfile() to new ovl_override_creator_creds cleanup guard
      ovl: reflow ovl_create_or_link()
      ovl: mark ovl_setup_cred_for_create() as unused temporarily
      ovl: port ovl_create_or_link() to new ovl_override_creator_creds cleanup guard
      ovl: drop ovl_setup_cred_for_create()
      ovl: add copy up credential guard
      ovl: port ovl_copy_up_workdir() to cred guard
      ovl: mark *_cu_creds() as unused temporarily
      ovl: port ovl_copy_up_tmpfile() to cred guard
      ovl: remove struct ovl_cu_creds and associated functions
      Merge patch series "ovl: convert creation credential override to cred guard"
      Merge patch series "ovl: convert copyup credential override to cred guard"

Nathan Chancellor (2):
      jfs: Rename _inline to avoid conflict with clang's '-fms-extensions'
      kbuild: Add '-fms-extensions' to areas with dedicated CFLAGS

NeilBrown (15):
      debugfs: rename end_creating() to debugfs_end_creating()
      VFS: introduce start_dirop() and end_dirop()
      VFS: tidy up do_unlinkat()
      VFS/nfsd/cachefiles/ovl: add start_creating() and end_creating()
      VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()
      VFS: introduce start_creating_noperm() and start_removing_noperm()
      smb/server: use end_removing_noperm for for target of smb2_create_link()
      VFS: introduce start_removing_dentry()
      VFS: add start_creating_killable() and start_removing_killable()
      VFS/nfsd/ovl: introduce start_renaming() and end_renaming()
      VFS/ovl/smb: introduce start_renaming_dentry()
      Add start_renaming_two_dentries()
      ecryptfs: use new start_creating/start_removing APIs
      VFS: change vfs_mkdir() to unlock on failure.
      VFS: introduce end_creating_keep()

Rasmus Villemoes (1):
      Kbuild: enable -fms-extensions

 Documentation/filesystems/porting.rst |  13 +
 Makefile                              |   3 +
 arch/arm64/kernel/vdso32/Makefile     |   3 +-
 arch/loongarch/vdso/Makefile          |   2 +-
 arch/parisc/boot/compressed/Makefile  |   2 +-
 arch/powerpc/boot/Makefile            |   3 +-
 arch/s390/Makefile                    |   3 +-
 arch/s390/purgatory/Makefile          |   3 +-
 arch/x86/Makefile                     |   4 +-
 arch/x86/boot/compressed/Makefile     |   7 +-
 drivers/base/firmware_loader/main.c   |  59 ++-
 drivers/block/nbd.c                   |  54 +--
 drivers/crypto/ccp/sev-dev.c          |  17 +-
 drivers/firmware/efi/libstub/Makefile |   4 +-
 drivers/target/target_core_configfs.c |  14 +-
 fs/aio.c                              |   6 +-
 fs/backing-file.c                     | 147 +++----
 fs/binfmt_misc.c                      |   7 +-
 fs/btrfs/ioctl.c                      |  41 +-
 fs/cachefiles/interface.c             |  11 +-
 fs/cachefiles/namei.c                 |  96 +++--
 fs/cachefiles/volume.c                |   9 +-
 fs/coredump.c                         | 142 +++----
 fs/debugfs/inode.c                    |  74 ++--
 fs/ecryptfs/inode.c                   | 153 ++++---
 fs/erofs/fileio.c                     |   6 +-
 fs/fuse/dir.c                         |  19 +-
 fs/internal.h                         |   3 +
 fs/jfs/jfs_incore.h                   |   6 +-
 fs/libfs.c                            |  36 +-
 fs/namei.c                            | 747 +++++++++++++++++++++++++++++-----
 fs/nfs/localio.c                      |  59 +--
 fs/nfs/nfs4idmap.c                    |   7 +-
 fs/nfsd/nfs3proc.c                    |  14 +-
 fs/nfsd/nfs4proc.c                    |  14 +-
 fs/nfsd/nfs4recover.c                 |  34 +-
 fs/nfsd/nfsproc.c                     |  11 +-
 fs/nfsd/vfs.c                         | 151 +++----
 fs/overlayfs/copy_up.c                | 143 +++----
 fs/overlayfs/dir.c                    | 585 +++++++++++++-------------
 fs/overlayfs/file.c                   |  97 ++---
 fs/overlayfs/inode.c                  | 118 +++---
 fs/overlayfs/namei.c                  | 402 +++++++++---------
 fs/overlayfs/overlayfs.h              |  53 ++-
 fs/overlayfs/readdir.c                | 110 ++---
 fs/overlayfs/super.c                  | 138 +++----
 fs/overlayfs/util.c                   |  29 +-
 fs/overlayfs/xattrs.c                 |  35 +-
 fs/smb/client/cifs_spnego.c           |   6 +-
 fs/smb/server/smb2pdu.c               |   6 +-
 fs/smb/server/vfs.c                   | 114 ++----
 fs/smb/server/vfs.h                   |   8 +-
 fs/xfs/scrub/orphanage.c              |  11 +-
 include/linux/cleanup.h               |  15 +-
 include/linux/cred.h                  |  22 +
 include/linux/fs.h                    |   2 +
 include/linux/init_task.h             |   1 -
 include/linux/namei.h                 |  82 ++++
 include/linux/sched/coredump.h        |   2 +-
 init/init_task.c                      |  27 ++
 ipc/mqueue.c                          |  32 +-
 kernel/acct.c                         |  29 +-
 kernel/cgroup/cgroup.c                |  10 +-
 kernel/cred.c                         |  27 --
 kernel/trace/trace_events_user.c      |  22 +-
 net/dns_resolver/dns_query.c          |   6 +-
 net/unix/af_unix.c                    |  17 +-
 scripts/Makefile.extrawarn            |   4 +-
 security/apparmor/apparmorfs.c        |   8 +-
 security/keys/process_keys.c          |   2 +-
 security/selinux/selinuxfs.c          |  15 +-
 71 files changed, 2276 insertions(+), 1886 deletions(-)

* [GIT PULL 15/17 for v6.19] autofs
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (13 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 14/17 for v6.19] overlayfs cred guards Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-02  3:19   ` pr-tracker-bot
  2025-11-28 16:48 ` [GIT PULL 16/17 for v6.19] vfs fd prepare Christian Brauner
  2025-11-28 16:48 ` [GIT PULL 17/17 for v6.19] vfs fd prepare minimal Christian Brauner
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
Prevent Futile Mount Triggers in Private Mount Namespaces

Fix a problematic loop in autofs when a mount namespace contains autofs
mounts that are propagation private and there is no namespace-specific
automount daemon to handle possible automounting.

Previously, attempted path resolution would loop until MAXSYMLINKS was
reached before failing, causing significant noise in the log.

The fix adds a check in autofs ->d_automount() so that the VFS can
immediately return EPERM in this case. Since the mount is propagation
private, EPERM is the most appropriate error code.
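
To make the effect of the check concrete, here is a small userspace
model. It is only a sketch and not the autofs code: struct toy_trigger,
toy_request_mount() and toy_follow_automount() are hypothetical names,
and the -ELOOP result for the old behaviour is an assumption of the
model. Only the MAXSYMLINKS retry bound and the EPERM result come from
the description above.

```c
#include <errno.h>
#include <stdbool.h>

#define MAXSYMLINKS 40	/* retry bound used by this model */

/* Hypothetical stand-in for an autofs trigger seen from one mount namespace. */
struct toy_trigger {
	bool propagation_private;	/* mount is propagation private */
	bool daemon_present;		/* a daemon serves this namespace */
	int attempts;			/* mount requests issued so far */
};

/* Ask for a mount; without a daemon the request can never succeed. */
bool toy_request_mount(struct toy_trigger *t)
{
	t->attempts++;
	return t->daemon_present;
}

/*
 * Walk over the trigger. With early_check set, the futile case is
 * rejected before any mount request is made; otherwise the walk
 * retries until the bound is hit.
 */
int toy_follow_automount(struct toy_trigger *t, bool early_check)
{
	if (early_check && t->propagation_private && !t->daemon_present)
		return -EPERM;

	for (int i = 0; i < MAXSYMLINKS; i++) {
		if (toy_request_mount(t))
			return 0;
	}
	return -ELOOP;	/* assumed failure code for the exhausted walk */
}
```

In this model a private trigger without a daemon costs MAXSYMLINKS
futile requests when early_check is false and none when it is true.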

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

[1]: https://lore.kernel.org/linux-next/20251121153059.48e3d2fa@canb.auug.org.au

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.autofs

for you to fetch changes up to 922a6f34c1756d2b0c35d9b2d915b8af19e85965:

  autofs: dont trigger mount if it cant succeed (2025-11-19 11:14:02 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.autofs tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.autofs

----------------------------------------------------------------
Ian Kent (1):
      autofs: dont trigger mount if it cant succeed

 fs/autofs/autofs_i.h  | 5 +++++
 fs/autofs/dev-ioctl.c | 1 +
 fs/autofs/inode.c     | 1 +
 fs/autofs/root.c      | 8 ++++++++
 fs/namespace.c        | 6 ++++++
 include/linux/fs.h    | 1 +
 6 files changed, 22 insertions(+)

* [GIT PULL 16/17 for v6.19] vfs fd prepare
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (14 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 15/17 for v6.19] autofs Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-01 14:15   ` Al Viro
  2025-11-28 16:48 ` [GIT PULL 17/17 for v6.19] vfs fd prepare minimal Christian Brauner
  16 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
Note: This work came late in the cycle but the series is quite nice and
worth doing. It removes roughly double the code that it adds and
eliminates a lot of convoluted cleanup logic across the kernel.

An alternative pull request (vfs-6.19-rc1.fd_prepare.fs) is available
that contains only the simpler filesystem-focused conversions in case
you'd like to pull something more conservative.

Note this branch also contains two reverts for the KVM FD_PREPARE()
conversions as the KVM maintainers have indicated they would like to
take those changes through the KVM tree in the next cycle. This also
gets rid of a merge conflict. I chose a revert so as not to rebase the
branch unnecessarily so close to the merge window.

This adds the FD_ADD() and FD_PREPARE() primitives. They simplify the
common pattern of get_unused_fd_flags() + create file + fd_install()
that is used extensively throughout the kernel and currently requires
cumbersome cleanup paths.

FD_ADD() - For simple cases where a file is installed immediately:

  fd = FD_ADD(O_CLOEXEC, vfio_device_open_file(device));
  if (fd < 0)
          vfio_device_put_registration(device);
  return fd;

FD_PREPARE() - For cases requiring access to the fd or file, or
additional work before publishing:

  FD_PREPARE(fdf, O_CLOEXEC, sync_file->file);
  if (fdf.err) {
          fput(sync_file->file);
          return fdf.err;
  }

  data.fence = fd_prepare_fd(fdf);
  if (copy_to_user((void __user *)arg, &data, sizeof(data)))
          return -EFAULT;

  return fd_publish(fdf);

The primitives are centered around struct fd_prepare. FD_PREPARE()
encapsulates all allocation and cleanup logic and must be followed by a
call to fd_publish() which associates the fd with the file and installs
it into the caller's fdtable. If fd_publish() isn't called, both are
deallocated automatically. FD_ADD() is a shorthand that does
fd_publish() immediately and never exposes the struct to the caller.

I've implemented this in a way that it's compatible with the cleanup
infrastructure while also being usable separately. IOW, it's centered
around struct fd_prepare which is aliased to class_fd_prepare_t and so
we can make use of all the basic guard infrastructure.
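
The prepare/publish lifecycle described above can be modelled in
userspace with the compiler's cleanup attribute (a GCC/Clang
extension). This is only a sketch under stated assumptions: every
toy_* name is hypothetical, the fd table is a plain array, and a
reference count stands in for fput(). It is not the code in
include/linux/file.h.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's definitions. */
struct toy_file { int refs; };

#define TOY_NFDS 8
struct toy_file *toy_fdtable[TOY_NFDS];	/* installed files */
int toy_reserved[TOY_NFDS];		/* reserved, not yet installed */

int toy_get_unused_fd(void)
{
	for (int fd = 0; fd < TOY_NFDS; fd++) {
		if (!toy_fdtable[fd] && !toy_reserved[fd]) {
			toy_reserved[fd] = 1;
			return fd;
		}
	}
	return -EMFILE;
}

struct toy_fd_prepare {
	int err;
	int fd;
	struct toy_file *file;
};

/* Reserve an fd for a file; on failure the caller keeps the file. */
struct toy_fd_prepare toy_prepare(struct toy_file *file)
{
	struct toy_fd_prepare fdf = { .err = 0, .fd = -1, .file = NULL };
	int fd = toy_get_unused_fd();

	if (fd < 0) {
		fdf.err = fd;
	} else {
		fdf.fd = fd;
		fdf.file = file;
	}
	return fdf;
}

/* Scope-exit handler: undo whatever was not published. */
void toy_fd_prepare_cleanup(struct toy_fd_prepare *fdf)
{
	if (fdf->fd >= 0)
		toy_reserved[fdf->fd] = 0;
	if (fdf->file)
		fdf->file->refs--;
}

/* Install the file and disarm the cleanup handler. */
int toy_publish(struct toy_fd_prepare *fdf)
{
	int fd = fdf->fd;

	toy_reserved[fd] = 0;
	toy_fdtable[fd] = fdf->file;
	fdf->fd = -1;
	fdf->file = NULL;
	return fd;
}

/* Returns the new fd or a negative errno. */
int toy_create_fd(struct toy_file *file, int fail_before_publish)
{
	struct toy_fd_prepare fdf
		__attribute__((cleanup(toy_fd_prepare_cleanup))) = toy_prepare(file);

	if (fdf.err)
		return fdf.err;		/* caller still owns the file */
	if (fail_before_publish)
		return -EFAULT;		/* cleanup drops fd and file */
	return toy_publish(&fdf);
}
```

The early return in toy_create_fd() needs no explicit unwinding, which
is the property the cleanup-based design is after: the handler releases
the reserved fd and the file reference unless toy_publish() has already
taken them over.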

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline or other vfs branches
===================================================

diff --cc include/linux/cleanup.h
index 19c7e475d3a4,361104bcfe92..b8bd2f15f91f
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@@ -290,16 -294,18 +294,19 @@@ static inline class_##_name##_t class_#
  	class_##_name##_t var __cleanup(class_##_name##_destructor) =	\
  		class_##_name##_constructor
  
+ #define CLASS_INIT(_name, _var, _init_expr)                             \
+         class_##_name##_t _var __cleanup(class_##_name##_destructor) = (_init_expr)
+ 
 -#define scoped_class(_name, var, args)                          \
 -	for (CLASS(_name, var)(args);                           \
 -	     __guard_ptr(_name)(&var) || !__is_cond_ptr(_name); \
 -	     ({ goto _label; }))                                \
 -		if (0) {                                        \
 -_label:                                                         \
 -			break;                                  \
 +#define __scoped_class(_name, var, _label, args...)        \
 +	for (CLASS(_name, var)(args); ; ({ goto _label; })) \
 +		if (0) {                                   \
 +_label:                                                    \
 +			break;                             \
  		} else
  
 +#define scoped_class(_name, var, args...) \
 +	__scoped_class(_name, var, __UNIQUE_ID(label), args)
 +
  /*
   * DEFINE_GUARD(name, type, lock, unlock):
   *	trivial wrapper around DEFINE_CLASS() above specifically
diff --cc ipc/mqueue.c
index 83d9466710d6,d3a588d0dcf6..c118ca2c377a
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@@ -892,15 -892,36 +892,34 @@@ static int prepare_open(struct dentry *
  	return inode_permission(&nop_mnt_idmap, d_inode(dentry), acc);
  }
  
+ static struct file *mqueue_file_open(struct filename *name,
+ 				     struct vfsmount *mnt, int oflag, bool ro,
+ 				     umode_t mode, struct mq_attr *attr)
+ {
 -	struct path path __free(path_put) = {};
+ 	struct dentry *dentry;
++	struct file *file;
+ 	int ret;
+ 
 -	dentry = lookup_noperm(&QSTR(name->name), mnt->mnt_root);
++	dentry = start_creating_noperm(mnt->mnt_root, &QSTR(name->name));
+ 	if (IS_ERR(dentry))
+ 		return ERR_CAST(dentry);
+ 
 -	path.dentry = dentry;
 -	path.mnt = mntget(mnt);
 -
 -	ret = prepare_open(path.dentry, oflag, ro, mode, name, attr);
++	ret = prepare_open(dentry, oflag, ro, mode, name, attr);
+ 	if (ret)
 -		return ERR_PTR(ret);
 -
 -	return dentry_open(&path, oflag, current_cred());
++		file = ERR_PTR(ret);
++	else
++		file = dentry_open(&(const struct path){ .mnt = mnt, .dentry = dentry },
++				   oflag, current_cred());
++	end_creating(dentry);
++	return file;
+ }
+ 
  static int do_mq_open(const char __user *u_name, int oflag, umode_t mode,
  		      struct mq_attr *attr)
  {
+ 	struct filename *name __free(putname) = NULL;;
  	struct vfsmount *mnt = current->nsproxy->ipc_ns->mq_mnt;
--	struct dentry *root = mnt->mnt_root;
- 	struct filename *name;
- 	struct path path;
- 	int fd, error;
 -	int fd;
--	int ro;
++	int fd, ro;
  
  	audit_mq_open(oflag, mode, attr);
  
@@@ -908,35 -929,12 +927,10 @@@
  	if (IS_ERR(name))
  		return PTR_ERR(name);
  
- 	fd = get_unused_fd_flags(O_CLOEXEC);
- 	if (fd < 0)
- 		goto out_putname;
- 
  	ro = mnt_want_write(mnt);	/* we'll drop it in any case */
- 	path.dentry = start_creating_noperm(root, &QSTR(name->name));
- 	if (IS_ERR(path.dentry)) {
- 		error = PTR_ERR(path.dentry);
- 		goto out_putfd;
- 	}
- 	path.mnt = mnt;
- 	error = prepare_open(path.dentry, oflag, ro, mode, name, attr);
- 	if (!error) {
- 		struct file *file = dentry_open(&path, oflag, current_cred());
- 		if (!IS_ERR(file))
- 			fd_install(fd, file);
- 		else
- 			error = PTR_ERR(file);
- 	}
- out_putfd:
- 	if (error) {
- 		put_unused_fd(fd);
- 		fd = error;
- 	}
- 	end_creating(path.dentry);
 -	inode_lock(d_inode(root));
+ 	fd = FD_ADD(O_CLOEXEC, mqueue_file_open(name, mnt, oflag, ro, mode, attr));
 -	inode_unlock(d_inode(root));
  	if (!ro)
  		mnt_drop_write(mnt);
- out_putname:
- 	putname(name);
  	return fd;
  }
  

Merge conflicts with other trees
================================

[1]: https://lore.kernel.org/linux-next/20251125122934.36f75838@canb.auug.org.au

[2]: https://lore.kernel.org/linux-next/20251125171130.67ba74e1@canb.auug.org.au

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.fd_prepare

for you to fetch changes up to 65c2c221846eeb157ab7cecf5a26f24d42faafcc:

  Revert "kvm: FD_PREPARE() conversions" (2025-11-28 11:23:08 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.fd_prepare tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.fd_prepare

----------------------------------------------------------------
Christian Brauner (57):
      file: add FD_{ADD,PREPARE}()
      anon_inodes: convert to FD_ADD()
      eventfd: convert do_eventfd() to FD_PREPARE()
      fhandle: convert do_handle_open() to FD_ADD()
      namespace: convert open_tree() to FD_ADD()
      namespace: convert open_tree_attr() to FD_PREPARE()
      namespace: convert fsmount() to FD_PREPARE()
      fanotify: convert fanotify_init() to FD_PREPARE()
      nsfs: convert open_namespace() to FD_PREPARE()
      nsfs: convert ns_ioctl() to FD_PREPARE()
      autofs: convert autofs_dev_ioctl_open_mountpoint() to FD_ADD()
      eventpoll: convert do_epoll_create() to FD_PREPARE()
      open: convert do_sys_openat2() to FD_ADD()
      signalfd: convert do_signalfd4() to FD_ADD()
      timerfd: convert timerfd_create() to FD_ADD()
      userfaultfd: convert new_userfaultfd() to FD_PREPARE()
      xfs: convert xfs_open_by_handle() to FD_PREPARE()
      dma: convert dma_buf_fd() to FD_ADD()
      af_unix: convert unix_file_open() to FD_ADD()
      dma: convert sync_file_ioctl_merge() to FD_PREPARE()
      exec: convert begin_new_exec() to FD_PREPARE()
      ipc: convert do_mq_open() to FD_ADD()
      bpf: convert bpf_iter_new_fd() to FD_PREPARE()
      bpf: convert bpf_token_create() to FD_PREPARE()
      memfd: convert memfd_create() to FD_ADD()
      secretmem: convert memfd_secret() to FD_ADD()
      net/handshake: convert handshake_nl_accept_doit() to FD_PREPARE()
      net/kcm: convert kcm_ioctl() to FD_PREPARE()
      net/sctp: convert sctp_getsockopt_peeloff_common() to FD_PREPARE()
      net/socket: convert sock_map_fd() to FD_ADD()
      net/socket: convert __sys_accept4_file() to FD_ADD()
      spufs: convert spufs_context_open() to FD_PREPARE()
      papr-hvpipe: convert papr_hvpipe_dev_create_handle() to FD_PREPARE()
      spufs: convert spufs_gang_open() to FD_PREPARE()
      pseries: convert papr_platform_dump_create_handle() to FD_ADD()
      pseries: port papr_rtas_setup_file_interface() to FD_ADD()
      dma: port sw_sync_ioctl_create_fence() to FD_PREPARE()
      gpio: convert linehandle_create() to FD_PREPARE()
      hv: convert mshv_ioctl_create_partition() to FD_ADD()
      media: convert media_request_alloc() to FD_PREPARE()
      ntsync: convert ntsync_obj_get_fd() to FD_PREPARE()
      tty: convert ptm_open_peer() to FD_ADD()
      vfio: convert vfio_group_ioctl_get_device_fd() to FD_ADD()
      file: convert replace_fd() to FD_PREPARE()
      io_uring: convert io_create_mock_file() to FD_PREPARE()
      kvm: convert kvm_arch_supports_gmem_init_shared() to FD_PREPARE()
      kvm: convert kvm_vcpu_ioctl_get_stats_fd() to FD_PREPARE()
      Merge patch series "file: FD_{ADD,PREPARE}()"
      ipc: preserve original file opening pattern
      devpts: preserve original file opening pattern
      dma: return zero after fd_publish()
      exec: switch to FD_ADD()
      handshake: return zero after fd_publish()
      ntsync: only install fd on success
      io_uring: return zero after fd_publish()
      file: make struct fd_prepare a first-class citizen
      Revert "kvm: FD_PREPARE() conversions"

Deepanshu Kartikey (1):
      namespace: fix mntput of ERR_PTR in fsmount error path

Kuniyuki Iwashima (1):
      fanotify: Don't call fsnotify_destroy_group() when fsnotify_alloc_group() fails.

 arch/powerpc/platforms/cell/spufs/inode.c          |  42 ++-----
 arch/powerpc/platforms/pseries/papr-hvpipe.c       |  39 ++-----
 .../powerpc/platforms/pseries/papr-platform-dump.c |  30 ++---
 arch/powerpc/platforms/pseries/papr-rtas-common.c  |  27 +----
 drivers/dma-buf/dma-buf.c                          |  10 +-
 drivers/dma-buf/sw_sync.c                          |  39 +++----
 drivers/dma-buf/sync_file.c                        |  53 +++------
 drivers/gpio/gpiolib-cdev.c                        |  66 ++++-------
 drivers/hv/mshv_root_main.c                        |  30 +----
 drivers/media/mc/mc-request.c                      |  34 ++----
 drivers/misc/ntsync.c                              |  21 +---
 drivers/tty/pty.c                                  |  51 +++------
 drivers/vfio/group.c                               |  28 +----
 fs/anon_inodes.c                                   |  23 +---
 fs/autofs/dev-ioctl.c                              |  30 +----
 fs/eventfd.c                                       |  31 ++---
 fs/eventpoll.c                                     |  32 ++----
 fs/exec.c                                          |   3 +-
 fs/fhandle.c                                       |  30 +++--
 fs/file.c                                          |  19 ++--
 fs/namespace.c                                     | 103 ++++++-----------
 fs/notify/fanotify/fanotify_user.c                 |  60 ++++------
 fs/nsfs.c                                          |  47 +++-----
 fs/open.c                                          |  17 +--
 fs/signalfd.c                                      |  29 ++---
 fs/timerfd.c                                       |  29 ++---
 fs/userfaultfd.c                                   |  30 ++---
 fs/xfs/xfs_handle.c                                |  56 +++------
 include/linux/cleanup.h                            |   7 ++
 include/linux/file.h                               | 126 +++++++++++++++++++++
 io_uring/mock_file.c                               |  43 +++----
 ipc/mqueue.c                                       |  54 ++++-----
 kernel/bpf/bpf_iter.c                              |  29 ++---
 kernel/bpf/token.c                                 |  47 +++-----
 mm/memfd.c                                         |  29 +----
 mm/secretmem.c                                     |  20 +---
 net/handshake/netlink.c                            |  38 +++----
 net/kcm/kcmsock.c                                  |  22 ++--
 net/sctp/socket.c                                  |  90 ++++-----------
 net/socket.c                                       |  34 +-----
 net/unix/af_unix.c                                 |  16 +--
 41 files changed, 564 insertions(+), 1000 deletions(-)

* [GIT PULL 17/17 for v6.19] vfs fd prepare minimal
  2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
                   ` (15 preceding siblings ...)
  2025-11-28 16:48 ` [GIT PULL 16/17 for v6.19] vfs fd prepare Christian Brauner
@ 2025-11-28 16:48 ` Christian Brauner
  2025-12-02  1:35   ` Linus Torvalds
  2025-12-02  3:19   ` pr-tracker-bot
  16 siblings, 2 replies; 44+ messages in thread
From: Christian Brauner @ 2025-11-28 16:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This is an alternative pull request for the FD_{ADD,PREPARE}() work containing
only parts of the conversion. Again, this work came late in the cycle but the
series is quite nice and worth doing. It removes roughly double the code that
it adds and eliminates a lot of convoluted cleanup logic across the kernel.

This adds the FD_ADD() and FD_PREPARE() primitives. They simplify the
common pattern of get_unused_fd_flags() + create file + fd_install()
that is used extensively throughout the kernel and currently requires
cumbersome cleanup paths.

FD_ADD() - For simple cases where a file is installed immediately:

  fd = FD_ADD(O_CLOEXEC, vfio_device_open_file(device));
  if (fd < 0)
          vfio_device_put_registration(device);
  return fd;

FD_PREPARE() - For cases requiring access to the fd or file, or
additional work before publishing:

  FD_PREPARE(fdf, O_CLOEXEC, sync_file->file);
  if (fdf.err) {
          fput(sync_file->file);
          return fdf.err;
  }

  data.fence = fd_prepare_fd(fdf);
  if (copy_to_user((void __user *)arg, &data, sizeof(data)))
          return -EFAULT;

  return fd_publish(fdf);

The primitives are centered around struct fd_prepare. FD_PREPARE()
encapsulates all allocation and cleanup logic and must be followed by a
call to fd_publish() which associates the fd with the file and installs
it into the caller's fdtable. If fd_publish() isn't called, both are
deallocated automatically. FD_ADD() is a shorthand that does
fd_publish() immediately and never exposes the struct to the caller.

I've implemented this in a way that it's compatible with the cleanup
infrastructure while also being usable separately. IOW, it's centered
around struct fd_prepare which is aliased to class_fd_prepare_t and so
we can make use of all the basic guard infrastructure.
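
The FD_ADD() shorthand takes an expression that yields either a file or
an encoded error, as in the vfio example above. A userspace sketch of
that calling convention follows. All toy_* names are hypothetical, the
error-pointer helpers merely imitate the kernel's encoding, and file
ownership on failure is not modelled.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct toy_file { int id; };

/* Userspace imitation of error pointers: the top 4095 addresses are errors. */
#define TOY_MAX_ERRNO 4095

struct toy_file *toy_err_ptr(long err)
{
	return (struct toy_file *)(intptr_t)err;
}

int toy_is_err(const struct toy_file *p)
{
	return (uintptr_t)p >= (uintptr_t)(-TOY_MAX_ERRNO);
}

long toy_ptr_err(const struct toy_file *p)
{
	return (long)(intptr_t)p;
}

#define TOY_NFDS 4
struct toy_file *toy_fdtable[TOY_NFDS];

/*
 * One-shot helper: turn "file or error" into "fd or error". The caller
 * never sees an intermediate fd that it would have to release.
 */
int toy_fd_add(struct toy_file *file)
{
	if (toy_is_err(file))
		return (int)toy_ptr_err(file);	/* in this model no fd is used */

	for (int fd = 0; fd < TOY_NFDS; fd++) {
		if (!toy_fdtable[fd]) {
			toy_fdtable[fd] = file;
			return fd;
		}
	}
	return -EMFILE;
}

/* Hypothetical opener that either hands back the file or an error. */
struct toy_file *toy_open(struct toy_file *backing, int fail)
{
	return fail ? toy_err_ptr(-ENOMEM) : backing;
}
```

A call such as toy_fd_add(toy_open(&f, 0)) mirrors the shape of the
FD_ADD() call sites: the caller checks a single return value, and any
additional unwinding (like vfio_device_put_registration() above) is
limited to resources the caller acquired itself.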

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3+b1)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline or other vfs branches
===================================================

diff --cc include/linux/cleanup.h
index 19c7e475d3a4,361104bcfe92..b8bd2f15f91f
--- a/include/linux/cleanup.h
+++ b/include/linux/cleanup.h
@@@ -290,16 -294,18 +294,19 @@@ static inline class_##_name##_t class_#
  	class_##_name##_t var __cleanup(class_##_name##_destructor) =	\
  		class_##_name##_constructor
  
+ #define CLASS_INIT(_name, _var, _init_expr)                             \
+         class_##_name##_t _var __cleanup(class_##_name##_destructor) = (_init_expr)
+ 
 -#define scoped_class(_name, var, args)                          \
 -	for (CLASS(_name, var)(args);                           \
 -	     __guard_ptr(_name)(&var) || !__is_cond_ptr(_name); \
 -	     ({ goto _label; }))                                \
 -		if (0) {                                        \
 -_label:                                                         \
 -			break;                                  \
 +#define __scoped_class(_name, var, _label, args...)        \
 +	for (CLASS(_name, var)(args); ; ({ goto _label; })) \
 +		if (0) {                                   \
 +_label:                                                    \
 +			break;                             \
  		} else
  
 +#define scoped_class(_name, var, args...) \
 +	__scoped_class(_name, var, __UNIQUE_ID(label), args)
 +
  /*
   * DEFINE_GUARD(name, type, lock, unlock):
   *	trivial wrapper around DEFINE_CLASS() above specifically
diff --cc ipc/mqueue.c
index 83d9466710d6,d3a588d0dcf6..c118ca2c377a
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@@ -892,15 -892,36 +892,34 @@@ static int prepare_open(struct dentry *
  	return inode_permission(&nop_mnt_idmap, d_inode(dentry), acc);
  }
  
+ static struct file *mqueue_file_open(struct filename *name,
+ 				     struct vfsmount *mnt, int oflag, bool ro,
+ 				     umode_t mode, struct mq_attr *attr)
+ {
 -	struct path path __free(path_put) = {};
+ 	struct dentry *dentry;
++	struct file *file;
+ 	int ret;
+ 
 -	dentry = lookup_noperm(&QSTR(name->name), mnt->mnt_root);
++	dentry = start_creating_noperm(mnt->mnt_root, &QSTR(name->name));
+ 	if (IS_ERR(dentry))
+ 		return ERR_CAST(dentry);
+ 
 -	path.dentry = dentry;
 -	path.mnt = mntget(mnt);
 -
 -	ret = prepare_open(path.dentry, oflag, ro, mode, name, attr);
++	ret = prepare_open(dentry, oflag, ro, mode, name, attr);
+ 	if (ret)
 -		return ERR_PTR(ret);
 -
 -	return dentry_open(&path, oflag, current_cred());
++		file = ERR_PTR(ret);
++	else
++		file = dentry_open(&(const struct path){ .mnt = mnt, .dentry = dentry },
++				   oflag, current_cred());
++	end_creating(dentry);
++	return file;
+ }
+ 
  static int do_mq_open(const char __user *u_name, int oflag, umode_t mode,
  		      struct mq_attr *attr)
  {
+ 	struct filename *name __free(putname) = NULL;;
  	struct vfsmount *mnt = current->nsproxy->ipc_ns->mq_mnt;
--	struct dentry *root = mnt->mnt_root;
- 	struct filename *name;
- 	struct path path;
- 	int fd, error;
 -	int fd;
--	int ro;
++	int fd, ro;
  
  	audit_mq_open(oflag, mode, attr);
  
@@@ -908,35 -929,12 +927,10 @@@
  	if (IS_ERR(name))
  		return PTR_ERR(name);
  
- 	fd = get_unused_fd_flags(O_CLOEXEC);
- 	if (fd < 0)
- 		goto out_putname;
- 
  	ro = mnt_want_write(mnt);	/* we'll drop it in any case */
- 	path.dentry = start_creating_noperm(root, &QSTR(name->name));
- 	if (IS_ERR(path.dentry)) {
- 		error = PTR_ERR(path.dentry);
- 		goto out_putfd;
- 	}
- 	path.mnt = mnt;
- 	error = prepare_open(path.dentry, oflag, ro, mode, name, attr);
- 	if (!error) {
- 		struct file *file = dentry_open(&path, oflag, current_cred());
- 		if (!IS_ERR(file))
- 			fd_install(fd, file);
- 		else
- 			error = PTR_ERR(file);
- 	}
- out_putfd:
- 	if (error) {
- 		put_unused_fd(fd);
- 		fd = error;
- 	}
- 	end_creating(path.dentry);
 -	inode_lock(d_inode(root));
+ 	fd = FD_ADD(O_CLOEXEC, mqueue_file_open(name, mnt, oflag, ro, mode, attr));
 -	inode_unlock(d_inode(root));
  	if (!ro)
  		mnt_drop_write(mnt);
- out_putname:
- 	putname(name);
  	return fd;
  }
  

Merge conflicts with other trees
================================

[1]: https://lore.kernel.org/linux-next/20251125122934.36f75838@canb.auug.org.au

[2]: https://lore.kernel.org/linux-next/20251125171130.67ba74e1@canb.auug.org.au

The following changes since commit 3a8660878839faadb4f1a6dd72c3179c1df56787:

  Linux 6.18-rc1 (2025-10-12 13:42:36 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.fd_prepare.fs

for you to fetch changes up to 0512bf9701f339c8fee2cc82b6fc35f0a8f6be7a:

  Merge patch series "file: FD_{ADD,PREPARE}()" (2025-11-28 12:42:36 +0100)

Please consider pulling these changes from the signed vfs-6.19-rc1.fd_prepare.fs tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.19-rc1.fd_prepare.fs

----------------------------------------------------------------
Christian Brauner (43):
      file: add FD_{ADD,PREPARE}()
      anon_inodes: convert to FD_ADD()
      eventfd: convert do_eventfd() to FD_PREPARE()
      fhandle: convert do_handle_open() to FD_ADD()
      namespace: convert open_tree() to FD_ADD()
      namespace: convert open_tree_attr() to FD_PREPARE()
      namespace: convert fsmount() to FD_PREPARE()
      fanotify: convert fanotify_init() to FD_PREPARE()
      nsfs: convert open_namespace() to FD_PREPARE()
      nsfs: convert ns_ioctl() to FD_PREPARE()
      autofs: convert autofs_dev_ioctl_open_mountpoint() to FD_ADD()
      eventpoll: convert do_epoll_create() to FD_PREPARE()
      open: convert do_sys_openat2() to FD_ADD()
      signalfd: convert do_signalfd4() to FD_ADD()
      timerfd: convert timerfd_create() to FD_ADD()
      userfaultfd: convert new_userfaultfd() to FD_PREPARE()
      xfs: convert xfs_open_by_handle() to FD_PREPARE()
      dma: convert dma_buf_fd() to FD_ADD()
      af_unix: convert unix_file_open() to FD_ADD()
      exec: convert begin_new_exec() to FD_ADD()
      ipc: convert do_mq_open() to FD_ADD()
      bpf: convert bpf_iter_new_fd() to FD_PREPARE()
      bpf: convert bpf_token_create() to FD_PREPARE()
      memfd: convert memfd_create() to FD_ADD()
      secretmem: convert memfd_secret() to FD_ADD()
      net/handshake: convert handshake_nl_accept_doit() to FD_PREPARE()
      net/kcm: convert kcm_ioctl() to FD_PREPARE()
      net/socket: convert sock_map_fd() to FD_ADD()
      net/socket: convert __sys_accept4_file() to FD_ADD()
      spufs: convert spufs_context_open() to FD_PREPARE()
      papr-hvpipe: convert papr_hvpipe_dev_create_handle() to FD_PREPARE()
      spufs: convert spufs_gang_open() to FD_PREPARE()
      pseries: convert papr_platform_dump_create_handle() to FD_ADD()
      pseries: port papr_rtas_setup_file_interface() to FD_ADD()
      gpio: convert linehandle_create() to FD_PREPARE()
      hv: convert mshv_ioctl_create_partition() to FD_ADD()
      media: convert media_request_alloc() to FD_PREPARE()
      ntsync: convert ntsync_obj_get_fd() to FD_PREPARE()
      tty: convert ptm_open_peer() to FD_ADD()
      vfio: convert vfio_group_ioctl_get_device_fd() to FD_ADD()
      file: convert replace_fd() to FD_PREPARE()
      io_uring: convert io_create_mock_file() to FD_PREPARE()
      Merge patch series "file: FD_{ADD,PREPARE}()"

 arch/powerpc/platforms/cell/spufs/inode.c          |  42 ++-----
 arch/powerpc/platforms/pseries/papr-hvpipe.c       |  39 ++-----
 .../powerpc/platforms/pseries/papr-platform-dump.c |  30 ++---
 arch/powerpc/platforms/pseries/papr-rtas-common.c  |  27 +----
 drivers/dma-buf/dma-buf.c                          |  10 +-
 drivers/gpio/gpiolib-cdev.c                        |  66 ++++-------
 drivers/hv/mshv_root_main.c                        |  30 +----
 drivers/media/mc/mc-request.c                      |  34 ++----
 drivers/misc/ntsync.c                              |  21 +---
 drivers/tty/pty.c                                  |  51 +++------
 drivers/vfio/group.c                               |  28 +----
 fs/anon_inodes.c                                   |  23 +---
 fs/autofs/dev-ioctl.c                              |  30 +----
 fs/eventfd.c                                       |  31 ++---
 fs/eventpoll.c                                     |  32 ++----
 fs/exec.c                                          |   3 +-
 fs/fhandle.c                                       |  30 +++--
 fs/file.c                                          |  19 ++--
 fs/namespace.c                                     | 103 ++++++-----------
 fs/notify/fanotify/fanotify_user.c                 |  60 ++++------
 fs/nsfs.c                                          |  47 +++-----
 fs/open.c                                          |  17 +--
 fs/signalfd.c                                      |  29 ++---
 fs/timerfd.c                                       |  29 ++---
 fs/userfaultfd.c                                   |  30 ++---
 fs/xfs/xfs_handle.c                                |  56 +++------
 include/linux/cleanup.h                            |   7 ++
 include/linux/file.h                               | 126 +++++++++++++++++++++
 io_uring/mock_file.c                               |  43 +++----
 ipc/mqueue.c                                       |  54 ++++-----
 kernel/bpf/bpf_iter.c                              |  29 ++---
 kernel/bpf/token.c                                 |  47 +++-----
 mm/memfd.c                                         |  29 +----
 mm/secretmem.c                                     |  20 +---
 net/handshake/netlink.c                            |  38 +++----
 net/kcm/kcmsock.c                                  |  22 ++--
 net/socket.c                                       |  34 +-----
 net/unix/af_unix.c                                 |  16 +--
 38 files changed, 508 insertions(+), 874 deletions(-)

* Re: [GIT PULL 16/17 for v6.19] vfs fd prepare
  2025-11-28 16:48 ` [GIT PULL 16/17 for v6.19] vfs fd prepare Christian Brauner
@ 2025-12-01 14:15   ` Al Viro
  2025-12-01 18:41     ` Sean Christopherson
  0 siblings, 1 reply; 44+ messages in thread
From: Al Viro @ 2025-12-01 14:15 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Linus Torvalds, linux-fsdevel, linux-kernel

On Fri, Nov 28, 2025 at 05:48:27PM +0100, Christian Brauner wrote:
> Hey Linus,
> 
> /* Summary */
> Note: This work came late in the cycle but the series is quite nice and
> worth doing. It removes roughly double the code that it adds and
> eliminates a lot of convoluted cleanup logic across the kernel.
> 
> An alternative pull request (vfs-6.19-rc1.fd_prepare.fs) is available
> that contains only the more simple filesystem-focused conversions in
> case you'd like to pull something more conservative.
> 
> Note this branch also contains two reverts for the KVM FD_PREPARE()
> conversions as the KVM maintainers have indicated they would like to
> take those changes through the KVM tree in the next cycle. Also gets rid
> of a merge conflict. I chose a revert to not rebase the branch
> unnecessarily so close to the merge window.

Frankly, that hadn't gotten anywhere near enough exposure in -next and
it's far too large and invasive.  The same lack of exposure goes for
the alternative branch.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 16/17 for v6.19] vfs fd prepare
  2025-12-01 14:15   ` Al Viro
@ 2025-12-01 18:41     ` Sean Christopherson
  0 siblings, 0 replies; 44+ messages in thread
From: Sean Christopherson @ 2025-12-01 18:41 UTC (permalink / raw)
  To: Al Viro; +Cc: Christian Brauner, Linus Torvalds, linux-fsdevel, linux-kernel

On Mon, Dec 01, 2025, Al Viro wrote:
> On Fri, Nov 28, 2025 at 05:48:27PM +0100, Christian Brauner wrote:
> > Hey Linus,
> > 
> > /* Summary */
> > Note: This work came late in the cycle but the series is quite nice and
> > worth doing. It removes roughly double the code that it adds and
> > eliminates a lot of convoluted cleanup logic across the kernel.
> > 
> > An alternative pull request (vfs-6.19-rc1.fd_prepare.fs) is available
> > that contains only the more simple filesystem-focused conversions in
> > case you'd like to pull something more conservative.
> > 
> > Note this branch also contains two reverts for the KVM FD_PREPARE()
> > conversions as the KVM maintainers have indicated they would like to
> > take those changes through the KVM tree in the next cycle. Also gets rid
> > of a merge conflict. I chose a revert to not rebase the branch
> > unnecessarily so close to the merge window.
> 
> Frankly, that hadn't gotten anywhere near enough exposure in -next and
> it's far too large and invasive.

+1.  Saying that I want to take the KVM changes through the KVM tree is
technically true, but glosses over why I objected (or even noticed) in the first
place.

https://lore.kernel.org/all/20251125155455.31c53cf9@canb.auug.org.au

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 05/17 for v6.19] namespaces
  2025-11-28 16:48 ` [GIT PULL 05/17 for v6.19] namespaces Christian Brauner
@ 2025-12-01 19:06   ` Eric W. Biederman
  2025-12-02 17:00     ` Linus Torvalds
  2025-12-01 22:08   ` pr-tracker-bot
  1 sibling, 1 reply; 44+ messages in thread
From: Eric W. Biederman @ 2025-12-01 19:06 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, linux-fsdevel, linux-kernel, Linux Containers

Christian Brauner <brauner@kernel.org> writes:

> Hey Linus,
>
> /* Summary */
> This contains substantial namespace infrastructure changes including a new
> system call, active reference counting, and extensive header cleanups.
> The branch depends on the shared kbuild branch for -fms-extensions
> support.

I am missing something.  From the description it looks like
you are making nested containers impossible once this feature
is adopted.  Because the container will be able to see all of
the other namespaces and thus to see outside of its own namespace.

The reason such a system call has not been introduced in the past
is because it introduces the namespace of namespace problem.

How have you solved the namespace of namespaces problem?

If you want nesting of containers the listing of namespaces very
much must be incomplete.

I haven't reviewed or looked at the code yet because
the code was not posted in any of the usual places for container
development, nor was I copied.

Can you please describe how you are avoiding the namespace of namespaces
problem?


Eric

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 08/17 for v6.19] cred guards
  2025-11-28 16:48 ` [GIT PULL 08/17 for v6.19] cred guards Christian Brauner
@ 2025-12-01 21:53   ` Linus Torvalds
  2025-12-02  1:26     ` Sasha Levin
  2025-12-01 22:08   ` [GIT PULL 08/17 for v6.19] cred guards pr-tracker-bot
  1 sibling, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2025-12-01 21:53 UTC (permalink / raw)
  To: Christian Brauner, Mike Snitzer; +Cc: linux-fsdevel, linux-kernel

On Fri, 28 Nov 2025 at 08:51, Christian Brauner <brauner@kernel.org> wrote:
>
> Merge conflicts with mainline
>
> diff --cc fs/nfs/localio.c

So I ended up merging this very differently from how you did it.

I just wrapped 'nfs_local_call_read()' for the cred guarding the same
way the 'nfs_local_call_write()' side had been done.

That made it much easier to see that the changes by Mike were carried
over, and seems cleaner anyway.

But it would be good if people double-checked my change. It looks
"ObviouslyCorrect(tm)" to me, but...

                  Linus
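
As a rough illustration of the wrapping described above, here is a minimal
userspace sketch of a scoped credential guard. It assumes a GCC/Clang
compiler (for the cleanup attribute). Every toy_* name is hypothetical and
stands in for the kernel objects; this is not the kernel's
scoped_with_creds() implementation, only the general shape of "switch
creds around one helper call, restore automatically afterwards".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's credential object. */
struct toy_cred { int uid; };

/* Models the credentials currently in effect for the running task. */
static const struct toy_cred *toy_current_cred = NULL;

struct toy_cred_guard {
	const struct toy_cred *old; /* value to restore on scope exit */
	int done;                   /* makes the for-loop body run once */
};

static struct toy_cred_guard toy_override(const struct toy_cred *new_cred)
{
	struct toy_cred_guard g = { .old = toy_current_cred, .done = 0 };

	toy_current_cred = new_cred;
	return g;
}

static void toy_revert(struct toy_cred_guard *g)
{
	assert(g);
	toy_current_cred = g->old;
}

/*
 * Run the following statement with @cred installed. The cleanup
 * attribute restores the previous value when the loop variable goes
 * out of scope, including on an early return from the body.
 */
#define toy_scoped_with_creds(cred)                                 \
	for (struct toy_cred_guard _g                               \
		     __attribute__((cleanup(toy_revert))) =         \
		     toy_override(cred);                            \
	     !_g.done; _g.done = 1)

/* Helper that does the actual work and observes the active creds. */
static int toy_do_read(void)
{
	return toy_current_cred ? toy_current_cred->uid : -1;
}

/* Wrapper shape: switch creds only around the helper call. */
static int toy_call_read(const struct toy_cred *file_cred)
{
	int seen = -1;

	toy_scoped_with_creds(file_cred)
		seen = toy_do_read();

	return seen;
}
```

Calling toy_call_read() with a file's credentials runs toy_do_read() under
them and leaves toy_current_cred as it was before the call, which is the
property the read and write wrappers are meant to share.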

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 01/17 for v6.19] vfs iomap
  2025-11-28 16:48 ` [GIT PULL 01/17 for v6.19] vfs iomap Christian Brauner
@ 2025-12-01 22:08   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 22:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:12 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.iomap

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/1885cdbfbb51ede3637166c895d0b8040c9899cc

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 02/17 for v6.19] vfs misc
  2025-11-28 16:48 ` [GIT PULL 02/17 for v6.19] vfs misc Christian Brauner
@ 2025-12-01 22:08   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 22:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:13 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.misc

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b04b2e7a61830cabd00c6f95308a8e2f5d82fa52

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 03/17 for v6.19] vfs inode
  2025-11-28 16:48 ` [GIT PULL 03/17 for v6.19] vfs inode Christian Brauner
@ 2025-12-01 22:08   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 22:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:14 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.inode

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/9368f0f9419cde028a6e58331065900ff089bc36

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 05/17 for v6.19] namespaces
  2025-11-28 16:48 ` [GIT PULL 05/17 for v6.19] namespaces Christian Brauner
  2025-12-01 19:06   ` Eric W. Biederman
@ 2025-12-01 22:08   ` pr-tracker-bot
  1 sibling, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 22:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:16 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/namespace-6.19-rc1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/415d34b92c1f921a9ff3c38f56319cbc5536f642

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 04/17 for v6.19] vfs writeback
  2025-11-28 16:48 ` [GIT PULL 04/17 for v6.19] vfs writeback Christian Brauner
@ 2025-12-01 22:08   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 22:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:15 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.writeback

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/ebaeabfa5ab711a9b69b686d58329e258fdae75f

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 06/17 for v6.19] vfs coredump
  2025-11-28 16:48 ` [GIT PULL 06/17 for v6.19] vfs coredump Christian Brauner
@ 2025-12-01 22:08   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 22:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:17 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.coredump

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/212c4053a1502e5117d8cbbbd1c15579ce1839bb

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 08/17 for v6.19] cred guards
  2025-11-28 16:48 ` [GIT PULL 08/17 for v6.19] cred guards Christian Brauner
  2025-12-01 21:53   ` Linus Torvalds
@ 2025-12-01 22:08   ` pr-tracker-bot
  1 sibling, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 22:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:19 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/kernel-6.19-rc1.cred

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/1d18101a644e6ece450d5b0a93f21a71a21b6222

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 07/17 for v6.19] vfs folio
  2025-11-28 16:48 ` [GIT PULL 07/17 for v6.19] vfs folio Christian Brauner
@ 2025-12-01 22:08   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 22:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:18 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.folio

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f2e74ecfba1b0d407f04b671a240cc65e309e529

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 09/17 for v6.19] vfs headers
  2025-11-28 16:48 ` [GIT PULL 09/17 for v6.19] vfs headers Christian Brauner
@ 2025-12-01 23:22   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 23:22 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:20 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.fs_header

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/afdf0fb340948a8c0f581ed1dc42828af89b80b6

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 10/17 for v6.19] vfs super guards
  2025-11-28 16:48 ` [GIT PULL 10/17 for v6.19] vfs super guards Christian Brauner
@ 2025-12-01 23:22   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 23:22 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:21 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.guards

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/978d337c2ed6e5313ee426871a410eddc796ccfd

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 11/17 for v6.19] minix
  2025-11-28 16:48 ` [GIT PULL 11/17 for v6.19] minix Christian Brauner
@ 2025-12-01 23:22   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-01 23:22 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:22 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.minix

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/4664fb427c8fd0080f40109f5e2b2090a6fb0c84

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 08/17 for v6.19] cred guards
  2025-12-01 21:53   ` Linus Torvalds
@ 2025-12-02  1:26     ` Sasha Levin
  2025-12-02  1:36       ` [PATCH] nfs/localio: make do_nfs_local_call_write() return void Sasha Levin
  0 siblings, 1 reply; 44+ messages in thread
From: Sasha Levin @ 2025-12-02  1:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christian Brauner, Mike Snitzer, linux-fsdevel, linux-kernel

On Mon, Dec 01, 2025 at 01:53:02PM -0800, Linus Torvalds wrote:
>On Fri, 28 Nov 2025 at 08:51, Christian Brauner <brauner@kernel.org> wrote:
>>
>> Merge conflicts with mainline
>>
>> diff --cc fs/nfs/localio.c
>
>So I ended up merging this very differently from how you did it.
>
>I just wrapped 'nfs_local_call_read()' for the cred guarding the same
>way the 'nfs_local_call_write()' side had been done.
>
>That made it much easier to see that the changes by Mike were carried
>over, and seems cleaner anyway.
>
>But it would be good if people double-checked my change. It looks
>"ObviouslyCorrect(tm)" to me, but...

A minor nit:

	+ static void nfs_local_call_write(struct work_struct *work)
	+ {
	+       struct nfs_local_kiocb *iocb =
	+               container_of(work, struct nfs_local_kiocb, work);
	+       struct file *filp = iocb->kiocb.ki_filp;
	+       unsigned long old_flags = current->flags;
	+       ssize_t status;
	+ 
	+       current->flags |= PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO;
	+ 
	+       scoped_with_creds(filp->f_cred)
	+               status = do_nfs_local_call_write(iocb, filp);
	+ 
	        current->flags = old_flags;
	 -
	 -      if (status != -EIOCBQUEUED) {
	 -              nfs_local_write_done(iocb, status);
	 -              nfs_local_vfs_getattr(iocb);
	 -              nfs_local_pgio_release(iocb);
	 -      }
	  }

With the change above, `status` should have been dropped altogether.

I'll send a patch...

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 17/17 for v6.19] vfs fd prepare minimal
  2025-11-28 16:48 ` [GIT PULL 17/17 for v6.19] vfs fd prepare minimal Christian Brauner
@ 2025-12-02  1:35   ` Linus Torvalds
  2025-12-02  9:42     ` Christian Brauner
  2025-12-02  3:19   ` pr-tracker-bot
  1 sibling, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2025-12-02  1:35 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, linux-kernel

On Fri, 28 Nov 2025 at 08:51, Christian Brauner <brauner@kernel.org> wrote:
>
> This is an alternative pull request for the FD_{ADD,PREPARE}() work containing
> only parts of the conversion.

Ok, I'm not super happy with how this all looks, partly because
there's been a lot of conflicts.  I don't think this was well done,
with multiple different areas getting cleaned up in the same release.

I considered leaving some stuff entirely for the next go-around, but
I've taken it all, although I only took this smaller version of the
FD_ADD().

Not because I think anything was particularly bad, but simply because
I feel it was too much churn for one release. This is all old code
that didn't need to be changed all at once.

Please don't do this again. We're not in that kind of a hurry, and
hurried cleanups aren't great.

Also, I don't love your mqueue merge resolution with the cast to
create the path argument to dentry_open(). So I did that differently.

That said, I don't love mine *either*. It all feels a bit hacky. I get
the feeling that maybe the mqueue case should just have used
FD_PREPARE() / fd_publish() after all.

Anyway, please check that I didn't miss anything. It is entirely possible I did.

             Linus

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH] nfs/localio: make do_nfs_local_call_write() return void
  2025-12-02  1:26     ` Sasha Levin
@ 2025-12-02  1:36       ` Sasha Levin
  0 siblings, 0 replies; 44+ messages in thread
From: Sasha Levin @ 2025-12-02  1:36 UTC (permalink / raw)
  To: sashal; +Cc: brauner, linux-fsdevel, linux-kernel, snitzer, torvalds

do_nfs_local_call_write() does not need to return status because
completion handling is done internally via nfs_local_pgio_done()
and nfs_local_write_iocb_done().

This makes it consistent with do_nfs_local_call_read(), which
already returns void for the same reason.

Fixes: 1d18101a644e ("Merge tag 'kernel-6.19-rc1.cred' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/nfs/localio.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/fs/nfs/localio.c b/fs/nfs/localio.c
index 49ed90c6b9f22..b45bf3fbe491c 100644
--- a/fs/nfs/localio.c
+++ b/fs/nfs/localio.c
@@ -822,8 +822,8 @@ static void nfs_local_write_aio_complete(struct kiocb *kiocb, long ret)
 	nfs_local_pgio_aio_complete(iocb); /* Calls nfs_local_write_aio_complete_work */
 }
 
-static ssize_t do_nfs_local_call_write(struct nfs_local_kiocb *iocb,
-				       struct file *filp)
+static void do_nfs_local_call_write(struct nfs_local_kiocb *iocb,
+				    struct file *filp)
 {
 	bool force_done = false;
 	ssize_t status;
@@ -853,8 +853,6 @@ static ssize_t do_nfs_local_call_write(struct nfs_local_kiocb *iocb,
 		}
 	}
 	file_end_write(filp);
-
-	return status;
 }
 
 static void nfs_local_call_write(struct work_struct *work)
@@ -863,12 +861,11 @@ static void nfs_local_call_write(struct work_struct *work)
 		container_of(work, struct nfs_local_kiocb, work);
 	struct file *filp = iocb->kiocb.ki_filp;
 	unsigned long old_flags = current->flags;
-	ssize_t status;
 
 	current->flags |= PF_LOCAL_THROTTLE | PF_MEMALLOC_NOIO;
 
 	scoped_with_creds(filp->f_cred)
-		status = do_nfs_local_call_write(iocb, filp);
+		do_nfs_local_call_write(iocb, filp);
 
 	current->flags = old_flags;
 }
-- 
2.51.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 12/17 for v6.19] vfs directory delegations
  2025-11-28 16:48 ` [GIT PULL 12/17 for v6.19] vfs directory delegations Christian Brauner
@ 2025-12-02  3:19   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-02  3:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:23 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.directory.delegations

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/db74a7d02ae244ec0552d18f51054f9ae0d921ad

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 13/17 for v6.19] vfs directory locking
  2025-11-28 16:48 ` [GIT PULL 13/17 for v6.19] vfs directory locking Christian Brauner
@ 2025-12-02  3:19   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-02  3:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:24 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.directory.locking

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/a8058f8442df3150fa58154672f4a62a13e833e5

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 15/17 for v6.19] autofs
  2025-11-28 16:48 ` [GIT PULL 15/17 for v6.19] autofs Christian Brauner
@ 2025-12-02  3:19   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-02  3:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:26 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.autofs

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/ffbf700df204dd25a48a19979a126e37f5dd1e6a

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 14/17 for v6.19] overlayfs cred guards
  2025-11-28 16:48 ` [GIT PULL 14/17 for v6.19] overlayfs cred guards Christian Brauner
@ 2025-12-02  3:19   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-02  3:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:25 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.ovl

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/d0deeb803cd65c41c37ac106063c46c51d5d43ab

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 17/17 for v6.19] vfs fd prepare minimal
  2025-11-28 16:48 ` [GIT PULL 17/17 for v6.19] vfs fd prepare minimal Christian Brauner
  2025-12-02  1:35   ` Linus Torvalds
@ 2025-12-02  3:19   ` pr-tracker-bot
  1 sibling, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-12-02  3:19 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 28 Nov 2025 17:48:28 +0100:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.19-rc1.fd_prepare.fs

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/1b5dd29869b1e63f7e5c37d7552e2dcf22de3c26

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 17/17 for v6.19] vfs fd prepare minimal
  2025-12-02  1:35   ` Linus Torvalds
@ 2025-12-02  9:42     ` Christian Brauner
  0 siblings, 0 replies; 44+ messages in thread
From: Christian Brauner @ 2025-12-02  9:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, linux-kernel

On Mon, Dec 01, 2025 at 05:35:39PM -0800, Linus Torvalds wrote:
> On Fri, 28 Nov 2025 at 08:51, Christian Brauner <brauner@kernel.org> wrote:
> >
> > This is an alternative pull request for the FD_{ADD,PREPARE}() work containing
> > only parts of the conversion.
> 
> Ok, I'm not super happy with how this all looks, partly because
> there's been a lot of conflicts.  I don't think this was well done,
> with multiple different areas getting cleaned up in the same release.
> 
> I considered leaving some stuff entirely for the next go-around, but
> I've taken it all, although I only took this smaller version of the
> FD_ADD().
> 
> Not because I think anything was particularly bad, but simply because
> I feel it was too much churn for one release. This is all old code
> that didn't need to be changed all at once.
> 
> Please don't do this again. We're not in that kind of a hurry, and
> hurried cleanups aren't great.

I understand. I'm sorry if I rushed this. I was excited about the series
and I thought I'd leave the decision to you. I'll be more conservative
next time.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 05/17 for v6.19] namespaces
  2025-12-01 19:06   ` Eric W. Biederman
@ 2025-12-02 17:00     ` Linus Torvalds
  2025-12-03 10:07       ` Christian Brauner
  0 siblings, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2025-12-02 17:00 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Christian Brauner, linux-fsdevel, linux-kernel, Linux Containers

On Mon, 1 Dec 2025 at 11:06, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> The reason such a system call has not been introduced in the past
> is because it introduces the namespace of namespace problem.
>
> How have you solved the namespace of namespaces problem?

So I think Christian would be better at answering this, but to a first
approximation I think the explanation from commit 76b6f5dfb3fd
("nstree: add listns()") gives some high-level rules:

    listns() respects namespace isolation and capabilities:

    (1) Global listing (user_ns_id = 0):
        - Requires CAP_SYS_ADMIN in the namespace's owning user namespace
        - OR the namespace must be in the caller's namespace context (e.g.,
          a namespace the caller is currently using)
        - User namespaces additionally allow listing if the caller has
          CAP_SYS_ADMIN in that user namespace itself
    (2) Owner-filtered listing (user_ns_id != 0):
        - Requires CAP_SYS_ADMIN in the specified owner user namespace
        - OR the namespace must be in the caller's namespace context
        - This allows unprivileged processes to enumerate namespaces they own
    (3) Visibility:
        - Only "active" namespaces are listed
        - A namespace is active if it has a non-zero __ns_ref_active count
        - This includes namespaces used by running processes, held by open
          file descriptors, or kept active by bind mounts
        - Inactive namespaces (kept alive only by internal kernel
          references) are not visible via listns()

but it would be very nice if you were to take a closer look at the
whole thing and make sure you're satisfied with it all.. Even just a
"overview scan" would be lovely.

            Linus
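
The quoted rules can be sketched as a small predicate. This is a toy
model under stated assumptions, not the kernel's code: the toy_* types and
the flat id arrays are hypothetical, capabilities are reduced to "the
caller has CAP_SYS_ADMIN in user namespace X", and the extra allowance for
user namespaces in rule (1) is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these types are hypothetical, not kernel structures. */
struct toy_ns {
	uint64_t id;             /* namespace id */
	uint64_t owner_userns;   /* id of the owning user namespace */
	unsigned int ref_active; /* stands in for __ns_ref_active */
};

struct toy_caller {
	const uint64_t *admin_userns; /* userns ids where caller has CAP_SYS_ADMIN */
	size_t nr_admin;
	const uint64_t *in_use;       /* ids of namespaces the caller is using */
	size_t nr_in_use;
};

static bool toy_contains(const uint64_t *ids, size_t n, uint64_t id)
{
	for (size_t i = 0; i < n; i++)
		if (ids[i] == id)
			return true;
	return false;
}

/*
 * user_ns_id == 0 models a global listing; any other value models an
 * owner-filtered listing restricted to that owning user namespace.
 */
static bool toy_may_list(const struct toy_caller *c, const struct toy_ns *ns,
			 uint64_t user_ns_id)
{
	assert(c && ns);

	/* Rule (3): inactive namespaces are never visible. */
	if (ns->ref_active == 0)
		return false;

	/* Rule (2): an owner filter excludes namespaces with another owner. */
	if (user_ns_id != 0 && ns->owner_userns != user_ns_id)
		return false;

	/* Namespaces in the caller's own context are listable. */
	if (toy_contains(c->in_use, c->nr_in_use, ns->id))
		return true;

	/* Otherwise CAP_SYS_ADMIN in the owning user namespace is needed. */
	return toy_contains(c->admin_userns, c->nr_admin, ns->owner_userns);
}
```

In this model a caller with no capabilities still sees the namespaces it
is running in, while a namespace pinned only by internal references
(ref_active == 0) is hidden from everyone.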

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 05/17 for v6.19] namespaces
  2025-12-02 17:00     ` Linus Torvalds
@ 2025-12-03 10:07       ` Christian Brauner
  0 siblings, 0 replies; 44+ messages in thread
From: Christian Brauner @ 2025-12-03 10:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Eric W. Biederman, linux-fsdevel, linux-kernel, Linux Containers

On Tue, Dec 02, 2025 at 09:00:57AM -0800, Linus Torvalds wrote:
> On Mon, 1 Dec 2025 at 11:06, Eric W. Biederman <ebiederm@xmission.com> wrote:
> >
> > The reason such a system call has not been introduced in the past
> > is because it introduces the namespace of namespace problem.
> >
> > How have you solved the namespace of namespaces problem?
> 
> So I think Christian would be better at answering this, but to a first
> approximation I think the explanation from commit 76b6f5dfb3fd
> ("nstree: add listns()") gives some high-level rules:

After last year's round I've caught another lung infection so I'm a bit
incapacitated and not working. Visibility is currently based on the user
namespace. It's possible to list all namespaces that are owned by a
given user namespace. So a caller in an unprivileged container is only
able to list namespaces that they directly own or namespaces owned by
descendant namespaces. That's tracked in the namespace tree. The
self-tests verify this as well. So it is not possible to break out of
that hierarchy. As this is expressly an introspection system call it
allows listing sibling namespaces owned by the same user namespace ofc
as it's tailored for container managers. This will be used in
high-privileged container managers and in systemd for service
supervision so if there are any concerns that the current standard access
regulation and seccomp() aren't enough I'm more than happy to require
global CAP_SYS_ADMIN.
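
The hierarchy argument above can be sketched as an ancestry walk. This is
a minimal model that assumes each user namespace only records its parent;
struct toy_userns and toy_userns_within() are hypothetical names, not
the namespace tree code from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a user namespace that only knows its parent. */
struct toy_userns {
	const struct toy_userns *parent; /* NULL for the initial user namespace */
};

/* True if @ns is @ancestor itself or one of its descendants. */
static bool toy_userns_within(const struct toy_userns *ns,
			      const struct toy_userns *ancestor)
{
	assert(ancestor);

	/* Walk towards the root; a match means @ns sits below @ancestor. */
	for (; ns; ns = ns->parent)
		if (ns == ancestor)
			return true;
	return false;
}
```

A caller whose user namespace is `a` would, in this model, only be shown
namespaces whose owner satisfies toy_userns_within(owner, &a): the walk
never reaches `a` from a sibling or from a parent, so those are filtered
out.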

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2025-12-03 10:07 UTC | newest]

Thread overview: 44+ messages
2025-11-28 16:48 [GIT PULL 00/17 for v6.19] v6.19 Christian Brauner
2025-11-28 16:48 ` [GIT PULL 01/17 for v6.19] vfs iomap Christian Brauner
2025-12-01 22:08   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 02/17 for v6.19] vfs misc Christian Brauner
2025-12-01 22:08   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 03/17 for v6.19] vfs inode Christian Brauner
2025-12-01 22:08   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 04/17 for v6.19] vfs writeback Christian Brauner
2025-12-01 22:08   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 05/17 for v6.19] namespaces Christian Brauner
2025-12-01 19:06   ` Eric W. Biederman
2025-12-02 17:00     ` Linus Torvalds
2025-12-03 10:07       ` Christian Brauner
2025-12-01 22:08   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 06/17 for v6.19] vfs coredump Christian Brauner
2025-12-01 22:08   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 07/17 for v6.19] vfs folio Christian Brauner
2025-12-01 22:08   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 08/17 for v6.19] cred guards Christian Brauner
2025-12-01 21:53   ` Linus Torvalds
2025-12-02  1:26     ` Sasha Levin
2025-12-02  1:36       ` [PATCH] nfs/localio: make do_nfs_local_call_write() return void Sasha Levin
2025-12-01 22:08   ` [GIT PULL 08/17 for v6.19] cred guards pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 09/17 for v6.19] vfs headers Christian Brauner
2025-12-01 23:22   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 10/17 for v6.19] vfs super guards Christian Brauner
2025-12-01 23:22   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 11/17 for v6.19] minix Christian Brauner
2025-12-01 23:22   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 12/17 for v6.19] vfs directory delegations Christian Brauner
2025-12-02  3:19   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 13/17 for v6.19] vfs directory locking Christian Brauner
2025-12-02  3:19   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 14/17 for v6.19] overlayfs cred guards Christian Brauner
2025-12-02  3:19   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 15/17 for v6.19] autofs Christian Brauner
2025-12-02  3:19   ` pr-tracker-bot
2025-11-28 16:48 ` [GIT PULL 16/17 for v6.19] vfs fd prepare Christian Brauner
2025-12-01 14:15   ` Al Viro
2025-12-01 18:41     ` Sean Christopherson
2025-11-28 16:48 ` [GIT PULL 17/17 for v6.19] vfs fd prepare minimal Christian Brauner
2025-12-02  1:35   ` Linus Torvalds
2025-12-02  9:42     ` Christian Brauner
2025-12-02  3:19   ` pr-tracker-bot