linux-fsdevel.vger.kernel.org archive mirror
* [GIT PULL 00/14 for v6.17] vfs 6.17
@ 2025-07-25 11:27 Christian Brauner
  2025-07-25 11:27 ` [GIT PULL 05/14 for v6.17] vfs async dir Christian Brauner
                   ` (14 more replies)
  0 siblings, 15 replies; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

This is the batch of pull requests for the v6.17 merge window!

I'm trying something new where I'm attaching a cover letter with a short
summary of all the various pull requests flowing to you during this
cycle.

Lucky for me the v6.17 merge window coincides with me moving. IOW, I'm
currently getting squashed by moving boxes and disassembled furniture.
I'm just happy that I did find my laptop in this mess and I hope there's
no notable effects due to the last couple of weeks.

In any case, this cycle was pretty usual for us given the past years.
We have two new system calls in core vfs, file_getattr() and
file_setattr(), which are extensible successors to the legacy ioctl()s.

There's further work in the form of preparatory changes to the directory
locking scheme we currently have; both on the vfs level and for
overlayfs specifically. I want to stress that no actual locking changes
have happened yet and that there's not yet any commitment by us to
actually land any of this.

We have a new bpf kfunc extension for reading extended attributes from
cgroups. This is the first time we're routing bpf patches but I will do
this for all future vfs bpf extensions so we know exactly how and when
something is happening.

There's another round of extensive coredump work. Not just an extension
to the coredump socket but also a rework of the coredump code to just be
more readable and maintainable. I'm somewhat afraid of what I've gotten
myself into by touching that code but hey, that's part of the deal.

We have some work at the intersection of the block and vfs layer in the
form of the new FS_IOC_GETLBMD_CAP ioctl() which returns information
about the file's integrity profile for userspace applications that need
to understand a file's end-to-end data protection support and configure
the I/O accordingly.

Iomap has been quite active as well with some refactoring and changes to
the infrastructure to extend the abilities of fuse and support large
folios. Hell, if this keeps going on every filesystem will move to fuse
and we'll all be out of a job soon.

There's the usual pile of miscellaneous changes to the vfs layer and
filesystems. No need to cover this in detail here.

We also have some work at the intersection of mm and the vfs by porting
a good chunk of filesystems from f_op->mmap() to the new and better
f_op->mmap_prepare(). I'm going to haunt the relevant developers to
finish this conversion asap because I have no appetite for running around
with yet more duplicated methods than we already have. I mean, we've
just gotten rid of f_op->readdir() last year or so - actually you did.

I'm also routing the usual namespace work. This time in the form of some
minor nsfs extensions by exposing a bunch of uapi symbols that a lot of
userspace already relies on and so we can't change those constants
anyway. That's the root inode number of procfs and the inode numbers of
the initial set of namespaces.

We've also been very active in pidfs which gains a bunch of new features
such as persistent exit and coredump information, extended attributes,
autonomous file handles, and pidfds for reaped tasks from SCM_PIDFD
messages.

A few minor Rust updates are also in there but they're really not that
interesting at all.

And finally, a new struct super_operations method that allows
multi-device filesystems such as btrfs to be informed when a block
device is removed. Since btrfs can survive surprise device removal this
complements the usual ->shutdown() call nicely.

That's all! Expect some slight delay in responses as I'm going to be
preoccupied with the move over the weekend.

Thanks!
Christian

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 05/14 for v6.17] vfs async dir
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 09/14 for v6.17] vfs bpf Christian Brauner
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains preparatory changes for the asynchronous directory locking
scheme. While the locking scheme is still very much controversial and
we're still far away from landing any actual changes in that area, the
preparatory work that we've been upstreaming for a while now has been
very useful. This is another set of minor changes and cleanups.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.async.dir

for you to fetch changes up to d4db71038ff592aa4bc954d6bbd10be23954bb98:

  Merge patch series "Minor cleanup preparation for some dir-locking API changes" (2025-06-11 13:44:21 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.async.dir tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.async.dir

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "Minor cleanup preparation for some dir-locking API changes"

NeilBrown (4):
      VFS: merge lookup_one_qstr_excl_raw() back into lookup_one_qstr_excl()
      VFS: Minor fixes for porting.rst
      coda: use iterate_dir() in coda_readdir()
      exportfs: use lookup_one_unlocked()

 Documentation/filesystems/porting.rst |  3 ---
 fs/coda/dir.c                         | 12 ++----------
 fs/exportfs/expfs.c                   |  4 +---
 fs/namei.c                            | 37 +++++++++++++----------------------
 4 files changed, 17 insertions(+), 39 deletions(-)


* [GIT PULL 09/14 for v6.17] vfs bpf
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
  2025-07-25 11:27 ` [GIT PULL 05/14 for v6.17] vfs async dir Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-29 18:15   ` Alexei Starovoitov
  2025-07-25 11:27 ` [GIT PULL 02/14 for v6.17] vfs coredump Christian Brauner
                   ` (12 subsequent siblings)
  14 siblings, 2 replies; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
These changes allow bpf to read extended attributes from cgroupfs.
This is useful in redirecting AF_UNIX socket connections based on cgroup
membership of the socket. One use-case is the ability to implement log
namespaces in systemd so services and containers are redirected to
different journals.

Please note that I plan on merging bpf changes related to the vfs
exclusively via vfs trees.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.bpf

for you to fetch changes up to 70619d40e8307b4b2ce1d08405e7b827c61ba4a8:

  selftests/kernfs: test xattr retrieval (2025-07-02 14:18:22 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.bpf tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.bpf

----------------------------------------------------------------
Christian Brauner (3):
      kernfs: remove iattr_mutex
      Merge patch series "Introduce bpf_cgroup_read_xattr"
      selftests/kernfs: test xattr retrieval

Song Liu (3):
      bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node
      bpf: Mark cgroup_subsys_state->cgroup RCU safe
      selftests/bpf: Add tests for bpf_cgroup_read_xattr

 fs/bpf_fs_kfuncs.c                                 |  34 +++++
 fs/kernfs/inode.c                                  |  70 ++++-----
 kernel/bpf/helpers.c                               |   3 +
 kernel/bpf/verifier.c                              |   5 +
 tools/testing/selftests/bpf/bpf_experimental.h     |   3 +
 .../selftests/bpf/prog_tests/cgroup_xattr.c        | 145 +++++++++++++++++++
 .../selftests/bpf/progs/cgroup_read_xattr.c        | 158 +++++++++++++++++++++
 .../selftests/bpf/progs/read_cgroupfs_xattr.c      |  60 ++++++++
 tools/testing/selftests/filesystems/.gitignore     |   1 +
 tools/testing/selftests/filesystems/Makefile       |   2 +-
 tools/testing/selftests/filesystems/kernfs_test.c  |  38 +++++
 11 files changed, 486 insertions(+), 33 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/cgroup_xattr.c
 create mode 100644 tools/testing/selftests/bpf/progs/cgroup_read_xattr.c
 create mode 100644 tools/testing/selftests/bpf/progs/read_cgroupfs_xattr.c
 create mode 100644 tools/testing/selftests/filesystems/kernfs_test.c


* [GIT PULL 02/14 for v6.17] vfs coredump
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
  2025-07-25 11:27 ` [GIT PULL 05/14 for v6.17] vfs async dir Christian Brauner
  2025-07-25 11:27 ` [GIT PULL 09/14 for v6.17] vfs bpf Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 18:57   ` Linus Torvalds
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 06/14 for v6.17] vfs fallocate Christian Brauner
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains an extension to the coredump socket and a proper rework
of the coredump code.

- This extends the coredump socket to allow the coredump server to tell
  the kernel how to process individual coredumps. This allows for
  fine-grained coredump management. Userspace can decide to just let the
  kernel write out the coredump, or generate the coredump itself, or
  just reject it.

  * COREDUMP_KERNEL
    The kernel will write the coredump data to the socket.

  * COREDUMP_USERSPACE
    The kernel will not write coredump data but will indicate to the
    parent that a coredump has been generated. This is used when
    userspace generates its own coredumps.

  * COREDUMP_REJECT
    The kernel will skip generating a coredump for this task.

  * COREDUMP_WAIT
    The kernel will prevent the task from exiting until the coredump
    server has shut down the socket connection.

  The flexible coredump socket can be enabled by using the "@@" prefix
  instead of the single "@" prefix for the regular coredump socket:

    @@/run/systemd/coredump.socket

- Clean up the coredump code properly while we have to touch it anyway.
  Split out each coredump mode in a separate helper so it's easy to
  grasp what is going on and make the code easier to follow. The core
  coredump function should now be very trivial to follow.
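
To illustrate the prefix convention described above, here is a minimal
userspace sketch that classifies a core_pattern string by its leading
characters. The enum and function names are hypothetical and exist only
for this example; this is not the kernel's parsing code. It assumes the
long-standing "|" prefix for piping to a helper, "@" for the regular
coredump socket, and "@@" for the flexible coredump socket.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration only, not the kernel's own enum. */
enum core_target {
	CORE_TARGET_FILE,        /* plain path template */
	CORE_TARGET_PIPE,        /* "|" prefix: pipe to a usermode helper */
	CORE_TARGET_SOCKET,      /* "@" prefix: regular coredump socket */
	CORE_TARGET_SOCKET_FLEX, /* "@@" prefix: flexible coredump socket */
};

static enum core_target classify_core_pattern(const char *pattern)
{
	assert(pattern != NULL);

	if (pattern[0] == '|')
		return CORE_TARGET_PIPE;
	/* Check the longer "@@" prefix first so it is not taken for "@". */
	if (strncmp(pattern, "@@", 2) == 0)
		return CORE_TARGET_SOCKET_FLEX;
	if (pattern[0] == '@')
		return CORE_TARGET_SOCKET;
	return CORE_TARGET_FILE;
}
```

Calling classify_core_pattern("@@/run/systemd/coredump.socket") selects
the flexible mode, in which the coredump server tells the kernel which
of the modes listed above to use for each individual coredump.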

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

This will have a merge conflict with mainline that can be resolved as follows:

diff --cc tools/testing/selftests/coredump/stackdump_test.c
index 68f8e479ac36,a4ac80bb1003..000000000000
--- a/tools/testing/selftests/coredump/stackdump_test.c
+++ b/tools/testing/selftests/coredump/stackdump_test.c
@@@ -418,59 -430,31 +430,35 @@@ TEST_F(coredump, socket_detect_userspac
                close(ipc_sockets[1]);

                fd_coredump = accept4(fd_server, NULL, NULL, SOCK_CLOEXEC);
-               if (fd_coredump < 0) {
-                       fprintf(stderr, "Failed to accept coredump socket connection\n");
-                       close(fd_server);
-                       _exit(EXIT_FAILURE);
-               }
+               if (fd_coredump < 0)
+                       goto out;

-               fd_peer_pidfd_len = sizeof(fd_peer_pidfd);
-               ret = getsockopt(fd_coredump, SOL_SOCKET, SO_PEERPIDFD,
-                                &fd_peer_pidfd, &fd_peer_pidfd_len);
-               if (ret < 0) {
-                       fprintf(stderr, "%m - Failed to retrieve peer pidfd for coredump socket connection\n");
-                       close(fd_coredump);
-                       close(fd_server);
-                       _exit(EXIT_FAILURE);
-               }
+               fd_peer_pidfd = get_peer_pidfd(fd_coredump);
+               if (fd_peer_pidfd < 0)
+                       goto out;

-               memset(&info, 0, sizeof(info));
-               info.mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP;
-               ret = ioctl(fd_peer_pidfd, PIDFD_GET_INFO, &info);
-               if (ret < 0) {
-                       fprintf(stderr, "Failed to retrieve pidfd info from peer pidfd for coredump socket connection\n");
-                       close(fd_coredump);
-                       close(fd_server);
-                       close(fd_peer_pidfd);
-                       _exit(EXIT_FAILURE);
-               }
+               if (!get_pidfd_info(fd_peer_pidfd, &info))
+                       goto out;

-               if (!(info.mask & PIDFD_INFO_COREDUMP)) {
-                       fprintf(stderr, "Missing coredump information from coredumping task\n");
-                       close(fd_coredump);
-                       close(fd_server);
-                       close(fd_peer_pidfd);
-                       _exit(EXIT_FAILURE);
-               }
+               if (!(info.mask & PIDFD_INFO_COREDUMP))
+                       goto out;

-               if (info.coredump_mask & PIDFD_COREDUMPED) {
-                       fprintf(stderr, "Received unexpected connection from coredumping task\n");
-                       close(fd_coredump);
-                       close(fd_server);
-                       close(fd_peer_pidfd);
-                       _exit(EXIT_FAILURE);
-               }
+               if (info.coredump_mask & PIDFD_COREDUMPED)
+                       goto out;

 +              ret = read(fd_coredump, &c, 1);
 +
-               close(fd_coredump);
-               close(fd_server);
-               close(fd_peer_pidfd);
-               close(fd_core_file);
-
+               exit_code = EXIT_SUCCESS;
+ out:
+               if (fd_peer_pidfd >= 0)
+                       close(fd_peer_pidfd);
+               if (fd_coredump >= 0)
+                       close(fd_coredump);
+               if (fd_server >= 0)
+                       close(fd_server);
 +              if (ret < 1)
 +                      _exit(EXIT_FAILURE);
-               _exit(EXIT_SUCCESS);
+               _exit(exit_code);
        }
        self->pid_coredump_server = pid_coredump_server;

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.coredump

for you to fetch changes up to 5c21c5f22d0701ac6c1cafc0e8de4bf42e5c53e5:

  cleanup: add a scoped version of CLASS() (2025-07-11 16:01:07 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.coredump tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.coredump

----------------------------------------------------------------
Christian Brauner (33):
      coredump: allow for flexible coredump handling
      selftests/coredump: fix build
      selftests/coredump: cleanup coredump tests
      tools: add coredump.h header
      selftests/coredump: add coredump server selftests
      Merge patch series "coredump: allow for flexible coredump handling"
      coredump: cleanup coredump socket functions
      coredump: rename format_corename()
      coredump: make coredump_parse() return bool
      coredump: fix socket path validation
      coredump: validate that path doesn't exceed UNIX_PATH_MAX
      fs: move name_contains_dotdot() to header
      coredump: don't allow ".." in coredump socket path
      coredump: validate socket path in coredump_parse()
      selftests/coredump: make sure invalid paths are rejected
      coredump: rename do_coredump() to vfs_coredump()
      coredump: split file coredumping into coredump_file()
      coredump: prepare to simplify exit paths
      coredump: move core_pipe_count to global variable
      coredump: split pipe coredumping into coredump_pipe()
      coredump: move pipe specific file check into coredump_pipe()
      coredump: use a single helper for the socket
      coredump: add coredump_write()
      coredump: auto cleanup argv
      coredump: directly return
      cred: add auto cleanup method
      coredump: auto cleanup prepare_creds()
      coredump: add coredump_cleanup()
      coredump: order auto cleanup variables at the top
      coredump: avoid pointless variable
      coredump: add coredump_skip() helper
      Merge patch series "coredump: further cleanups"
      cleanup: add a scoped version of CLASS()

 Documentation/security/credentials.rst             |    2 +-
 .../translations/zh_CN/security/credentials.rst    |    2 +-
 drivers/base/firmware_loader/main.c                |   31 +-
 fs/coredump.c                                      |  868 ++++++----
 include/linux/cleanup.h                            |    8 +
 include/linux/coredump.h                           |    4 +-
 include/linux/cred.h                               |    2 +
 include/linux/fs.h                                 |   16 +
 include/uapi/linux/coredump.h                      |  104 ++
 kernel/signal.c                                    |    2 +-
 tools/include/uapi/linux/coredump.h                |  104 ++
 tools/testing/selftests/coredump/Makefile          |    2 +-
 tools/testing/selftests/coredump/config            |    3 +
 tools/testing/selftests/coredump/stackdump_test.c  | 1689 +++++++++++++++++---
 14 files changed, 2239 insertions(+), 598 deletions(-)
 create mode 100644 include/uapi/linux/coredump.h
 create mode 100644 tools/include/uapi/linux/coredump.h
 create mode 100644 tools/testing/selftests/coredump/config


* [GIT PULL 06/14 for v6.17] vfs fallocate
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (2 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 02/14 for v6.17] vfs coredump Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 12/14 for v6.17] vfs fileattr Christian Brauner
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
fallocate() currently supports creating preallocated files efficiently.
However, on most filesystems fallocate() will preallocate blocks in an
unwritten state even if FALLOC_FL_ZERO_RANGE is specified.

The extent state must later be converted to a written state when the
user writes data into this range, which can trigger numerous metadata
changes and journal I/O. This may lead to significant write
amplification and performance degradation in synchronous write mode.

At the moment, the only method to avoid this is to create an empty file
and write zero data into it (for example, using 'dd' with a large block
size). However, this method is slow and consumes a considerable amount
of disk bandwidth.

Now that more and more flash-based storage devices are available, it is
possible to efficiently write zeroes to SSDs using the unmap write zeroes
command if the devices do not write physical zeroes to the media.

For example, if SCSI SSDs support the UNMAP bit or NVMe SSDs support the
DEAC bit[1], the write zeroes command does not write actual data to the
device. Instead, NVMe converts the zeroed range to a deallocated state,
which works fast and consumes almost no disk write bandwidth.

This series implements the BLK_FEAT_WRITE_ZEROES_UNMAP feature and
BLK_FLAG_WRITE_ZEROES_UNMAP_DISABLED flag for SCSI, NVMe and
device-mapper drivers, and adds the FALLOC_FL_WRITE_ZEROES and
STATX_ATTR_WRITE_ZEROES_UNMAP support for ext4 and raw bdev devices.

fallocate() is subsequently extended with the FALLOC_FL_WRITE_ZEROES
flag. FALLOC_FL_WRITE_ZEROES zeroes a specified file range in such a way
that subsequent writes to that range do not require further changes to
the file mapping metadata. This flag is beneficial for subsequent pure
overwriting within this range, as it can save on block allocation and,
consequently, significant metadata changes.
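
As a rough illustration of how userspace might use the new flag, the
sketch below tries FALLOC_FL_WRITE_ZEROES first and falls back to
writing explicit zeroes (the slow 'dd'-style method mentioned above)
when the kernel, filesystem, or device does not support it. The helper
names are hypothetical. The fallback value 0x80 for the flag is an
assumption for building against older headers and should be checked
against include/uapi/linux/falloc.h.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/falloc.h>

/* Assumed value for pre-6.17 headers; verify against the uapi header. */
#ifndef FALLOC_FL_WRITE_ZEROES
#define FALLOC_FL_WRITE_ZEROES 0x80
#endif

/* Slow path: write explicit zeroes, much like dd from /dev/zero would. */
static int write_zeroes_slow(int fd, off_t off, off_t len)
{
	char buf[4096];

	memset(buf, 0, sizeof(buf));
	while (len > 0) {
		size_t chunk = len < (off_t)sizeof(buf) ? (size_t)len : sizeof(buf);
		ssize_t n = pwrite(fd, buf, chunk, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		if (n == 0)
			return -EIO;
		off += n;
		len -= n;
	}
	return 0;
}

/*
 * Zero [off, off + len) so that it reads back as zeroes. Returns 1 if
 * the FALLOC_FL_WRITE_ZEROES fast path was used, 0 if the slow path was
 * used, or a negative errno value on failure.
 */
static int zero_range_written(int fd, off_t off, off_t len)
{
	assert(off >= 0 && len > 0);

	if (fallocate(fd, FALLOC_FL_WRITE_ZEROES, off, len) == 0)
		return 1;
	/* Older kernels, or filesystems and devices without support. */
	if (errno != EOPNOTSUPP && errno != EINVAL && errno != ENOSYS)
		return -errno;
	return write_zeroes_slow(fd, off, len);
}
```

A caller that cares about the difference can check the return value:
1 means the range was zeroed through fallocate(), 0 means the data was
written out the slow way.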

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit e04c78d86a9699d136910cfc0bdcf01087e3267e:

  Linux 6.16-rc2 (2025-06-15 13:49:41 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fallocate

for you to fetch changes up to 4f984fe7b4d9aea332c7ff59827a4e168f0e4e1b:

  Merge patch series "fallocate: introduce FALLOC_FL_WRITE_ZEROES flag" (2025-06-23 12:45:32 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.fallocate tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.fallocate

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "fallocate: introduce FALLOC_FL_WRITE_ZEROES flag"

Zhang Yi (9):
      block: introduce max_{hw|user}_wzeroes_unmap_sectors to queue limits
      nvme: set max_hw_wzeroes_unmap_sectors if device supports DEAC bit
      nvmet: set WZDS and DRB if device enables unmap write zeroes operation
      scsi: sd: set max_hw_wzeroes_unmap_sectors if device supports SD_ZERO_*_UNMAP
      dm: clear unmap write zeroes limits when disabling write zeroes
      fs: introduce FALLOC_FL_WRITE_ZEROES to fallocate
      block: factor out common part in blkdev_fallocate()
      block: add FALLOC_FL_WRITE_ZEROES support
      ext4: add FALLOC_FL_WRITE_ZEROES support

 Documentation/ABI/stable/sysfs-block | 33 ++++++++++++++++++
 block/blk-settings.c                 | 20 +++++++++--
 block/blk-sysfs.c                    | 26 ++++++++++++++
 block/fops.c                         | 44 +++++++++++++-----------
 drivers/md/dm-table.c                |  4 ++-
 drivers/nvme/host/core.c             | 20 ++++++-----
 drivers/nvme/target/io-cmd-bdev.c    |  4 +++
 drivers/scsi/sd.c                    |  5 +++
 fs/ext4/extents.c                    | 66 ++++++++++++++++++++++++++++++------
 fs/open.c                            |  1 +
 include/linux/blkdev.h               | 10 ++++++
 include/linux/falloc.h               |  3 +-
 include/trace/events/ext4.h          |  3 +-
 include/uapi/linux/falloc.h          | 17 ++++++++++
 14 files changed, 212 insertions(+), 44 deletions(-)


* [GIT PULL 12/14 for v6.17] vfs fileattr
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (3 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 06/14 for v6.17] vfs fallocate Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 11/14 for v6.17] vfs integrity Christian Brauner
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This introduces the new file_getattr() and file_setattr() system calls
after lengthy discussions. Both system calls serve as successors and
extensible companions to the FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR
ioctls which have started to show their age in addition to being
named in a way that makes it easy to conflate them with extended
attribute related operations.

These syscalls allow userspace to set filesystem inode attributes on
special files. One use case is XFS project quotas.

XFS has project quotas which can be attached to a directory. All new
inodes in these directories inherit the project ID set on the parent
directory.

The project is created from userspace by opening each inode and calling
FS_IOC_FSSETXATTR on it. This is not possible for special files such as
FIFO, SOCK, BLK etc. Therefore, some inodes are left with an empty
project ID. Those inodes are then not shown in the quota accounting but
still exist in the directory. This is not critical, but when special
files are created in a directory with an already existing project
quota, these new inodes inherit the extended attributes. This creates a
mix of special files with and without attributes. Moreover, there is no
way to clear or change the attributes of special files that have them.
This, in turn, prevents userspace from re-creating the quota project on
these existing files.

In addition, these new system calls allow the implementation of
additional attributes that we couldn't or didn't want to fit into the
legacy ioctls anymore.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fileattr

for you to fetch changes up to e85931d1cd699307e6a3f1060cbe4c42748f3fff:

  fs: tighten a sanity check in file_attr_to_fileattr() (2025-07-16 10:22:01 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.fileattr tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.fileattr

----------------------------------------------------------------
Amir Goldstein (1):
      fs: prepare for extending file_get/setattr()

Andrey Albershteyn (5):
      fs: split fileattr related helpers into separate file
      lsm: introduce new hooks for setting/getting inode fsxattr
      selinux: implement inode_file_[g|s]etattr hooks
      fs: make vfs_fileattr_[get|set] return -EOPNOTSUPP
      fs: introduce file_getattr and file_setattr syscalls

Christian Brauner (2):
      Merge patch series "fs: introduce file_getattr and file_setattr syscalls"
      tree-wide: s/struct fileattr/struct file_kattr/g

Dan Carpenter (1):
      fs: tighten a sanity check in file_attr_to_fileattr()

 Documentation/filesystems/locking.rst       |   4 +-
 Documentation/filesystems/vfs.rst           |   4 +-
 arch/alpha/kernel/syscalls/syscall.tbl      |   2 +
 arch/arm/tools/syscall.tbl                  |   2 +
 arch/arm64/tools/syscall_32.tbl             |   2 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   2 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   2 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   2 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   2 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   2 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   2 +
 arch/s390/kernel/syscalls/syscall.tbl       |   2 +
 arch/sh/kernel/syscalls/syscall.tbl         |   2 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   2 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   2 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   2 +
 fs/Makefile                                 |   3 +-
 fs/bcachefs/fs.c                            |   4 +-
 fs/btrfs/ioctl.c                            |   4 +-
 fs/btrfs/ioctl.h                            |   6 +-
 fs/ecryptfs/inode.c                         |   4 +-
 fs/efivarfs/inode.c                         |   4 +-
 fs/ext2/ext2.h                              |   4 +-
 fs/ext2/ioctl.c                             |   4 +-
 fs/ext4/ext4.h                              |   4 +-
 fs/ext4/ioctl.c                             |   4 +-
 fs/f2fs/f2fs.h                              |   4 +-
 fs/f2fs/file.c                              |   4 +-
 fs/file_attr.c                              | 498 ++++++++++++++++++++++++++++
 fs/fuse/fuse_i.h                            |   4 +-
 fs/fuse/ioctl.c                             |   8 +-
 fs/gfs2/file.c                              |   4 +-
 fs/gfs2/inode.h                             |   4 +-
 fs/hfsplus/hfsplus_fs.h                     |   4 +-
 fs/hfsplus/inode.c                          |   4 +-
 fs/ioctl.c                                  | 309 -----------------
 fs/jfs/ioctl.c                              |   4 +-
 fs/jfs/jfs_inode.h                          |   4 +-
 fs/nilfs2/ioctl.c                           |   4 +-
 fs/nilfs2/nilfs.h                           |   4 +-
 fs/ocfs2/ioctl.c                            |   4 +-
 fs/ocfs2/ioctl.h                            |   4 +-
 fs/orangefs/inode.c                         |   4 +-
 fs/overlayfs/copy_up.c                      |   6 +-
 fs/overlayfs/inode.c                        |  17 +-
 fs/overlayfs/overlayfs.h                    |  10 +-
 fs/overlayfs/util.c                         |   2 +-
 fs/ubifs/ioctl.c                            |   4 +-
 fs/ubifs/ubifs.h                            |   4 +-
 fs/xfs/xfs_ioctl.c                          |  18 +-
 fs/xfs/xfs_ioctl.h                          |   4 +-
 include/linux/fileattr.h                    |  38 ++-
 include/linux/fs.h                          |   6 +-
 include/linux/lsm_hook_defs.h               |   2 +
 include/linux/security.h                    |  16 +
 include/linux/syscalls.h                    |   7 +
 include/uapi/asm-generic/unistd.h           |   8 +-
 include/uapi/linux/fs.h                     |  18 +
 mm/shmem.c                                  |   4 +-
 scripts/syscall.tbl                         |   2 +
 security/security.c                         |  30 ++
 security/selinux/hooks.c                    |  14 +
 64 files changed, 752 insertions(+), 410 deletions(-)
 create mode 100644 fs/file_attr.c

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 11/14 for v6.17] vfs integrity
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (4 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 12/14 for v6.17] vfs fileattr Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28  1:29   ` Hugh Dickins
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 14/14 for v6.17] vfs iomap Christian Brauner
                   ` (8 subsequent siblings)
  14 siblings, 2 replies; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This adds the new FS_IOC_GETLBMD_CAP ioctl() to query metadata and
protection info (PI) capabilities. This ioctl returns information about
the file's integrity profile. This is useful for userspace applications
to understand a file's end-to-end data protection support and configure
the I/O accordingly.

For now this interface is only supported by block devices. However, the
design and placement of this ioctl in generic FS ioctl space allows us
to extend it to work over files as well. This may be useful when
filesystems start supporting PI-aware layouts.

A new structure struct logical_block_metadata_cap is introduced, which
contains the following fields:

- lbmd_flags:
  bitmask of logical block metadata capability flags

- lbmd_interval:
  the amount of data described by each unit of logical block metadata

- lbmd_size:
  size in bytes of the logical block metadata associated with each
  interval

- lbmd_opaque_size:
  size in bytes of the opaque block tag associated with each interval

- lbmd_opaque_offset:
  offset in bytes of the opaque block tag within the logical block
  metadata

- lbmd_pi_size:
  size in bytes of the T10 PI tuple associated with each interval

- lbmd_pi_offset:
  offset in bytes of the T10 PI tuple within the logical block metadata

- lbmd_pi_guard_tag_type:
  T10 PI guard tag type

- lbmd_pi_app_tag_size:
  size in bytes of the T10 PI application tag

- lbmd_pi_ref_tag_size:
  size in bytes of the T10 PI reference tag

- lbmd_pi_storage_tag_size:
  size in bytes of the T10 PI storage tag

The internal logic to fetch the capability is encapsulated in a helper
function blk_get_meta_cap(), which uses the blk_integrity profile
associated with the device. The ioctl returns -EOPNOTSUPP if
CONFIG_BLK_DEV_INTEGRITY is not enabled.
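
For illustration, a minimal userspace sketch of querying the capability
could look like the following. It assumes uapi headers from a kernel that
carries this series; the helper name query_lbmd_cap() is made up for this
example and only a few of the fields listed above are printed. With older
headers the function compiles to a stub that returns -EOPNOTSUPP.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>

/*
 * Hypothetical helper: query the integrity capabilities of the block
 * device at @path. Returns 0 on success or a negative errno value.
 */
int query_lbmd_cap(const char *path)
{
#ifdef FS_IOC_GETLBMD_CAP
	struct logical_block_metadata_cap cap = { 0 };
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	if (ioctl(fd, FS_IOC_GETLBMD_CAP, &cap) < 0) {
		/* e.g. EOPNOTSUPP without CONFIG_BLK_DEV_INTEGRITY */
		err = errno;
		close(fd);
		return -err;
	}
	close(fd);

	printf("flags=0x%x interval=%u metadata=%u pi=%u\n",
	       (unsigned)cap.lbmd_flags, (unsigned)cap.lbmd_interval,
	       (unsigned)cap.lbmd_size, (unsigned)cap.lbmd_pi_size);
	return 0;
#else
	/* Headers predate the ioctl: nothing to query. */
	(void)path;
	return -EOPNOTSUPP;
#endif
}
```

A caller would pass a block device node and treat any negative return as
"no integrity information available".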

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.integrity

for you to fetch changes up to bc5b0c8febccbeabfefc9b59083b223ec7c7b53a:

  block: fix lbmd_guard_tag_type assignment in FS_IOC_GETLBMD_CAP (2025-07-23 14:55:51 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.integrity tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.integrity

----------------------------------------------------------------
Anuj Gupta (5):
      block: rename tuple_size field in blk_integrity to metadata_size
      block: introduce pi_tuple_size field in blk_integrity
      nvme: set pi_offset only when checksum type is not BLK_INTEGRITY_CSUM_NONE
      fs: add ioctl to query metadata and protection info capabilities
      block: fix lbmd_guard_tag_type assignment in FS_IOC_GETLBMD_CAP

Arnd Bergmann (1):
      block: fix FS_IOC_GETLBMD_CAP parsing in blkdev_common_ioctl()

Christian Brauner (1):
      Merge patch series "add ioctl to query metadata and protection info capabilities"

 block/bio-integrity-auto.c        |  4 +--
 block/blk-integrity.c             | 70 ++++++++++++++++++++++++++++++++++++++-
 block/blk-settings.c              | 44 ++++++++++++++++++++++--
 block/ioctl.c                     |  6 ++++
 block/t10-pi.c                    | 16 ++++-----
 drivers/md/dm-crypt.c             |  4 +--
 drivers/md/dm-integrity.c         | 12 +++----
 drivers/nvdimm/btt.c              |  2 +-
 drivers/nvme/host/core.c          |  7 ++--
 drivers/nvme/target/io-cmd-bdev.c |  2 +-
 drivers/scsi/sd_dif.c             |  3 +-
 include/linux/blk-integrity.h     | 11 ++++--
 include/linux/blkdev.h            |  3 +-
 include/uapi/linux/fs.h           | 59 +++++++++++++++++++++++++++++++++
 14 files changed, 213 insertions(+), 30 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 14/14 for v6.17] vfs iomap
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (5 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 11/14 for v6.17] vfs integrity Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-27 13:10   ` Sasha Levin
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 01/14 for v6.17] vfs misc Christian Brauner
                   ` (7 subsequent siblings)
  14 siblings, 2 replies; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains the iomap updates for this cycle:

- Refactor the iomap writeback code and split the generic and ioend/bio
  based writeback code. There are two methods that define the split
  between the generic writeback code and the implementation of it,
  and all knowledge of ioends and bios now sits below that layer.

- This series adds fuse iomap support for buffered writes and dirty
  folio writeback. This is needed so that granular uptodate and dirty
  tracking can be used in fuse when large folios are enabled. This has
  two big advantages. For writes, instead of the entire folio needing to
  be read into the page cache, only the relevant portions need to be.
  For writeback, only the dirty portions need to be written back instead
  of the entire folio.
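
To make the granular dirty tracking point above concrete, here is a small
userspace model. It is not the iomap code; the names folio_model,
model_mark_dirty() and model_next_dirty_range() and the 64-block limit are
invented for this sketch. It only shows the idea that a per-block dirty
bitmap lets writeback walk runs of dirty blocks instead of the whole folio.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_BLOCKS_PER_FOLIO 64	/* hypothetical: fits one uint64_t */

struct folio_model {
	uint64_t dirty;			/* bit i set => block i is dirty */
};

/* Mark @nr blocks starting at @first as dirty, clamped to the folio. */
void model_mark_dirty(struct folio_model *f, unsigned first, unsigned nr)
{
	for (unsigned i = first; i < first + nr && i < MODEL_BLOCKS_PER_FOLIO; i++)
		f->dirty |= UINT64_C(1) << i;
}

/*
 * Find the next run of dirty blocks at or after *pos. On success store
 * the first block in *start and the run length in *len, advance *pos
 * past the run and return true. Return false if no dirty block remains.
 */
bool model_next_dirty_range(const struct folio_model *f, unsigned *pos,
			    unsigned *start, unsigned *len)
{
	unsigned i = *pos;

	while (i < MODEL_BLOCKS_PER_FOLIO && !(f->dirty & (UINT64_C(1) << i)))
		i++;
	if (i >= MODEL_BLOCKS_PER_FOLIO)
		return false;

	*start = i;
	while (i < MODEL_BLOCKS_PER_FOLIO && (f->dirty & (UINT64_C(1) << i)))
		i++;
	*len = i - *start;
	*pos = i;
	return true;
}
```

A writeback loop in this model would call model_next_dirty_range() until
it returns false and submit one write per returned range.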

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

This contains a merge conflict with mainline that can be resolved as follows:

diff --cc fs/fuse/file.c
index 2ddfb3bb6483,f16426fd2bf5..000000000000
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.iomap

for you to fetch changes up to d5212d819e02313f27c867e6d365e71f1fdaaca4:

  Merge patch series "fuse: use iomap for buffered writes + writeback" (2025-07-17 09:55:23 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.iomap tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.iomap

----------------------------------------------------------------
Christian Brauner (2):
      Merge patch series "refactor the iomap writeback code v5"
      Merge patch series "fuse: use iomap for buffered writes + writeback"

Christoph Hellwig (11):
      iomap: header diet
      iomap: pass more arguments using the iomap writeback context
      iomap: refactor the writeback interface
      iomap: hide ioends from the generic writeback code
      iomap: move all ioend handling to ioend.c
      iomap: rename iomap_writepage_map to iomap_writeback_folio
      iomap: export iomap_writeback_folio
      iomap: replace iomap_folio_ops with iomap_write_ops
      iomap: improve argument passing to iomap_read_folio_sync
      iomap: add read_folio_range() handler for buffered writes
      iomap: build the writeback code without CONFIG_BLOCK

Joanne Koong (8):
      iomap: cleanup the pending writeback tracking in iomap_writepage_map_blocks
      iomap: add public helpers for uptodate state manipulation
      iomap: move folio_unlock out of iomap_writeback_folio
      fuse: use iomap for buffered writes
      fuse: use iomap for writeback
      fuse: use iomap for folio laundering
      fuse: hook into iomap for invalidating and checking partial uptodateness
      fuse: refactor writeback to use iomap_writepage_ctx inode

 Documentation/filesystems/iomap/design.rst     |   3 -
 Documentation/filesystems/iomap/operations.rst |  57 ++-
 block/fops.c                                   |  37 +-
 fs/fuse/Kconfig                                |   1 +
 fs/fuse/file.c                                 | 345 +++++++--------
 fs/gfs2/aops.c                                 |   8 +-
 fs/gfs2/bmap.c                                 |  48 ++-
 fs/gfs2/bmap.h                                 |   1 +
 fs/gfs2/file.c                                 |   3 +-
 fs/iomap/Makefile                              |   6 +-
 fs/iomap/buffered-io.c                         | 553 ++++++++-----------------
 fs/iomap/direct-io.c                           |   5 -
 fs/iomap/fiemap.c                              |   3 -
 fs/iomap/internal.h                            |   1 -
 fs/iomap/ioend.c                               | 220 +++++++++-
 fs/iomap/iter.c                                |   1 -
 fs/iomap/seek.c                                |   4 -
 fs/iomap/swapfile.c                            |   3 -
 fs/iomap/trace.c                               |   1 -
 fs/iomap/trace.h                               |   4 +-
 fs/xfs/xfs_aops.c                              | 212 ++++++----
 fs/xfs/xfs_file.c                              |   6 +-
 fs/xfs/xfs_iomap.c                             |  12 +-
 fs/xfs/xfs_iomap.h                             |   1 +
 fs/xfs/xfs_reflink.c                           |   3 +-
 fs/zonefs/file.c                               |  40 +-
 include/linux/iomap.h                          |  82 ++--
 27 files changed, 859 insertions(+), 801 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 01/14 for v6.17] vfs misc
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (6 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 14/14 for v6.17] vfs iomap Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 07/14 for v6.17] vfs mmap Christian Brauner
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains the usual selection of misc updates for this cycle.

Features:

- Add ext4 IOCB_DONTCACHE support

  This refactors the address_space_operations write_begin() and
  write_end() callbacks to take const struct kiocb * as their first
  argument, allowing IOCB flags such as IOCB_DONTCACHE to propagate to
  the filesystem's buffered I/O path.

  Ext4 is updated to implement handling of the IOCB_DONTCACHE flag and
  advertises support via the FOP_DONTCACHE file operation flag.

  Additionally, the i915 driver's shmem write paths are updated to
  bypass the legacy write_begin/write_end interface in favor of directly
  calling write_iter() with a constructed synchronous kiocb. Another
  i915 change replaces a manual write loop with kernel_write() during
  GEM shmem object creation.
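
From userspace, IOCB_DONTCACHE is requested per write through the
RWF_DONTCACHE flag of pwritev2(). That mapping is background knowledge
rather than something stated in this pull request. The sketch below uses
an invented helper, write_file_dontcache(), and falls back to a normal
buffered write when the kernel or filesystem rejects the flag with
EOPNOTSUPP.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080	/* fallback for older libc headers */
#endif

/*
 * Hypothetical helper: create or truncate @path and write @len bytes
 * from @buf at offset 0, asking for uncached buffered I/O.
 * Returns the number of bytes written, or -1 with errno set.
 */
ssize_t write_file_dontcache(const char *path, const void *buf, size_t len)
{
	struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
	ssize_t ret;
	int fd, err;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	ret = pwritev2(fd, &iov, 1, 0, RWF_DONTCACHE);
	if (ret < 0 && errno == EOPNOTSUPP)
		ret = pwritev2(fd, &iov, 1, 0, 0);	/* plain buffered write */

	err = errno;		/* preserve errno across close() */
	close(fd);
	errno = err;
	return ret;
}
```

The fallback keeps the caller working on filesystems that do not
advertise FOP_DONTCACHE, at the cost of leaving the data in the page
cache.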

Cleanups:

- don't duplicate vfs_open() in kernel_file_open()

- proc_fd_getattr(): don't bother with S_ISDIR() check

- fs/ecryptfs: replace snprintf with sysfs_emit in show function

- vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes()

- filelock: add new locks_wake_up_waiter() helper

- fs: Remove three arguments from block_write_end()

- VFS: change old_dir and new_dir in struct renamedata to dentrys

- netfs: Remove unused declaration netfs_queue_write_request()

Fixes:

- eventpoll: Fix semi-unbounded recursion

- eventpoll: fix sphinx documentation build warning

- fs/read_write: Fix spelling typo

- fs: annotate data race between poll_schedule_timeout() and pollwake()

- fs/pipe: set FMODE_NOWAIT in create_pipe_files()

- docs/vfs: update references to i_mutex to i_rwsem

- fs/buffer: remove comment about hard sectorsize

- fs/buffer: remove the min and max limit checks in __getblk_slow()

- fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable

- fs_context: fix parameter name in infofc() macro

- fs: Prevent file descriptor table allocations exceeding INT_MAX

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.misc

for you to fetch changes up to 4e8fc4f7208b032674ef8a4977b96484c328515c:

  netfs: Remove unused declaration netfs_queue_write_request() (2025-07-23 15:08:36 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.misc tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.misc

----------------------------------------------------------------
Al Viro (2):
      don't duplicate vfs_open() in kernel_file_open()
      proc_fd_getattr(): don't bother with S_ISDIR() check

Andy Shevchenko (1):
      fs/read_write: Fix spelling typo

Ankit Chauhan (1):
      fs/ecryptfs: replace snprintf with sysfs_emit in show function

Christian Brauner (1):
      Merge patch series "fs: refactor write_begin/write_end and add ext4 IOCB_DONTCACHE support"

Dmitry Antipov (1):
      fs: annotate suspected data race between poll_schedule_timeout() and pollwake()

Jan Kara (1):
      vfs: Remove unnecessary list_for_each_entry_safe() from evict_inodes()

Jann Horn (2):
      eventpoll: Fix semi-unbounded recursion
      eventpoll: fix sphinx documentation build warning

Jeff Layton (1):
      filelock: add new locks_wake_up_waiter() helper

Jens Axboe (1):
      fs/pipe: set FMODE_NOWAIT in create_pipe_files()

Junxuan Liao (1):
      docs/vfs: update references to i_mutex to i_rwsem

Matthew Wilcox (Oracle) (1):
      fs: Remove three arguments from block_write_end()

NeilBrown (1):
      VFS: change old_dir and new_dir in struct renamedata to dentrys

Pankaj Raghav (3):
      fs/buffer: remove comment about hard sectorsize
      fs/buffer: remove the min and max limit checks in __getblk_slow()
      fs/libfs: don't assume blocksize <= PAGE_SIZE in generic_check_addressable

RubenKelevra (1):
      fs_context: fix parameter name in infofc() macro

Sasha Levin (1):
      fs: Prevent file descriptor table allocations exceeding INT_MAX

Taotao Chen (5):
      drm/i915: Use kernel_write() in shmem object create
      drm/i915: Refactor shmem_pwrite() to use kiocb and write_iter
      fs: change write_begin/write_end interface to take struct kiocb *
      mm/pagemap: add write_begin_get_folio() helper function
      ext4: support uncached buffered I/O

Yue Haibing (1):
      netfs: Remove unused declaration netfs_queue_write_request()

 Documentation/filesystems/locking.rst     |   4 +-
 Documentation/filesystems/vfs.rst         |  11 +--
 block/fops.c                              |  15 ++--
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 115 ++++++++----------------------
 fs/adfs/inode.c                           |   9 +--
 fs/affs/file.c                            |  26 ++++---
 fs/attr.c                                 |  10 +--
 fs/bcachefs/fs-io-buffered.c              |   4 +-
 fs/bcachefs/fs-io-buffered.h              |   4 +-
 fs/bfs/file.c                             |   7 +-
 fs/buffer.c                               |  47 ++++++------
 fs/cachefiles/namei.c                     |   4 +-
 fs/ceph/addr.c                            |  10 ++-
 fs/dcache.c                               |  10 +--
 fs/direct-io.c                            |   8 +--
 fs/ecryptfs/inode.c                       |   4 +-
 fs/ecryptfs/main.c                        |   3 +-
 fs/ecryptfs/mmap.c                        |  10 +--
 fs/eventpoll.c                            |  58 +++++++++++----
 fs/exfat/file.c                           |  11 ++-
 fs/exfat/inode.c                          |  16 +++--
 fs/ext2/dir.c                             |   2 +-
 fs/ext2/inode.c                           |  11 +--
 fs/ext4/file.c                            |   3 +-
 fs/ext4/inode.c                           |  35 ++++-----
 fs/f2fs/data.c                            |   8 ++-
 fs/fat/inode.c                            |  18 ++---
 fs/file.c                                 |  15 ++++
 fs/fuse/file.c                            |  14 ++--
 fs/hfs/hfs_fs.h                           |   2 +-
 fs/hfs/inode.c                            |   4 +-
 fs/hfsplus/hfsplus_fs.h                   |   6 +-
 fs/hfsplus/inode.c                        |   8 ++-
 fs/hostfs/hostfs_kern.c                   |   8 ++-
 fs/hpfs/file.c                            |  18 ++---
 fs/hugetlbfs/inode.c                      |   9 +--
 fs/inode.c                                |  13 ++--
 fs/iomap/buffered-io.c                    |   3 +-
 fs/jffs2/file.c                           |  28 ++++----
 fs/jfs/inode.c                            |  16 +++--
 fs/libfs.c                                |  26 ++++---
 fs/locks.c                                |   4 +-
 fs/minix/dir.c                            |   2 +-
 fs/minix/inode.c                          |   7 +-
 fs/namei.c                                |  29 ++++----
 fs/namespace.c                            |   2 +-
 fs/nfs/file.c                             |   8 ++-
 fs/nfsd/vfs.c                             |   7 +-
 fs/nilfs2/dir.c                           |   2 +-
 fs/nilfs2/inode.c                         |   8 ++-
 fs/nilfs2/recovery.c                      |   3 +-
 fs/ntfs3/file.c                           |   4 +-
 fs/ntfs3/inode.c                          |   7 +-
 fs/ntfs3/ntfs_fs.h                        |  10 +--
 fs/ocfs2/aops.c                           |   6 +-
 fs/omfs/file.c                            |   7 +-
 fs/open.c                                 |   5 +-
 fs/orangefs/inode.c                       |  16 +++--
 fs/overlayfs/copy_up.c                    |   6 +-
 fs/overlayfs/dir.c                        |  16 ++---
 fs/overlayfs/overlayfs.h                  |  16 ++---
 fs/overlayfs/readdir.c                    |   2 +-
 fs/overlayfs/super.c                      |   2 +-
 fs/overlayfs/util.c                       |   2 +-
 fs/pipe.c                                 |   8 ++-
 fs/proc/fd.c                              |  11 +--
 fs/read_write.c                           |   2 +-
 fs/select.c                               |   4 +-
 fs/smb/server/vfs.c                       |   4 +-
 fs/stack.c                                |   4 +-
 fs/ubifs/file.c                           |   8 ++-
 fs/udf/inode.c                            |  11 +--
 fs/ufs/dir.c                              |   2 +-
 fs/ufs/inode.c                            |  16 +++--
 fs/vboxsf/file.c                          |   5 +-
 fs/xattr.c                                |   2 +-
 include/linux/buffer_head.h               |   8 +--
 include/linux/exportfs.h                  |   4 +-
 include/linux/filelock.h                  |   7 +-
 include/linux/fs.h                        |  25 +++----
 include/linux/fs_context.h                |   2 +-
 include/linux/fs_stack.h                  |   2 +-
 include/linux/netfs.h                     |   1 -
 include/linux/pagemap.h                   |  27 +++++++
 include/linux/quotaops.h                  |   2 +-
 io_uring/openclose.c                      |   2 -
 mm/filemap.c                              |   4 +-
 mm/shmem.c                                |  12 ++--
 88 files changed, 520 insertions(+), 457 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 07/14 for v6.17] vfs mmap
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (7 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 01/14 for v6.17] vfs misc Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 04/14 for v6.17] namespace updates Christian Brauner
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
Last cycle we introduced f_op->mmap_prepare() in c84bf6dd2b83 ("mm:
introduce new .mmap_prepare() file callback").

This is preferred to the existing f_op->mmap() hook as it does not
require a VMA to be established yet, thus allowing the mmap logic to
invoke this hook far, far earlier, prior to inserting a VMA into the
virtual address space, or performing any other heavy handed operations.

This allows for much simpler unwinding on error, and for there to be a
single attempt at merging a VMA rather than having to possibly reattempt
a merge based on potentially altered VMA state.

Far more importantly, it prevents inappropriate manipulation of
incompletely initialised VMA state, which is something that has been the
cause of bugs and complexity in the past.

The intent is to gradually deprecate f_op->mmap, and in that vein this
series converts the majority of file systems to using f_op->mmap_prepare.

Prerequisite steps are taken - firstly ensuring all checks for mmap
capabilities use the file_has_valid_mmap_hooks() helper rather than
directly checking for f_op->mmap (which is now not a valid check) and
secondly updating daxdev_mapping_supported() to not require a VMA
parameter to allow ext4 and xfs to be converted.
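
To show why a bare f_op->mmap check is no longer valid, here is a
userspace model of the hook check. The struct and function names are
stand-ins invented for this sketch, not the kernel definitions, and it
assumes the two hooks are mutually exclusive, which is my reading of the
helper rather than something spelled out above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Opaque stand-ins for the kernel types; never dereferenced here. */
struct file;
struct vm_area_struct;
struct vm_area_desc;

/* Simplified model of the two mmap-related file operations. */
struct file_operations_model {
	int (*mmap)(struct file *, struct vm_area_struct *);
	int (*mmap_prepare)(struct vm_area_desc *);
};

/* Old-style check: misses files that only provide .mmap_prepare. */
bool model_can_mmap_legacy(const struct file_operations_model *fop)
{
	return fop->mmap != NULL;
}

/* Helper-style check: exactly one of the two hooks must be set. */
bool model_has_valid_mmap_hooks(const struct file_operations_model *fop)
{
	bool has_mmap = fop->mmap != NULL;
	bool has_mmap_prepare = fop->mmap_prepare != NULL;

	if (has_mmap && has_mmap_prepare)
		return false;
	return has_mmap || has_mmap_prepare;
}

/* Dummy hooks so the model can be exercised. */
int model_mmap(struct file *file, struct vm_area_struct *vma)
{
	(void)file;
	(void)vma;
	return 0;
}

int model_mmap_prepare(struct vm_area_desc *desc)
{
	(void)desc;
	return 0;
}
```

In this model a converted filesystem, which sets only .mmap_prepare,
fails the legacy check but passes the helper-style one.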

Commit bb666b7c2707 ("mm: add mmap_prepare() compatibility layer for
nested file systems") handles the nasty edge-case of nested file systems
like overlayfs, which introduces a compatibility shim to allow
f_op->mmap_prepare() to be invoked from an f_op->mmap() callback.

This allows for nested filesystems to continue to function correctly
with all file systems regardless of which callback is used. Once we
finally convert all file systems, this shim can be removed.

As a result, ecryptfs, fuse, and overlayfs remain unaltered so they can
nest all other file systems.

We additionally do not update resctrl, as this requires an update to
remap_pfn_range() (or an alternative to it) which we defer to a later
series. Equally, we do not update cramfs, which needs a mixed mapping
insertion with the same issue, nor do we update procfs, hugetlbfs, sysfs
or kernfs, all of which require VMAs for internal state and hooks. We
shall return to all of these later.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

This will have a merge conflict with mainline that can be resolved as follows:

diff --cc Documentation/filesystems/porting.rst
index 200226bfd6cf,48fff4c407f3..000000000000
--- a/Documentation/filesystems/porting.rst
+++ b/Documentation/filesystems/porting.rst
@@@ -1249,9 -1252,12 +1249,21 @@@ an extra reference to new mount - it sh

  ---

 +collect_mounts()/drop_collected_mounts()/iterate_mounts() are gone now.
 +Replacement is collect_paths()/drop_collected_path(), with no special
 +iterator needed.  Instead of a cloned mount tree, the new interface returns
 +an array of struct path, one for each mount collect_mounts() would've
 +created.  These struct path point to locations in the caller's namespace
 +that would be roots of the cloned mounts.
++
++---
++
+ **highly recommended**
+
+ The file operations mmap() callback is deprecated in favour of
+ mmap_prepare(). This passes a pointer to a vm_area_desc to the callback
+ rather than a VMA, as the VMA at this stage is not yet valid.
+
+ The vm_area_desc provides the minimum required information for a filesystem
+ to initialise state upon memory mapping of a file-backed region, and output
+ parameters for the file system to set this state.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit e04c78d86a9699d136910cfc0bdcf01087e3267e:

  Linux 6.16-rc2 (2025-06-15 13:49:41 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.mmap_prepare

for you to fetch changes up to 425c8bb39b032bfb338857476eff5bbee324343e:

  doc: update porting, vfs documentation to describe mmap_prepare() (2025-07-23 15:09:14 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.mmap_prepare tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.mmap_prepare

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "convert the majority of file systems to mmap_prepare"

Lorenzo Stoakes (11):
      mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepare
      mm/nommu: use file_has_valid_mmap_hooks() helper
      fs: consistently use can_mmap_file() helper
      fs/dax: make it possible to check dev dax support without a VMA
      fs/ext4: transition from deprecated .mmap hook to .mmap_prepare
      fs/xfs: transition from deprecated .mmap hook to .mmap_prepare
      mm/filemap: introduce generic_file_*_mmap_prepare() helpers
      fs: convert simple use of generic_file_*_mmap() to .mmap_prepare()
      fs: convert most other generic_file_*mmap() users to .mmap_prepare()
      fs: replace mmap hook with .mmap_prepare for simple mappings
      doc: update porting, vfs documentation to describe mmap_prepare()

 Documentation/filesystems/porting.rst      | 12 +++++++++++
 Documentation/filesystems/vfs.rst          | 22 +++++++++++++++----
 block/fops.c                               | 12 +++++------
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c |  2 +-
 fs/9p/vfs_file.c                           | 13 ++++++------
 fs/adfs/file.c                             |  2 +-
 fs/affs/file.c                             |  2 +-
 fs/afs/file.c                              | 12 +++++------
 fs/aio.c                                   |  8 +++----
 fs/backing-file.c                          |  4 ++--
 fs/bcachefs/fs.c                           |  8 +++----
 fs/bfs/file.c                              |  2 +-
 fs/binfmt_elf.c                            |  4 ++--
 fs/binfmt_elf_fdpic.c                      |  2 +-
 fs/btrfs/file.c                            |  7 +++---
 fs/ceph/addr.c                             |  6 +++---
 fs/ceph/file.c                             |  2 +-
 fs/ceph/super.h                            |  2 +-
 fs/coda/file.c                             |  6 +++---
 fs/ecryptfs/file.c                         |  2 +-
 fs/erofs/data.c                            | 16 +++++++-------
 fs/exfat/file.c                            | 10 +++++----
 fs/ext2/file.c                             | 12 ++++++-----
 fs/ext4/file.c                             | 13 ++++++------
 fs/f2fs/file.c                             |  7 +++---
 fs/fat/file.c                              |  2 +-
 fs/hfs/inode.c                             |  2 +-
 fs/hfsplus/inode.c                         |  2 +-
 fs/hostfs/hostfs_kern.c                    |  2 +-
 fs/hpfs/file.c                             |  2 +-
 fs/jffs2/file.c                            |  2 +-
 fs/jfs/file.c                              |  2 +-
 fs/minix/file.c                            |  2 +-
 fs/nfs/file.c                              | 13 ++++++------
 fs/nfs/internal.h                          |  2 +-
 fs/nfs/nfs4file.c                          |  2 +-
 fs/nilfs2/file.c                           |  8 +++----
 fs/ntfs3/file.c                            | 15 +++++++------
 fs/ocfs2/file.c                            |  4 ++--
 fs/ocfs2/mmap.c                            |  5 +++--
 fs/ocfs2/mmap.h                            |  2 +-
 fs/omfs/file.c                             |  2 +-
 fs/orangefs/file.c                         | 10 +++++----
 fs/ramfs/file-mmu.c                        |  2 +-
 fs/ramfs/file-nommu.c                      | 12 +++++------
 fs/read_write.c                            |  2 +-
 fs/romfs/mmap-nommu.c                      |  6 +++---
 fs/smb/client/cifsfs.c                     | 12 +++++------
 fs/smb/client/cifsfs.h                     |  4 ++--
 fs/smb/client/file.c                       | 16 +++++++-------
 fs/ubifs/file.c                            | 10 ++++-----
 fs/ufs/file.c                              |  2 +-
 fs/vboxsf/file.c                           |  8 +++----
 fs/xfs/xfs_file.c                          | 15 +++++++------
 fs/zonefs/file.c                           | 10 +++++----
 include/linux/dax.h                        | 16 ++++++++------
 include/linux/fs.h                         | 13 ++++++------
 ipc/shm.c                                  |  2 +-
 mm/filemap.c                               | 29 +++++++++++++++++++++++++
 mm/internal.h                              |  2 +-
 mm/mmap.c                                  |  2 +-
 mm/nommu.c                                 |  2 +-
 mm/vma.c                                   |  2 +-
 tools/testing/vma/vma_internal.h           | 34 ++++++++++++++++++++++++------
 64 files changed, 281 insertions(+), 187 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 04/14 for v6.17] namespace updates
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (8 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 07/14 for v6.17] vfs mmap Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 03/14 for v6.17] overlayfs Christian Brauner
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains namespace updates. This time specifically for nsfs:

- Userspace heavily relies on the root inode numbers for namespaces to
  identify the initial namespaces. That's already a hard dependency. So
  we cannot change that anymore. Move the initial inode numbers to a
  public header and align the only two namespaces that currently don't
  do that with all the other namespaces.

- The root inode of /proc having a fixed inode number has been part of
  the core kernel ABI since its inception, and recently some userspace
  programs (mainly container runtimes) have started to explicitly depend
  on this behaviour.

  The main reason this is useful to userspace is that by checking that a
  suspect /proc handle has fstype PROC_SUPER_MAGIC and is
  PROCFS_ROOT_INO, they can then use
  openat2(RESOLVE_{NO_{XDEV,MAGICLINK},BENEATH}) to ensure that there
  isn't a bind-mount that replaces some procfs file with a different
  one. This kind of attack has led to security issues in container
  runtimes in the past (such as CVE-2019-19921) and libraries like
  libpathrs[1] use this feature of procfs to provide safe procfs
  handling functions.
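
As an illustration of the check described above (not code from this
pull request), a userspace program could verify a suspect /proc handle
roughly as follows. The helper name and the local constant are
hypothetical; the constant assumes the procfs root inode number is 1,
which is the value exported as PROCFS_ROOT_INO.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/vfs.h>
#include <unistd.h>
#include <linux/magic.h>	/* PROC_SUPER_MAGIC */

/*
 * Assumption: the procfs root inode number is 1. A local name is used
 * so the sketch also builds against headers that predate the export
 * of PROCFS_ROOT_INO.
 */
#define EXAMPLE_PROCFS_ROOT_INO 1

/* Returns 1 if fd is the root of a procfs mount, 0 if not, -1 on error. */
static int example_is_procfs_root(int fd)
{
	struct statfs sfs;
	struct stat st;

	if (fstatfs(fd, &sfs) < 0 || fstat(fd, &st) < 0)
		return -1;

	/* Reject anything that is not procfs at all. */
	if (sfs.f_type != PROC_SUPER_MAGIC)
		return 0;

	/* A procfs subdirectory or file has a different inode number. */
	return st.st_ino == (ino_t)EXAMPLE_PROCFS_ROOT_INO;
}
```

A caller would typically open "/proc" with O_PATH | O_DIRECTORY, run
this check, and only then resolve paths beneath the handle with the
openat2() restrictions mentioned above.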

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.nsfs

for you to fetch changes up to 76fdb7eb4e1c91086ce9c3db6972c2ed48c96afb:

  uapi: export PROCFS_ROOT_INO (2025-07-10 09:39:18 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.nsfs tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.nsfs

----------------------------------------------------------------
Aleksa Sarai (1):
      uapi: export PROCFS_ROOT_INO

Christian Brauner (4):
      nsfs: move root inode number to uapi
      netns: use stable inode number for initial mount ns
      mntns: use stable inode number for initial mount ns
      Merge patch series "nsfs: expose the stable inode numbers in a public header"

 fs/namespace.c            |  4 +++-
 fs/proc/root.c            | 10 +++++-----
 include/linux/proc_ns.h   | 16 +++++++++-------
 include/uapi/linux/fs.h   | 11 +++++++++++
 include/uapi/linux/nsfs.h | 11 +++++++++++
 net/core/net_namespace.c  |  8 ++++++++
 6 files changed, 47 insertions(+), 13 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 03/14 for v6.17] overlayfs
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (9 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 04/14 for v6.17] namespace updates Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 08/14 for v6.17] vfs pidfs Christian Brauner
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains overlayfs updates for this cycle. Note that some of the
changes depend on parts of the vfs misc pull request this cycle.

They're shown in the diffstat for clarity but will obviously be already
included in the vfs misc pull request that I'm pretty sure you're going
to merge before anyway.

The changes for overlayfs in here are primarily focussed on preparing
for some proposed changes to directory locking.

Overlayfs currently will sometimes lock a directory on the upper
filesystem and do a few different things while holding the lock. This is
incompatible with the new potential scheme.

This series narrows the region of code protected by the directory lock,
taking it multiple times when necessary. This theoretically opens up
the possibility of other changes happening on the upper filesystem between
the unlock and the lock. To some extent the patches guard against that
by checking the dentries still have the expected parent after retaking the
lock. In general, concurrent changes to the upper and lower filesystems
aren't supported properly anyway.
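
The "retake the lock, then recheck the parent" idea can be shown with a
small userspace model. This is only an analogy under stated
assumptions: the toy_* types and helpers are hypothetical stand-ins,
not the overlayfs code, and the "lock" is a plain flag rather than a
real kernel lock.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy stand-ins for a directory and an entry; not kernel types. */
struct toy_dir {
	bool locked;
};

struct toy_entry {
	struct toy_dir *parent;
};

static void toy_lock(struct toy_dir *dir)
{
	dir->locked = true;
}

static void toy_unlock(struct toy_dir *dir)
{
	dir->locked = false;
}

/*
 * Retake the directory lock and confirm that the entry still lives in
 * that directory. On a mismatch the lock is dropped again and -EINVAL
 * is returned, so the caller never proceeds on a stale parent.
 */
static int toy_parent_lock(struct toy_dir *dir, struct toy_entry *entry)
{
	toy_lock(dir);
	if (entry->parent == dir)
		return 0;
	toy_unlock(dir);
	return -EINVAL;
}
```

The point of the pattern is that work done between the unlock and the
relock is only trusted after the parent check succeeds.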

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.ovl

for you to fetch changes up to 672820a070ea5e6ae114f6109726a4e18313a527:

  ovl: properly print correct variable (2025-07-25 10:20:36 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.ovl tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.ovl

----------------------------------------------------------------
Al Viro (2):
      don't duplicate vfs_open() in kernel_file_open()
      proc_fd_getattr(): don't bother with S_ISDIR() check

Amir Goldstein (3):
      fs: constify file ptr in backing_file accessor helpers
      ovl: remove unneeded non-const conversion
      ovl: support layers on case-folding capable filesystems

Andy Shevchenko (1):
      fs/read_write: Fix spelling typo

Antonio Quartulli (1):
      ovl: properly print correct variable

Christian Brauner (2):
      Merge patch series "backing_file accessors cleanup"
      Merge patch series "ovl: narrow regions protected by i_rw_sem"

Jeff Layton (1):
      filelock: add new locks_wake_up_waiter() helper

Jens Axboe (1):
      fs/pipe: set FMODE_NOWAIT in create_pipe_files()

NeilBrown (22):
      VFS: change old_dir and new_dir in struct renamedata to dentrys
      ovl: simplify an error path in ovl_copy_up_workdir()
      ovl: change ovl_create_index() to take dir locks
      ovl: Call ovl_create_temp() without lock held.
      ovl: narrow the locked region in ovl_copy_up_workdir()
      ovl: narrow locking in ovl_create_upper()
      ovl: narrow locking in ovl_clear_empty()
      ovl: narrow locking in ovl_create_over_whiteout()
      ovl: simplify gotos in ovl_rename()
      ovl: narrow locking in ovl_rename()
      ovl: narrow locking in ovl_cleanup_whiteouts()
      ovl: narrow locking in ovl_cleanup_index()
      ovl: narrow locking in ovl_workdir_create()
      ovl: narrow locking in ovl_indexdir_cleanup()
      ovl: narrow locking in ovl_workdir_cleanup_recurse()
      ovl: change ovl_workdir_cleanup() to take dir lock as needed.
      ovl: narrow locking on ovl_remove_and_whiteout()
      ovl: change ovl_cleanup_and_whiteout() to take rename lock as needed
      ovl: narrow locking in ovl_whiteout()
      ovl: narrow locking in ovl_check_rename_whiteout()
      ovl: change ovl_create_real() to receive dentry parent
      ovl: rename ovl_cleanup_unlocked() to ovl_cleanup()

 fs/backing-file.c        |   4 +-
 fs/cachefiles/namei.c    |   4 +-
 fs/ecryptfs/inode.c      |   4 +-
 fs/file_table.c          |  13 ++-
 fs/internal.h            |   1 +
 fs/locks.c               |   2 +-
 fs/namei.c               |   7 +-
 fs/nfsd/vfs.c            |   7 +-
 fs/open.c                |   5 +-
 fs/overlayfs/copy_up.c   |  52 +++++-----
 fs/overlayfs/dir.c       | 260 +++++++++++++++++++++++++----------------------
 fs/overlayfs/file.c      |   2 +-
 fs/overlayfs/namei.c     |  31 +++++-
 fs/overlayfs/overlayfs.h |  45 +++++---
 fs/overlayfs/ovl_entry.h |   1 +
 fs/overlayfs/params.c    |  12 +--
 fs/overlayfs/readdir.c   |  44 ++++----
 fs/overlayfs/super.c     |  50 ++++-----
 fs/overlayfs/util.c      |  46 ++++++---
 fs/pipe.c                |   8 +-
 fs/proc/fd.c             |  11 +-
 fs/read_write.c          |   2 +-
 fs/smb/server/vfs.c      |   4 +-
 include/linux/filelock.h |   7 +-
 include/linux/fs.h       |  14 +--
 io_uring/openclose.c     |   2 -
 26 files changed, 353 insertions(+), 285 deletions(-)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 08/14 for v6.17] vfs pidfs
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (10 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 03/14 for v6.17] overlayfs Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 10/14 for v6.17] vfs rust Christian Brauner
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains updates for pidfs:

- persistent info

  Persist exit and coredump information independent of whether anyone
  currently holds a pidfd for the struct pid.

  The current scheme allocates pidfs dentries on-demand repeatedly. This
  scheme is reaching its limits as it makes it impossible to pin
  information that needs to be available after the task has exited or
  coredumped and that should not be lost simply because the pidfd got
  closed temporarily. The next opener should still see the stashed
  information.

  This is also a prerequisite for supporting extended attributes on
  pidfds to allow attaching meta information to them.

  If someone opens a pidfd for a struct pid a pidfs dentry is allocated
  and stashed in pid->stashed. Once the last pidfd for the struct pid is
  closed the pidfs dentry is released and removed from pid->stashed.

  So if 10 callers create a pidfs dentry for the same struct pid
  sequentially, i.e., each closing the pidfd before the other creates a
  new one then a new pidfs dentry is allocated every time.

  Because multiple tasks acquiring and releasing a pidfd for the same
  struct pid can race with one another, a task may still find a valid
  pidfs entry from the previous task in pid->stashed and reuse it. Or it
  might find a dead dentry in there and fail to reuse it and so stashes
  a new pidfs dentry. Multiple tasks may race to stash a new pidfs
  dentry but only one will succeed, the other ones will put their
  dentry.

  The current scheme aims to ensure that a pidfs dentry for a struct pid
  can only be created if the task is still alive or if a pidfs dentry
  already existed before the task was reaped and so exit information has
  been stashed in the pidfs inode.

  That's great except that it's buggy. If a pidfs dentry is stashed in
  pid->stashed after pidfs_exit() but before __unhash_process() is
  called we will return a pidfd for a reaped task without exit
  information being available.

  The pidfs_pid_valid() check does not guard against this race as it
  doesn't sync at all with pidfs_exit(). The pid_has_task() check might
  be successful simply because we're before __unhash_process() but after
  pidfs_exit().

  Introduce a new scheme where the lifetime of information associated
  with a pidfs entry (coredump and exit information) isn't bound to the
  lifetime of the pidfs inode but the struct pid itself.

  The first time a pidfs dentry is allocated for a struct pid a struct
  pidfs_attr will be allocated which will be used to store exit and
  coredump information.

  If all pidfds for the pidfs dentry are closed the dentry and inode can
  be cleaned up but the struct pidfs_attr will stick until the struct
  pid itself is freed. This will ensure minimal memory usage while
  persisting relevant information.

  The new scheme has various advantages. First, it allows us to close the
  race where we end up handing out a pidfd for a reaped task for which
  no exit information is available. Second, it minimizes memory usage.
  Third, it allows us to remove complex lifetime tracking via dentries when
  registering a struct pid with pidfs. There's no need to get or put a
  reference. Instead, the lifetime of exit and coredump information
  associated with a struct pid is bound to the lifetime of struct pid
  itself.

- extended attributes

  Now that we have a way to persist information for pidfs dentries we
  can start supporting extended attributes on pidfds. This will allow
  userspace to attach meta information to tasks.

  One natural extension would be to introduce a custom pidfs.* extended
  attribute space and allow for the inheritance of extended attributes
  across fork() and exec().

  The first simple scheme will allow privileged userspace to set trusted
  extended attributes on pidfs inodes.

- Allow autonomous pidfs file handles

  Various filesystems such as pidfs and drm support opening file handles
  without having to require a file descriptor to identify the
  filesystem. These filesystems are global single instances and can be
  trivially identified solely on the information encoded in the file
  handle.

  This makes it possible to not have to keep or acquire a sentinel file
  descriptor just to pass it to open_by_handle_at() to identify the
  filesystem. That's especially useful when such sentinel file
  descriptor cannot or should not be acquired.

  For pidfs this means a file handle can function as a full replacement
  for storing a pid in a file. Instead a file handle can be stored and
  reopened purely based on the file handle.

  Such autonomous file handles can be opened with or without specifying
  a file descriptor. If no proper file descriptor is used the
  FD_PIDFS_ROOT sentinel must be passed. This allows us to define
  further special negative fd sentinels in the future.

  Userspace can trivially test for support by trying to open the file
  handle with an invalid file descriptor.

- Allow pidfds for reaped tasks with SCM_PIDFD messages

  This is a logical continuation of the earlier work to create pidfds
  for reaped tasks through the SO_PEERPIDFD socket option merged in
  923ea4d4482b ("Merge patch series "net, pidfs: enable handing out
  pidfds for reaped sk->sk_peer_pid"").

- Two minor fixes:

  * Fold fs_struct->{lock,seq} into a seqlock

  * Don't bother with path_{get,put}() in unix_open_file()
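
To illustrate the autonomous pidfs file handles item above, the
following userspace sketch (not taken from this pull request) encodes
a pidfd as a file handle and reopens it without a real directory file
descriptor. The helper name is hypothetical, and the fallback value
for FD_PIDFS_ROOT is an assumption for older headers that should be
checked against include/uapi/linux/fcntl.h.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Assumption: fallback value for headers that predate the sentinel.
 * Verify it against the installed uapi headers before relying on it.
 */
#ifndef FD_PIDFS_ROOT
#define FD_PIDFS_ROOT -10002
#endif

/*
 * Encode a pidfd for the calling process as a file handle, then reopen
 * it from the handle alone. Returns 0 on success or a negative errno.
 * Kernels without this feature are expected to fail the reopen step.
 */
static int example_reopen_self_by_handle(void)
{
	struct file_handle *fh;
	int pidfd, fd, mnt_id, ret = 0;

	pidfd = (int)syscall(SYS_pidfd_open, getpid(), 0);
	if (pidfd < 0)
		return -errno;

	fh = malloc(sizeof(*fh) + MAX_HANDLE_SZ);
	if (!fh) {
		close(pidfd);
		return -ENOMEM;
	}
	fh->handle_bytes = MAX_HANDLE_SZ;

	/* Turn the pidfd into a storable file handle. */
	if (name_to_handle_at(pidfd, "", fh, &mnt_id, AT_EMPTY_PATH) < 0) {
		ret = -errno;
		goto out;
	}

	/* No mount fd needed: the sentinel identifies pidfs itself. */
	fd = open_by_handle_at(FD_PIDFS_ROOT, fh, O_RDONLY | O_CLOEXEC);
	if (fd < 0)
		ret = -errno;
	else
		close(fd);
out:
	free(fh);
	close(pidfd);
	return ret;
}
```

In a real program the handle bytes would be written to storage in
place of a pid and decoded again later; the sketch keeps both steps in
one function only to stay short.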

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

This will have a merge conflict with the vfs coredump pull request for this
cycle which can be resolved as follows. This is a bit larger than usual. Feel
free to ask me to resend a combined pull request of pidfs and coredumping:

diff --cc fs/coredump.c
index fadf9d4be2e1,55d6a713a0fb..000000000000
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@@ -662,439 -632,8 +662,433 @@@ static int umh_coredump_setup(struct su
  	return 0;
  }
  
 -void do_coredump(const kernel_siginfo_t *siginfo)
 +#ifdef CONFIG_UNIX
 +static bool coredump_sock_connect(struct core_name *cn, struct coredump_params *cprm)
 +{
 +	struct file *file __free(fput) = NULL;
 +	struct sockaddr_un addr = {
 +		.sun_family = AF_UNIX,
 +	};
 +	ssize_t addr_len;
 +	int retval;
 +	struct socket *socket;
 +
 +	addr_len = strscpy(addr.sun_path, cn->corename);
 +	if (addr_len < 0)
 +		return false;
 +	addr_len += offsetof(struct sockaddr_un, sun_path) + 1;
 +
 +	/*
 +	 * It is possible that the userspace process which is supposed
 +	 * to handle the coredump and is listening on the AF_UNIX socket
 +	 * coredumps. Userspace should just mark itself non dumpable.
 +	 */
 +
 +	retval = sock_create_kern(&init_net, AF_UNIX, SOCK_STREAM, 0, &socket);
 +	if (retval < 0)
 +		return false;
 +
 +	file = sock_alloc_file(socket, 0, NULL);
 +	if (IS_ERR(file))
 +		return false;
 +
 +	/*
 +	 * Set the thread-group leader pid which is used for the peer
 +	 * credentials during connect() below. Then immediately register
 +	 * it in pidfs...
 +	 */
 +	cprm->pid = task_tgid(current);
 +	retval = pidfs_register_pid(cprm->pid);
 +	if (retval)
 +		return false;
 +
 +	/*
 +	 * ... and set the coredump information so userspace has it
 +	 * available after connect()...
 +	 */
 +	pidfs_coredump(cprm);
 +
 +	retval = kernel_connect(socket, (struct sockaddr *)(&addr), addr_len,
 +				O_NONBLOCK | SOCK_COREDUMP);
- 	/*
- 	 * ... Make sure to only put our reference after connect() took
- 	 * its own reference keeping the pidfs entry alive ...
- 	 */
- 	pidfs_put_pid(cprm->pid);
- 
 +	if (retval) {
 +		if (retval == -EAGAIN)
 +			coredump_report_failure("Coredump socket %s receive queue full", addr.sun_path);
 +		else
 +			coredump_report_failure("Coredump socket connection %s failed %d", addr.sun_path, retval);
 +		return false;
 +	}
 +
 +	/* ... and validate that @sk_peer_pid matches @cprm.pid. */
 +	if (WARN_ON_ONCE(unix_peer(socket->sk)->sk_peer_pid != cprm->pid))
 +		return false;
 +
 +	cprm->limit = RLIM_INFINITY;
 +	cprm->file = no_free_ptr(file);
 +
 +	return true;
 +}
 +
 +static inline bool coredump_sock_recv(struct file *file, struct coredump_ack *ack, size_t size, int flags)
 +{
 +	struct msghdr msg = {};
 +	struct kvec iov = { .iov_base = ack, .iov_len = size };
 +	ssize_t ret;
 +
 +	memset(ack, 0, size);
 +	ret = kernel_recvmsg(sock_from_file(file), &msg, &iov, 1, size, flags);
 +	return ret == size;
 +}
 +
 +static inline bool coredump_sock_send(struct file *file, struct coredump_req *req)
 +{
 +	struct msghdr msg = { .msg_flags = MSG_NOSIGNAL };
 +	struct kvec iov = { .iov_base = req, .iov_len = sizeof(*req) };
 +	ssize_t ret;
 +
 +	ret = kernel_sendmsg(sock_from_file(file), &msg, &iov, 1, sizeof(*req));
 +	return ret == sizeof(*req);
 +}
 +
 +static_assert(sizeof(enum coredump_mark) == sizeof(__u32));
 +
 +static inline bool coredump_sock_mark(struct file *file, enum coredump_mark mark)
 +{
 +	struct msghdr msg = { .msg_flags = MSG_NOSIGNAL };
 +	struct kvec iov = { .iov_base = &mark, .iov_len = sizeof(mark) };
 +	ssize_t ret;
 +
 +	ret = kernel_sendmsg(sock_from_file(file), &msg, &iov, 1, sizeof(mark));
 +	return ret == sizeof(mark);
 +}
 +
 +static inline void coredump_sock_wait(struct file *file)
 +{
 +	ssize_t n;
 +
 +	/*
 +	 * We use a simple read to wait for the coredump processing to
 +	 * finish. Either the socket is closed or we get sent unexpected
 +	 * data. In both cases, we're done.
 +	 */
 +	n = __kernel_read(file, &(char){ 0 }, 1, NULL);
 +	if (n > 0)
 +		coredump_report_failure("Coredump socket had unexpected data");
 +	else if (n < 0)
 +		coredump_report_failure("Coredump socket failed");
 +}
 +
 +static inline void coredump_sock_shutdown(struct file *file)
 +{
 +	struct socket *socket;
 +
 +	socket = sock_from_file(file);
 +	if (!socket)
 +		return;
 +
 +	/* Let userspace know we're done processing the coredump. */
 +	kernel_sock_shutdown(socket, SHUT_WR);
 +}
 +
 +static bool coredump_sock_request(struct core_name *cn, struct coredump_params *cprm)
 +{
 +	struct coredump_req req = {
 +		.size		= sizeof(struct coredump_req),
 +		.mask		= COREDUMP_KERNEL | COREDUMP_USERSPACE |
 +				  COREDUMP_REJECT | COREDUMP_WAIT,
 +		.size_ack	= sizeof(struct coredump_ack),
 +	};
 +	struct coredump_ack ack = {};
 +	ssize_t usize;
 +
 +	if (cn->core_type != COREDUMP_SOCK_REQ)
 +		return true;
 +
 +	/* Let userspace know what we support. */
 +	if (!coredump_sock_send(cprm->file, &req))
 +		return false;
 +
 +	/* Peek the size of the coredump_ack. */
 +	if (!coredump_sock_recv(cprm->file, &ack, sizeof(ack.size),
 +				MSG_PEEK | MSG_WAITALL))
 +		return false;
 +
 +	/* Refuse unknown coredump_ack sizes. */
 +	usize = ack.size;
 +	if (usize < COREDUMP_ACK_SIZE_VER0) {
 +		coredump_sock_mark(cprm->file, COREDUMP_MARK_MINSIZE);
 +		return false;
 +	}
 +
 +	if (usize > sizeof(ack)) {
 +		coredump_sock_mark(cprm->file, COREDUMP_MARK_MAXSIZE);
 +		return false;
 +	}
 +
 +	/* Now retrieve the coredump_ack. */
 +	if (!coredump_sock_recv(cprm->file, &ack, usize, MSG_WAITALL))
 +		return false;
 +	if (ack.size != usize)
 +		return false;
 +
 +	/* Refuse unknown coredump_ack flags. */
 +	if (ack.mask & ~req.mask) {
 +		coredump_sock_mark(cprm->file, COREDUMP_MARK_UNSUPPORTED);
 +		return false;
 +	}
 +
 +	/* Refuse mutually exclusive options. */
 +	if (hweight64(ack.mask & (COREDUMP_USERSPACE | COREDUMP_KERNEL |
 +				  COREDUMP_REJECT)) != 1) {
 +		coredump_sock_mark(cprm->file, COREDUMP_MARK_CONFLICTING);
 +		return false;
 +	}
 +
 +	if (ack.spare) {
 +		coredump_sock_mark(cprm->file, COREDUMP_MARK_UNSUPPORTED);
 +		return false;
 +	}
 +
 +	cn->mask = ack.mask;
 +	return coredump_sock_mark(cprm->file, COREDUMP_MARK_REQACK);
 +}
 +
 +static bool coredump_socket(struct core_name *cn, struct coredump_params *cprm)
 +{
 +	if (!coredump_sock_connect(cn, cprm))
 +		return false;
 +
 +	return coredump_sock_request(cn, cprm);
 +}
 +#else
 +static inline void coredump_sock_wait(struct file *file) { }
 +static inline void coredump_sock_shutdown(struct file *file) { }
 +static inline bool coredump_socket(struct core_name *cn, struct coredump_params *cprm) { return false; }
 +#endif
 +
 +/* cprm->mm_flags contains a stable snapshot of dumpability flags. */
 +static inline bool coredump_force_suid_safe(const struct coredump_params *cprm)
 +{
 +	/* Require nonrelative corefile path and be extra careful. */
 +	return __get_dumpable(cprm->mm_flags) == SUID_DUMP_ROOT;
 +}
 +
 +static bool coredump_file(struct core_name *cn, struct coredump_params *cprm,
 +			  const struct linux_binfmt *binfmt)
 +{
 +	struct mnt_idmap *idmap;
 +	struct inode *inode;
 +	struct file *file __free(fput) = NULL;
 +	int open_flags = O_CREAT | O_WRONLY | O_NOFOLLOW | O_LARGEFILE | O_EXCL;
 +
 +	if (cprm->limit < binfmt->min_coredump)
 +		return false;
 +
 +	if (coredump_force_suid_safe(cprm) && cn->corename[0] != '/') {
 +		coredump_report_failure("this process can only dump core to a fully qualified path, skipping core dump");
 +		return false;
 +	}
 +
 +	/*
 +	 * Unlink the file if it exists unless this is a SUID
 +	 * binary - in that case, we're running around with root
 +	 * privs and don't want to unlink another user's coredump.
 +	 */
 +	if (!coredump_force_suid_safe(cprm)) {
 +		/*
 +		 * If it doesn't exist, that's fine. If there's some
 +		 * other problem, we'll catch it at the filp_open().
 +		 */
 +		do_unlinkat(AT_FDCWD, getname_kernel(cn->corename));
 +	}
 +
 +	/*
 +	 * There is a race between unlinking and creating the
 +	 * file, but if that causes an EEXIST here, that's
 +	 * fine - another process raced with us while creating
 +	 * the corefile, and the other process won. To userspace,
 +	 * what matters is that at least one of the two processes
 +	 * writes its coredump successfully, not which one.
 +	 */
 +	if (coredump_force_suid_safe(cprm)) {
 +		/*
 +		 * Using user namespaces, normal user tasks can change
 +		 * their current->fs->root to point to arbitrary
 +		 * directories. Since the intention of the "only dump
 +		 * with a fully qualified path" rule is to control where
 +		 * coredumps may be placed using root privileges,
 +		 * current->fs->root must not be used. Instead, use the
 +		 * root directory of init_task.
 +		 */
 +		struct path root;
 +
 +		task_lock(&init_task);
 +		get_fs_root(init_task.fs, &root);
 +		task_unlock(&init_task);
 +		file = file_open_root(&root, cn->corename, open_flags, 0600);
 +		path_put(&root);
 +	} else {
 +		file = filp_open(cn->corename, open_flags, 0600);
 +	}
 +	if (IS_ERR(file))
 +		return false;
 +
 +	inode = file_inode(file);
 +	if (inode->i_nlink > 1)
 +		return false;
 +	if (d_unhashed(file->f_path.dentry))
 +		return false;
 +	/*
 +	 * AK: actually i see no reason to not allow this for named
 +	 * pipes etc, but keep the previous behaviour for now.
 +	 */
 +	if (!S_ISREG(inode->i_mode))
 +		return false;
 +	/*
 +	 * Don't dump core if the filesystem changed owner or mode
 +	 * of the file during file creation. This is an issue when
 +	 * a process dumps core while its cwd is e.g. on a vfat
 +	 * filesystem.
 +	 */
 +	idmap = file_mnt_idmap(file);
 +	if (!vfsuid_eq_kuid(i_uid_into_vfsuid(idmap, inode), current_fsuid())) {
 +		coredump_report_failure("Core dump to %s aborted: cannot preserve file owner", cn->corename);
 +		return false;
 +	}
 +	if ((inode->i_mode & 0677) != 0600) {
 +		coredump_report_failure("Core dump to %s aborted: cannot preserve file permissions", cn->corename);
 +		return false;
 +	}
 +	if (!(file->f_mode & FMODE_CAN_WRITE))
 +		return false;
 +	if (do_truncate(idmap, file->f_path.dentry, 0, 0, file))
 +		return false;
 +
 +	cprm->file = no_free_ptr(file);
 +	return true;
 +}
 +
 +static bool coredump_pipe(struct core_name *cn, struct coredump_params *cprm,
 +			  size_t *argv, int argc)
 +{
 +	int argi;
 +	char **helper_argv __free(kfree) = NULL;
 +	struct subprocess_info *sub_info;
 +
 +	if (cprm->limit == 1) {
 +		/* See umh_coredump_setup() which sets RLIMIT_CORE = 1.
 +		 *
 +		 * Normally core limits are irrelevant to pipes, since
 +		 * we're not writing to the file system, but we use
 +		 * cprm.limit of 1 here as a special value, this is a
 +		 * consistent way to catch recursive crashes.
 +		 * We can still crash if the core_pattern binary sets
 +		 * RLIM_CORE = !1, but it runs as root, and can do
 +		 * lots of stupid things.
 +		 *
 +		 * Note that we use task_tgid_vnr here to grab the pid
 +		 * of the process group leader.  That way we get the
 +		 * right pid if a thread in a multi-threaded
 +		 * core_pattern process dies.
 +		 */
 +		coredump_report_failure("RLIMIT_CORE is set to 1, aborting core");
 +		return false;
 +	}
 +	cprm->limit = RLIM_INFINITY;
 +
 +	cn->core_pipe_limit = atomic_inc_return(&core_pipe_count);
 +	if (core_pipe_limit && (core_pipe_limit < cn->core_pipe_limit)) {
 +		coredump_report_failure("over core_pipe_limit, skipping core dump");
 +		return false;
 +	}
 +
 +	helper_argv = kmalloc_array(argc + 1, sizeof(*helper_argv), GFP_KERNEL);
 +	if (!helper_argv) {
 +		coredump_report_failure("%s failed to allocate memory", __func__);
 +		return false;
 +	}
 +	for (argi = 0; argi < argc; argi++)
 +		helper_argv[argi] = cn->corename + argv[argi];
 +	helper_argv[argi] = NULL;
 +
 +	sub_info = call_usermodehelper_setup(helper_argv[0], helper_argv, NULL,
 +					     GFP_KERNEL, umh_coredump_setup,
 +					     NULL, cprm);
 +	if (!sub_info)
 +		return false;
 +
 +	if (call_usermodehelper_exec(sub_info, UMH_WAIT_EXEC)) {
 +		coredump_report_failure("|%s pipe failed", cn->corename);
 +		return false;
 +	}
 +
 +	/*
 +	 * umh disabled with CONFIG_STATIC_USERMODEHELPER_PATH="" would
 +	 * have this set to NULL.
 +	 */
 +	if (!cprm->file) {
 +		coredump_report_failure("Core dump to |%s disabled", cn->corename);
 +		return false;
 +	}
 +
 +	return true;
 +}
 +
 +static bool coredump_write(struct core_name *cn,
 +			  struct coredump_params *cprm,
 +			  struct linux_binfmt *binfmt)
  {
 +
 +	if (dump_interrupted())
 +		return true;
 +
 +	if (!dump_vma_snapshot(cprm))
 +		return false;
 +
 +	file_start_write(cprm->file);
 +	cn->core_dumped = binfmt->core_dump(cprm);
 +	/*
 +	 * Ensures that file size is big enough to contain the current
 +	 * file postion. This prevents gdb from complaining about
 +	 * a truncated file if the last "write" to the file was
 +	 * dump_skip.
 +	 */
 +	if (cprm->to_skip) {
 +		cprm->to_skip--;
 +		dump_emit(cprm, "", 1);
 +	}
 +	file_end_write(cprm->file);
 +	free_vma_snapshot(cprm);
 +	return true;
 +}
 +
 +static void coredump_cleanup(struct core_name *cn, struct coredump_params *cprm)
 +{
 +	if (cprm->file)
 +		filp_close(cprm->file, NULL);
 +	if (cn->core_pipe_limit) {
 +		VFS_WARN_ON_ONCE(cn->core_type != COREDUMP_PIPE);
 +		atomic_dec(&core_pipe_count);
 +	}
 +	kfree(cn->corename);
 +	coredump_finish(cn->core_dumped);
 +}
 +
 +static inline bool coredump_skip(const struct coredump_params *cprm,
 +				 const struct linux_binfmt *binfmt)
 +{
 +	if (!binfmt)
 +		return true;
 +	if (!binfmt->core_dump)
 +		return true;
 +	if (!__get_dumpable(cprm->mm_flags))
 +		return true;
 +	return false;
 +}
 +
 +void vfs_coredump(const kernel_siginfo_t *siginfo)
 +{
 +	struct cred *cred __free(put_cred) = NULL;
 +	size_t *argv __free(kfree) = NULL;
  	struct core_state core_state;
  	struct core_name cn;
  	struct mm_struct *mm = current->mm;
diff --cc net/unix/af_unix.c
index 52b155123985,c247fb9ac761..000000000000
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@@ -1975,18 -1970,30 +1981,31 @@@ static void unix_skb_to_scm(struct sk_b
   * Some apps rely on write() giving SCM_CREDENTIALS
   * We include credentials if source or destination socket
   * asserted SOCK_PASSCRED.
+  *
+  * Context: May sleep.
+  * Return: On success zero, on error a negative error code is returned.
   */
- static void unix_maybe_add_creds(struct sk_buff *skb, const struct sock *sk,
- 				 const struct sock *other)
+ static int unix_maybe_add_creds(struct sk_buff *skb, const struct sock *sk,
+ 				const struct sock *other)
  {
  	if (UNIXCB(skb).pid)
- 		return;
+ 		return 0;
  
 -	if (unix_may_passcred(sk) || unix_may_passcred(other)) {
 +	if (unix_may_passcred(sk) || unix_may_passcred(other) ||
 +	    !other->sk_socket) {
- 		UNIXCB(skb).pid = get_pid(task_tgid(current));
+ 		struct pid *pid;
+ 		int err;
+ 
+ 		pid = task_tgid(current);
+ 		err = pidfs_register_pid(pid);
+ 		if (unlikely(err))
+ 			return err;
+ 
+ 		UNIXCB(skb).pid = get_pid(pid);
  		current_uid_gid(&UNIXCB(skb).uid, &UNIXCB(skb).gid);
  	}
+ 
+ 	return 0;
  }
  
  static bool unix_skb_scm_eq(struct sk_buff *skb,

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.pidfs

for you to fetch changes up to 1f531e35c146cca22dc6f4a1bc657098f146f358:

  don't bother with path_get()/path_put() in unix_open_file() (2025-07-14 10:22:47 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.pidfs tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.pidfs

----------------------------------------------------------------
Al Viro (2):
      fold fs_struct->{lock,seq} into a seqlock
      don't bother with path_get()/path_put() in unix_open_file()

Alexander Mikhalitsyn (7):
      af_unix: rework unix_maybe_add_creds() to allow sleep
      af_unix: introduce unix_skb_to_scm helper
      af_unix: introduce and use scm_replace_pid() helper
      af_unix/scm: fix whitespace errors
      af_unix: stash pidfs dentry when needed
      af_unix: enable handing out pidfds for reaped tasks in SCM_PIDFD
      selftests: net: extend SCM_PIDFD test to cover stale pidfds

Christian Brauner (31):
      pidfs: raise SB_I_NODEV and SB_I_NOEXEC
      libfs: massage path_from_stashed() to allow custom stashing behavior
      libfs: massage path_from_stashed()
      pidfs: move to anonymous struct
      pidfs: persist information
      pidfs: remove unused members from struct pidfs_inode
      pidfs: remove custom inode allocation
      pidfs: remove pidfs_{get,put}_pid()
      pidfs: remove pidfs_pid_valid()
      libfs: prepare to allow for non-immutable pidfd inodes
      pidfs: make inodes mutable
      pidfs: support xattrs on pidfds
      selftests/pidfd: test extended attribute support
      selftests/pidfd: test extended attribute support
      selftests/pidfd: test setattr support
      pidfs: add some CONFIG_DEBUG_VFS asserts
      Merge patch series "pidfs: persistent info & xattrs"
      pidfs: fix pidfs_free_pid()
      fhandle: raise FILEID_IS_DIR in handle_type
      fhandle: hoist copy_from_user() above get_path_from_fd()
      fhandle: rename to get_path_anchor()
      pidfs: add pidfs_root_path() helper
      fhandle: reflow get_path_anchor()
      uapi/fcntl: mark range as reserved
      fcntl/pidfd: redefine PIDFD_SELF_THREAD_GROUP
      uapi/fcntl: add FD_INVALID
      uapi/fcntl: add FD_PIDFS_ROOT
      fhandle, pidfs: support open_by_handle_at() purely based on file handle
      selftests/pidfd: decode pidfd file handles withou having to specify an fd
      Merge patch series "fhandle, pidfs: allow open_by_handle_at() purely based on file handle"
      Merge patch series "allow to create pidfds for reaped tasks with SCM_PIDFD"

 fs/coredump.c                                      |   6 -
 fs/d_path.c                                        |   8 +-
 fs/exec.c                                          |   4 +-
 fs/fhandle.c                                       |  62 ++-
 fs/fs_struct.c                                     |  36 +-
 fs/internal.h                                      |   4 +
 fs/libfs.c                                         |  34 +-
 fs/namei.c                                         |   8 +-
 fs/pidfs.c                                         | 436 ++++++++++++---------
 include/linux/fs_struct.h                          |  11 +-
 include/linux/pid.h                                |  14 +-
 include/linux/pidfs.h                              |   3 +-
 include/net/scm.h                                  |   4 +-
 include/uapi/linux/fcntl.h                         |  18 +
 include/uapi/linux/pidfd.h                         |  15 -
 kernel/fork.c                                      |  10 +-
 kernel/pid.c                                       |   2 +-
 net/core/scm.c                                     |  32 +-
 net/unix/af_unix.c                                 |  78 ++--
 tools/testing/selftests/net/af_unix/scm_pidfd.c    | 217 +++++++---
 tools/testing/selftests/pidfd/.gitignore           |   2 +
 tools/testing/selftests/pidfd/Makefile             |   5 +-
 tools/testing/selftests/pidfd/pidfd.h              |   6 +-
 .../selftests/pidfd/pidfd_file_handle_test.c       |  60 +++
 tools/testing/selftests/pidfd/pidfd_setattr_test.c |  69 ++++
 tools/testing/selftests/pidfd/pidfd_xattr_test.c   | 132 +++++++
 26 files changed, 894 insertions(+), 382 deletions(-)
 create mode 100644 tools/testing/selftests/pidfd/pidfd_setattr_test.c
 create mode 100644 tools/testing/selftests/pidfd/pidfd_xattr_test.c

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 10/14 for v6.17] vfs rust
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (11 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 08/14 for v6.17] vfs pidfs Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-25 11:27 ` [GIT PULL 13/14 for v6.17] vfs super Christian Brauner
  2025-07-31  9:40 ` [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
This contains vfs rust updates for this cycle:

- Allow poll_table pointers to be NULL.

- Add Rust files to vfs MAINTAINERS entry.

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.rust

for you to fetch changes up to 3ccc82e31d6a66600f14f6622a944f580b04da43:

  vfs: add Rust files to MAINTAINERS (2025-07-15 11:50:15 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.rust tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.rust

----------------------------------------------------------------
Alice Ryhl (2):
      poll: rust: allow poll_table ptrs to be null
      vfs: add Rust files to MAINTAINERS

 MAINTAINERS              |  4 +++
 rust/helpers/helpers.c   |  1 +
 rust/helpers/poll.c      | 10 +++++++
 rust/kernel/sync/poll.rs | 68 ++++++++++++++++++------------------------------
 4 files changed, 41 insertions(+), 42 deletions(-)
 create mode 100644 rust/helpers/poll.c

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [GIT PULL 13/14 for v6.17] vfs super
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (12 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 10/14 for v6.17] vfs rust Christian Brauner
@ 2025-07-25 11:27 ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  2025-07-31  9:40 ` [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
  14 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-25 11:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Christian Brauner, linux-fsdevel, linux-kernel

Hey Linus,

/* Summary */
Currently all filesystems which implement super_operations::shutdown()
cannot afford to lose a device.

Thus fs_bdev_mark_dead() will just call the ->shutdown() callback for the
involved filesystem.

But this will no longer be the case, as multi-device filesystems like
btrfs can handle certain device losses without the need to shut down the
whole filesystem.

To allow those multi-device filesystems to be integrated to use
fs_holder_ops:

- Add a new super_operations::remove_bdev() callback

- Try the ->remove_bdev() callback first inside fs_bdev_mark_dead().
  If the callback returns 0, meaning the fs can handle the device
  loss, then exit without doing anything else.

  If there is no such callback or the callback returns a non-zero value,
  continue to shut down the filesystem as usual.

This means the new remove_bdev() should only check whether the
operation can continue, and if so do the fs-specific handling. The
shutdown handling should still be done by the existing ->shutdown()
callback.

For all existing filesystems with shutdown callback, there is no change
to the code nor behavior.

Btrfs is going to implement both the ->remove_bdev() and ->shutdown()
callbacks soon.
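
The decision order described above can be illustrated with a small
userspace sketch. It is not the kernel code: the struct layouts, the
integer dev_id argument, and the helper names mark_dead(),
multi_dev_remove_bdev() and generic_shutdown() are hypothetical stand-ins
chosen for illustration, and the real callback signatures may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; NOT the kernel's definitions. */
struct super_block;

struct super_operations {
	/* Optional: return 0 if the fs can keep running without the device. */
	int  (*remove_bdev)(struct super_block *sb, int dev_id);
	void (*shutdown)(struct super_block *sb);
};

struct super_block {
	const struct super_operations *s_op;
	int healthy_devices;
	int shut_down;
};

/* Mirrors the order described in the summary: try ->remove_bdev() first,
 * fall back to ->shutdown() if it is missing or returns non-zero. */
static void mark_dead(struct super_block *sb, int dev_id)
{
	assert(sb && sb->s_op);

	if (sb->s_op->remove_bdev && sb->s_op->remove_bdev(sb, dev_id) == 0)
		return;	/* the fs absorbed the device loss */

	if (sb->s_op->shutdown)
		sb->s_op->shutdown(sb);
}

/* Example multi-device policy: tolerate a loss while more than one
 * healthy device remains. It only decides; it never shuts down itself. */
static int multi_dev_remove_bdev(struct super_block *sb, int dev_id)
{
	(void)dev_id;
	if (sb->healthy_devices <= 1)
		return -1;	/* cannot continue, let ->shutdown() run */
	sb->healthy_devices--;
	return 0;
}

static void generic_shutdown(struct super_block *sb)
{
	sb->shut_down = 1;
}
```

A filesystem that leaves remove_bdev unset behaves exactly as before,
because mark_dead() goes straight to the shutdown path in that case.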

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:

  Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.super

for you to fetch changes up to d9c37a4904ec21ef7d45880fe023c11341869c28:

  fs: add a new remove_bdev() callback (2025-07-15 13:36:40 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.super tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.super

----------------------------------------------------------------
Qu Wenruo (1):
      fs: add a new remove_bdev() callback

 fs/super.c         | 11 +++++++++++
 include/linux/fs.h |  9 +++++++++
 2 files changed, 20 insertions(+)

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 14/14 for v6.17] vfs iomap
  2025-07-25 11:27 ` [GIT PULL 14/14 for v6.17] vfs iomap Christian Brauner
@ 2025-07-27 13:10   ` Sasha Levin
  2025-07-28 16:39     ` Joanne Koong
  2025-07-28 23:40   ` pr-tracker-bot
  1 sibling, 1 reply; 44+ messages in thread
From: Sasha Levin @ 2025-07-27 13:10 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Linus Torvalds, linux-fsdevel, linux-kernel

Hey Christian,

On Fri, Jul 25, 2025 at 01:27:20PM +0200, Christian Brauner wrote:
>Hey Linus,
>
>/* Summary */
>This contains the iomap updates for this cycle:
>
>- Refactor the iomap writeback code and split the generic and ioend/bio
>  based writeback code. There are two methods that define the split
>  between the generic writeback code, and the implementation of it,
>  and all knowledge of ioends and bios now sits below that layer.
>
>- This series adds fuse iomap support for buffered writes and dirty
>  folio writeback. This is needed so that granular uptodate and dirty
>  tracking can be used in fuse when large folios are enabled. This has
>  two big advantages. For writes, instead of the entire folio needing to
>  be read into the page cache, only the relevant portions need to be.
>  For writeback, only the dirty portions need to be written back instead
>  of the entire folio.

While testing with the linus-next tree, it appears that LKFT can trigger
the following warning, but only on arm64 tests (both on real HW as well
as qemu):

[ 333.129662] WARNING: CPU: 1 PID: 2580 at fs/fuse/file.c:2158 fuse_iomap_writeback_range+0x478/0x558 fuse
[  333.132010] Modules linked in: btrfs blake2b_generic xor xor_neon raid6_pq zstd_compress sm3_ce sha3_ce sha512_ce fuse drm backlight ip_tables x_tables
[  333.133982] CPU: 1 UID: 0 PID: 2580 Comm: msync04 Tainted: G        W           6.16.0-rc7 #1 PREEMPT
[  333.134997] Tainted: [W]=WARN
[  333.135497] Hardware name: linux,dummy-virt (DT)
[  333.136114] pstate: 03402009 (nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
[ 333.137090] pc : fuse_iomap_writeback_range+0x478/0x558 fuse
[ 333.138009] lr : iomap_writeback_folio (fs/iomap/buffered-io.c:1586 fs/iomap/buffered-io.c:1710)
[  333.138510] sp : ffff80008be8f8c0
[  333.138653] x29: ffff80008be8f8c0 x28: fff00000c5198c00 x27: 0000000000000000
[  333.138975] x26: fff00000d32b8c00 x25: 0000000000000000 x24: 0000000000000000
[  333.139309] x23: 0000000000000000 x22: fffffc1fc039ba40 x21: 0000000000001000
[  333.139600] x20: ffff80008be8f9f0 x19: 0000000000000000 x18: 0000000000000000
[  333.139917] x17: 0000000000000000 x16: ffffbb40f61c3a48 x15: 0000000000000000
[  333.142199] x14: ffffbb40f6924788 x13: 0000ffff8e8effff x12: 0000000000000000
[  333.142739] x11: 1ffe0000199a9241 x10: fff00000ccd4920c x9 : ffffbb40f50bba18
[  333.143466] x8 : ffff80008be8f778 x7 : ffffbb40ee180b68 x6 : ffffbb40f76c9000
[  333.143718] x5 : 0000000000000000 x4 : 000000000000000a x3 : 0000000000001000
[  333.143957] x2 : fff00000c0b6e600 x1 : 000000000000ffff x0 : 0bfffe000000400b
[  333.144993] Call trace:
WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
[ 333.145466] fuse_iomap_writeback_range+0x478/0x558 fuse (P)
[ 333.146136] iomap_writeback_folio (fs/iomap/buffered-io.c:1586 fs/iomap/buffered-io.c:1710)
[ 333.146444] iomap_writepages (fs/iomap/buffered-io.c:1762)
WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
[ 333.146590] fuse_writepages+0xa0/0xe8 fuse
[ 333.146774] do_writepages (mm/page-writeback.c:2636)
[ 333.146915] filemap_fdatawrite_wbc (mm/filemap.c:386 mm/filemap.c:376)
[ 333.147788] __filemap_fdatawrite_range (mm/filemap.c:420)
[ 333.148440] file_write_and_wait_range (mm/filemap.c:794)
WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
[ 333.149054] fuse_fsync+0x6c/0x138 fuse
[ 333.149578] vfs_fsync_range (fs/sync.c:188)
[ 333.149892] __arm64_sys_msync (mm/msync.c:96 mm/msync.c:32 mm/msync.c:32)
[ 333.150095] invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54)
[ 333.150330] do_el0_svc (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2))
[ 333.150461] el0_svc (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:768 (discriminator 1))
[ 333.150583] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:787)
[ 333.150729] el0t_64_sync (arch/arm64/kernel/entry.S:600)
[  333.150862] ---[ end trace 0000000000000000 ]---

I think that this is because the arm64 tests run on a
CONFIG_PAGE_SIZE_64KB=y build, but I'm not sure why we don't see it with
4KB pages at all.

An example link to a failing test that has the full log and more
information: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-44385-g8a03a07bad83/testrun/29269158/suite/log-parser-test/test/exception-warning-cpu-pid-at-fsfusefile-fuse_iomap_writeback_range/details/

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 11/14 for v6.17] vfs integrity
  2025-07-25 11:27 ` [GIT PULL 11/14 for v6.17] vfs integrity Christian Brauner
@ 2025-07-28  1:29   ` Hugh Dickins
  2025-07-28 22:21     ` Linus Torvalds
  2025-07-28 23:40   ` pr-tracker-bot
  1 sibling, 1 reply; 44+ messages in thread
From: Hugh Dickins @ 2025-07-28  1:29 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Klara Modin, Arnd Bergmann, Anuj Gupta,
	linux-fsdevel, linux-kernel

On Fri, 25 Jul 2025, Christian Brauner wrote:

> Hey Linus,
> 
> /* Summary */
> This adds the new FS_IOC_GETLBMD_CAP ioctl() to query metadata and
> protection info (PI) capabilities. This ioctl returns information about
> the file's integrity profile. This is useful for userspace applications
> to understand a file's end-to-end data protection support and configure
> the I/O accordingly.
> 
> For now this interface is only supported by block devices. However the
> design and placement of this ioctl in generic FS ioctl space allows us
> to extend it to work over files as well. This may be useful when
> filesystems start supporting PI-aware layouts.
> 
> A new structure struct logical_block_metadata_cap is introduced, which
> contains the following fields:
> 
> - lbmd_flags:
>   bitmask of logical block metadata capability flags
> 
> - lbmd_interval:
>   the amount of data described by each unit of logical block metadata
> 
> - lbmd_size:
>   size in bytes of the logical block metadata associated with each
>   interval
> 
> - lbmd_opaque_size:
>   size in bytes of the opaque block tag associated with each interval
> 
> - lbmd_opaque_offset:
>   offset in bytes of the opaque block tag within the logical block
>   metadata
> 
> - lbmd_pi_size:
>   size in bytes of the T10 PI tuple associated with each interval
> 
> - lbmd_pi_offset:
>   offset in bytes of T10 PI tuple within the logical block metadata
> 
> - lbmd_pi_guard_tag_type:
>   T10 PI guard tag type
> 
> - lbmd_pi_app_tag_size:
>   size in bytes of the T10 PI application tag
> 
> - lbmd_pi_ref_tag_size:
>   size in bytes of the T10 PI reference tag
> 
> - lbmd_pi_storage_tag_size:
>   size in bytes of the T10 PI storage tag
> 
> The internal logic to fetch the capability is encapsulated in a helper
> function blk_get_meta_cap(), which uses the blk_integrity profile
> associated with the device. The ioctl returns -EOPNOTSUPP, if
> CONFIG_BLK_DEV_INTEGRITY is not enabled.
> 
> /* Testing */
> 
> gcc (Debian 14.2.0-19) 14.2.0
> Debian clang version 19.1.7 (3)
> 
> No build failures or warnings were observed.
> 
> /* Conflicts */
> 
> Merge conflicts with mainline
> =============================
> 
> No known conflicts.
> 
> Merge conflicts with other trees
> ================================
> 
> No known conflicts.
> 
> The following changes since commit 19272b37aa4f83ca52bdf9c16d5d81bdd1354494:
> 
>   Linux 6.16-rc1 (2025-06-08 13:44:43 -0700)
> 
> are available in the Git repository at:
> 
>   git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.integrity
> 
> for you to fetch changes up to bc5b0c8febccbeabfefc9b59083b223ec7c7b53a:
> 
>   block: fix lbmd_guard_tag_type assignment in FS_IOC_GETLBMD_CAP (2025-07-23 14:55:51 +0200)
> 
> Please consider pulling these changes from the signed vfs-6.17-rc1.integrity tag.
> 
> Thanks!
> Christian
> 
> ----------------------------------------------------------------
> vfs-6.17-rc1.integrity
> 
> ----------------------------------------------------------------
> Anuj Gupta (5):
>       block: rename tuple_size field in blk_integrity to metadata_size
>       block: introduce pi_tuple_size field in blk_integrity
>       nvme: set pi_offset only when checksum type is not BLK_INTEGRITY_CSUM_NONE
>       fs: add ioctl to query metadata and protection info capabilities
>       block: fix lbmd_guard_tag_type assignment in FS_IOC_GETLBMD_CAP
> 
> Arnd Bergmann (1):
>       block: fix FS_IOC_GETLBMD_CAP parsing in blkdev_common_ioctl()
> 
> Christian Brauner (1):
>       Merge patch series "add ioctl to query metadata and protection info capabilities"
> 
>  block/bio-integrity-auto.c        |  4 +--
>  block/blk-integrity.c             | 70 ++++++++++++++++++++++++++++++++++++++-
>  block/blk-settings.c              | 44 ++++++++++++++++++++++--
>  block/ioctl.c                     |  6 ++++
>  block/t10-pi.c                    | 16 ++++-----
>  drivers/md/dm-crypt.c             |  4 +--
>  drivers/md/dm-integrity.c         | 12 +++----
>  drivers/nvdimm/btt.c              |  2 +-
>  drivers/nvme/host/core.c          |  7 ++--
>  drivers/nvme/target/io-cmd-bdev.c |  2 +-
>  drivers/scsi/sd_dif.c             |  3 +-
>  include/linux/blk-integrity.h     | 11 ++++--
>  include/linux/blkdev.h            |  3 +-
>  include/uapi/linux/fs.h           | 59 +++++++++++++++++++++++++++++++++
>  14 files changed, 213 insertions(+), 30 deletions(-)

It would be great if Klara's patch at
https://lore.kernel.org/lkml/20250725164334.9606-1-klarasmodin@gmail.com/
could follow just after this pull: I had been bisecting -next to find out
why "losetup /dev/loop0 tmpfsfile" was failing, and that patch fixes it -
and presumably other odd failures for anyone without BLK_DEV_INTEGRITY=y.

Thanks,
Hugh

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 14/14 for v6.17] vfs iomap
  2025-07-27 13:10   ` Sasha Levin
@ 2025-07-28 16:39     ` Joanne Koong
  2025-07-31  8:29       ` Christian Brauner
  0 siblings, 1 reply; 44+ messages in thread
From: Joanne Koong @ 2025-07-28 16:39 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Christian Brauner, Linus Torvalds, linux-fsdevel, linux-kernel

On Sun, Jul 27, 2025 at 6:10 AM Sasha Levin <sashal@kernel.org> wrote:
>
> Hey Christian,
>
> On Fri, Jul 25, 2025 at 01:27:20PM +0200, Christian Brauner wrote:
> >Hey Linus,
> >
> >/* Summary */
> >This contains the iomap updates for this cycle:
> >
> >- Refactor the iomap writeback code and split the generic and ioend/bio
> >  based writeback code. There are two methods that define the split
> >  between the generic writeback code, and the implementation of it,
> >  and all knowledge of ioends and bios now sits below that layer.
> >
> >- This series adds fuse iomap support for buffered writes and dirty
> >  folio writeback. This is needed so that granular uptodate and dirty
> >  tracking can be used in fuse when large folios are enabled. This has
> >  two big advantages. For writes, instead of the entire folio needing to
> >  be read into the page cache, only the relevant portions need to be.
> >  For writeback, only the dirty portions need to be written back instead
> >  of the entire folio.
>
> While testing with the linus-next tree, it appears that LKFT can trigger
> the following warning, but only on arm64 tests (both on real HW as well
> as qemu):
>
> [ 333.129662] WARNING: CPU: 1 PID: 2580 at fs/fuse/file.c:2158 fuse_iomap_writeback_range+0x478/0x558 fuse
> [  333.132010] Modules linked in: btrfs blake2b_generic xor xor_neon raid6_pq zstd_compress sm3_ce sha3_ce sha512_ce fuse drm backlight ip_tables x_tables
> [  333.133982] CPU: 1 UID: 0 PID: 2580 Comm: msync04 Tainted: G        W           6.16.0-rc7 #1 PREEMPT
> [  333.134997] Tainted: [W]=WARN
> [  333.135497] Hardware name: linux,dummy-virt (DT)
> [  333.136114] pstate: 03402009 (nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
> [ 333.137090] pc : fuse_iomap_writeback_range+0x478/0x558 fuse
> [ 333.138009] lr : iomap_writeback_folio (fs/iomap/buffered-io.c:1586 fs/iomap/buffered-io.c:1710)
> [  333.138510] sp : ffff80008be8f8c0
> [  333.138653] x29: ffff80008be8f8c0 x28: fff00000c5198c00 x27: 0000000000000000
> [  333.138975] x26: fff00000d32b8c00 x25: 0000000000000000 x24: 0000000000000000
> [  333.139309] x23: 0000000000000000 x22: fffffc1fc039ba40 x21: 0000000000001000
> [  333.139600] x20: ffff80008be8f9f0 x19: 0000000000000000 x18: 0000000000000000
> [  333.139917] x17: 0000000000000000 x16: ffffbb40f61c3a48 x15: 0000000000000000
> [  333.142199] x14: ffffbb40f6924788 x13: 0000ffff8e8effff x12: 0000000000000000
> [  333.142739] x11: 1ffe0000199a9241 x10: fff00000ccd4920c x9 : ffffbb40f50bba18
> [  333.143466] x8 : ffff80008be8f778 x7 : ffffbb40ee180b68 x6 : ffffbb40f76c9000
> [  333.143718] x5 : 0000000000000000 x4 : 000000000000000a x3 : 0000000000001000
> [  333.143957] x2 : fff00000c0b6e600 x1 : 000000000000ffff x0 : 0bfffe000000400b
> [  333.144993] Call trace:
> WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
> [ 333.145466] fuse_iomap_writeback_range+0x478/0x558 fuse (P)
> [ 333.146136] iomap_writeback_folio (fs/iomap/buffered-io.c:1586 fs/iomap/buffered-io.c:1710)
> [ 333.146444] iomap_writepages (fs/iomap/buffered-io.c:1762)
> WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
> [ 333.146590] fuse_writepages+0xa0/0xe8 fuse
> [ 333.146774] do_writepages (mm/page-writeback.c:2636)
> [ 333.146915] filemap_fdatawrite_wbc (mm/filemap.c:386 mm/filemap.c:376)
> [ 333.147788] __filemap_fdatawrite_range (mm/filemap.c:420)
> [ 333.148440] file_write_and_wait_range (mm/filemap.c:794)
> WARNING! No debugging info in module fuse, rebuild with DEBUG_KERNEL and DEBUG_INFO
> [ 333.149054] fuse_fsync+0x6c/0x138 fuse
> [ 333.149578] vfs_fsync_range (fs/sync.c:188)
> [ 333.149892] __arm64_sys_msync (mm/msync.c:96 mm/msync.c:32 mm/msync.c:32)
> [ 333.150095] invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54)
> [ 333.150330] do_el0_svc (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2))
> [ 333.150461] el0_svc (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:768 (discriminator 1))
> [ 333.150583] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:787)
> [ 333.150729] el0t_64_sync (arch/arm64/kernel/entry.S:600)
> [  333.150862] ---[ end trace 0000000000000000 ]---
>
> I think that this is because the arm64 tests run on a
> CONFIG_PAGE_SIZE_64KB=y build, but I'm not sure why we don't see it with
> 4KB pages at all.
>
> An example link to a failing test that has the full log and more
> information: https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-44385-g8a03a07bad83/testrun/29269158/suite/log-parser-test/test/exception-warning-cpu-pid-at-fsfusefile-fuse_iomap_writeback_range/details/
>

This was reported last week as well in [1]. The fix for this is in
https://lore.kernel.org/linux-fsdevel/20250723230850.2395561-1-joannelkoong@gmail.com/

Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/CA+G9fYs5AdVM-T2Tf3LciNCwLZEHetcnSkHsjZajVwwpM2HmJw@mail.gmail.com/

> --
> Thanks,
> Sasha
>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 02/14 for v6.17] vfs coredump
  2025-07-25 11:27 ` [GIT PULL 02/14 for v6.17] vfs coredump Christian Brauner
@ 2025-07-28 18:57   ` Linus Torvalds
  2025-07-31  9:37     ` Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
  1 sibling, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2025-07-28 18:57 UTC (permalink / raw)
  To: Christian Brauner; +Cc: linux-fsdevel, linux-kernel

On Fri, 25 Jul 2025 at 04:27, Christian Brauner <brauner@kernel.org> wrote:
>
> This will have a merge conflict with mainline that can be resolved as follows:

Bah. Mine looks very different, but the end result should be the same.
I just made that final 'read()' match the same failure pattern as the
other parts of the test.

Holler if I screwed up.

            Linus

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 11/14 for v6.17] vfs integrity
  2025-07-28  1:29   ` Hugh Dickins
@ 2025-07-28 22:21     ` Linus Torvalds
  2025-07-29  7:49       ` Christoph Hellwig
  0 siblings, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2025-07-28 22:21 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Christian Brauner, Klara Modin, Arnd Bergmann, Anuj Gupta,
	linux-fsdevel, linux-kernel

On Sun, 27 Jul 2025 at 18:29, Hugh Dickins <hughd@google.com> wrote:
>
> It would be great if Klara's patch at
> https://lore.kernel.org/lkml/20250725164334.9606-1-klarasmodin@gmail.com/
> could follow just after this pull: I had been bisecting -next to find out
> why "losetup /dev/loop0 tmpfsfile" was failing, and that patch fixes it -
> and presumably other odd failures for anyone without BLK_DEV_INTEGRITY=y.

Bah. I *hate* this "call blk_get_meta_cap() first" approach. There is
absolutely *NO* way it is valid for that strange specialized ioctl to
override any proper traditional ioctl numbers, so calling that code
first and relying on magic error numbers is simply not acceptable.

I'm going to fix this in my merge by just putting the call to
blk_get_meta_cap() inside the "default:" case for *after* the other
ioctl numbers have been checked.

Please don't introduce new "magic error number" logic in the ioctl
path. The fact that the traditional case of "I don't support this" is
ENOTTY should damn well tell everybody that we have about SIX DECADES
of problems in this area. Don't repeat that mistake.

And don't let new random unimportant ioctls *EVER* override the normal
default ones.
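
The ordering argued for above can be shown with a minimal userspace
sketch. The command numbers and the functions new_cap_ioctl() and
dispatch_ioctl() are hypothetical and are not the block layer's code;
the sketch only shows a specialized handler being consulted from the
"default:" case, after the established commands have been matched.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum { CMD_TRADITIONAL = 1, CMD_NEW_CAP = 2 };

/* Stand-in for a specialized handler: it answers only its own command
 * and reports -ENOTTY for everything else. */
static int new_cap_ioctl(unsigned int cmd)
{
	return cmd == CMD_NEW_CAP ? 0 : -ENOTTY;
}

/* The established commands are matched first. The specialized handler
 * is only reached from "default:", so it can never shadow them and no
 * caller has to interpret a special error value to fall through. */
static int dispatch_ioctl(unsigned int cmd, int *traditional_hits)
{
	assert(traditional_hits);

	switch (cmd) {
	case CMD_TRADITIONAL:
		(*traditional_hits)++;
		return 0;
	default:
		return new_cap_ioctl(cmd);
	}
}
```

With this layout an unknown command still ends in -ENOTTY, and a
traditional command never passes through the new handler at all.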

               Linus

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 05/14 for v6.17] vfs async dir
  2025-07-25 11:27 ` [GIT PULL 05/14 for v6.17] vfs async dir Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:14 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.async.dir

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/0c4ec4a339b435381bc998f74862bd7a23d33f79

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 09/14 for v6.17] vfs bpf
  2025-07-25 11:27 ` [GIT PULL 09/14 for v6.17] vfs bpf Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  2025-07-29 18:15   ` Alexei Starovoitov
  1 sibling, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:15 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.bpf

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7e7bc8335b1486e5b157e844c248925a763baf16

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 02/14 for v6.17] vfs coredump
  2025-07-25 11:27 ` [GIT PULL 02/14 for v6.17] vfs coredump Christian Brauner
  2025-07-28 18:57   ` Linus Torvalds
@ 2025-07-28 23:40   ` pr-tracker-bot
  1 sibling, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:16 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.coredump

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/117eab5c6e31815649d952f6da03f67aa247d29b

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 06/14 for v6.17] vfs fallocate
  2025-07-25 11:27 ` [GIT PULL 06/14 for v6.17] vfs fallocate Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:17 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fallocate

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/278c7d9b5e0ca73a75e5151c22fb05c91cb4495f

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 12/14 for v6.17] vfs fileattr
  2025-07-25 11:27 ` [GIT PULL 12/14 for v6.17] vfs fileattr Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:18 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fileattr

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/57fcb7d930d8f00f383e995aeebdcd2b416a187a

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 11/14 for v6.17] vfs integrity
  2025-07-25 11:27 ` [GIT PULL 11/14 for v6.17] vfs integrity Christian Brauner
  2025-07-28  1:29   ` Hugh Dickins
@ 2025-07-28 23:40   ` pr-tracker-bot
  1 sibling, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:19 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.integrity

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/cec40a7c80e8b0ef03667708ea2660bc1a99b464

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 14/14 for v6.17] vfs iomap
  2025-07-25 11:27 ` [GIT PULL 14/14 for v6.17] vfs iomap Christian Brauner
  2025-07-27 13:10   ` Sasha Levin
@ 2025-07-28 23:40   ` pr-tracker-bot
  1 sibling, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:20 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.iomap

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b5d760d53ac2e36825fbbb8d1f54ad9ce6138f7b

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [GIT PULL 01/14 for v6.17] vfs misc
  2025-07-25 11:27 ` [GIT PULL 01/14 for v6.17] vfs misc Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:21 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.misc

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7879d7aff0ffd969fcb1a59e3f87ebb353e47b7f

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

* Re: [GIT PULL 07/14 for v6.17] vfs mmap
  2025-07-25 11:27 ` [GIT PULL 07/14 for v6.17] vfs mmap Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:22 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.mmap_prepare

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/7031769e102b768b3fa0c4c726faf532cb31e973

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

* Re: [GIT PULL 04/14 for v6.17] namespace updates
  2025-07-25 11:27 ` [GIT PULL 04/14 for v6.17] namespace updates Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:23 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.nsfs

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/f70d24c230bcaa1e95f66252133068a98c895200

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

* Re: [GIT PULL 03/14 for v6.17] overlayfs
  2025-07-25 11:27 ` [GIT PULL 03/14 for v6.17] overlayfs Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:24 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.ovl

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/934600daa7bcce8ad6d5efe05cce4811c8d2f464

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

* Re: [GIT PULL 08/14 for v6.17] vfs pidfs
  2025-07-25 11:27 ` [GIT PULL 08/14 for v6.17] vfs pidfs Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:25 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.pidfs

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/672dcda246071e1940eab8bb5a03d04ea026f46e

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

* Re: [GIT PULL 10/14 for v6.17] vfs rust
  2025-07-25 11:27 ` [GIT PULL 10/14 for v6.17] vfs rust Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:26 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.rust

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/add07519ea6b6c2ba2b7842225eb87e0f08f2b0f

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

* Re: [GIT PULL 13/14 for v6.17] vfs super
  2025-07-25 11:27 ` [GIT PULL 13/14 for v6.17] vfs super Christian Brauner
@ 2025-07-28 23:40   ` pr-tracker-bot
  0 siblings, 0 replies; 44+ messages in thread
From: pr-tracker-bot @ 2025-07-28 23:40 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Christian Brauner, linux-fsdevel, linux-kernel

The pull request you sent on Fri, 25 Jul 2025 13:27:27 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.super

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/0965549d6f5f23e9250cd9c642f4ea5fd682eddb

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html

* Re: [GIT PULL 11/14 for v6.17] vfs integrity
  2025-07-28 22:21     ` Linus Torvalds
@ 2025-07-29  7:49       ` Christoph Hellwig
  2025-07-29  8:39         ` Linus Torvalds
  0 siblings, 1 reply; 44+ messages in thread
From: Christoph Hellwig @ 2025-07-29  7:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Hugh Dickins, Christian Brauner, Klara Modin, Arnd Bergmann,
	Anuj Gupta, linux-fsdevel, linux-kernel

On Mon, Jul 28, 2025 at 03:21:21PM -0700, Linus Torvalds wrote:
> Bah. I *hate* this "call blk_get_meta_cap() first" approach. There is
> absolutely *NO* way it is valid for that strange specialized ioctl to
> override any proper traditional ioctl numbers, so calling that code
> first and relying on magic error numbers is simply not acceptable.
> 
> I'm going to fix this in my merge by just putting the call to
> blk_get_meta_cap() inside the "default:" case for *after* the other
> ioctl numbers have been checked.
> 
> Please don't introduce new "magic error number" logic in the ioctl
> path. The fact that the traditional case of "I don't support this" is
> ENOTTY should damn well tell everybody that we have about SIX DECADES
> of problems in this area. Don't repeat that mistake.
> 
> And don't let new random unimportant ioctls *EVER* override the normal
> default ones.

I don't think overrides are intentional here.  The problem is that
Christian asked for the flexible size growing decoding here, which
makes it impossible to use the simple and proven ioctl dispatch by
just using another case statement in the switch.


* Re: [GIT PULL 11/14 for v6.17] vfs integrity
  2025-07-29  7:49       ` Christoph Hellwig
@ 2025-07-29  8:39         ` Linus Torvalds
  2025-07-31  8:00           ` Christian Brauner
  0 siblings, 1 reply; 44+ messages in thread
From: Linus Torvalds @ 2025-07-29  8:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Hugh Dickins, Christian Brauner, Klara Modin, Arnd Bergmann,
	Anuj Gupta, linux-fsdevel, linux-kernel

On Tue, 29 Jul 2025 at 00:49, Christoph Hellwig <hch@infradead.org> wrote:
>
> I don't think overrides are intentional here.  The problem is that
> Christian asked for the flexible size growing decoding here, which
> makes it impossible to use the simple and proven ioctl dispatch by
> just using another case statement in the switch.

Right. Which is why I put it in the default: branch.

IOW, just handle the important real and normal cases first - the ones
that *can* be handled with simple switch statements.

So putting it at the *top*, and then saying "if it returns this
special error code that isn't standardized we do the normal ones" is
wrong.

It's wrong because we literally have over half a century of confusion
about error codes in this area, predating Linux.

And it's also wrong because that new ioctl simply shouldn't be
prioritized over existing ones.

So I'm just saying "don't do that then".

               Linus
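For illustration only (this is not part of the original message): a
minimal user-space sketch of the dispatch order described above. Fixed
ioctl numbers are matched by the switch first, and the size-extensible
command is decoded only in the default: branch. The command encoding,
the CMD_* names, the return markers, and get_meta_cap() are hypothetical
stand-ins; the real helper under discussion is blk_get_meta_cap(), and
its actual code is not reproduced here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical encoding: the low 8 bits are the command number, the
 * remaining bits carry the size of the user-supplied argument. */
#define MK_CMD(nr, size)  (((unsigned int)(size) << 8) | (unsigned int)(nr))
#define CMD_NR(cmd)       ((cmd) & 0xffu)
#define CMD_SIZE(cmd)     ((cmd) >> 8)

/* Traditional fixed-size commands: one exact value each. */
#define CMD_GETSIZE        MK_CMD(1, 8)
#define CMD_FLUSH          MK_CMD(2, 0)

/* Size-extensible command: any size >= the minimum is accepted, so it
 * cannot be expressed as a single case label. */
#define META_CAP_NR        3u
#define META_CAP_MIN_SIZE  16u

/* Stand-in for the flexible-size decoder. */
int get_meta_cap(unsigned int cmd)
{
	if (CMD_NR(cmd) != META_CAP_NR)
		return -ENOTTY;		/* not ours: plain "unsupported" */
	if (CMD_SIZE(cmd) < META_CAP_MIN_SIZE)
		return -EINVAL;		/* ours, but argument too small */
	return 0;
}

/* Returns 1 or 2 as markers for the fixed commands (assumption made
 * for this sketch only), otherwise whatever the decoder returns. */
int dispatch(unsigned int cmd)
{
	switch (cmd) {
	case CMD_GETSIZE:
		return 1;
	case CMD_FLUSH:
		return 2;
	default:
		/* Only reached after every fixed number failed to match,
		 * so the new command can never shadow an existing one and
		 * no special "try the others" error code is needed. */
		return get_meta_cap(cmd);
	}
}

/* Small usage example. */
void check_dispatch_order(void)
{
	assert(dispatch(CMD_GETSIZE) == 1);
	assert(dispatch(MK_CMD(META_CAP_NR, 32)) == 0);
	assert(dispatch(MK_CMD(9, 0)) == -ENOTTY);
}
```

With this ordering the decoder's return value is simply the ioctl's
return value; nothing upstream has to interpret it as "fall through to
the normal cases".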

* Re: [GIT PULL 09/14 for v6.17] vfs bpf
  2025-07-25 11:27 ` [GIT PULL 09/14 for v6.17] vfs bpf Christian Brauner
  2025-07-28 23:40   ` pr-tracker-bot
@ 2025-07-29 18:15   ` Alexei Starovoitov
  2025-07-31  8:27     ` Christian Brauner
  1 sibling, 1 reply; 44+ messages in thread
From: Alexei Starovoitov @ 2025-07-29 18:15 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Linus Torvalds, linux-fsdevel, linux-kernel, bpf

On Fri, Jul 25, 2025 at 01:27:15PM +0200, Christian Brauner wrote:
> Hey Linus,
> 
> /* Summary */
> These changes allow bpf to read extended attributes from cgroupfs.
> This is useful in redirecting AF_UNIX socket connections based on cgroup
> membership of the socket. One use-case is the ability to implement log
> namespaces in systemd so services and containers are redirected to
> different journals.
> 
> Please note that I plan on merging bpf changes related to the vfs
> exclusively via vfs trees.

That was not discussed and agreed upon.

> /* Testing */

The selftests/bpf had bugs flagged by BPF CI.

> /* Conflicts */
> 
> Merge conflicts with mainline
> =============================
> 
> No known conflicts.
> 
> Merge conflicts with other trees
> ================================
> 
> No known conflicts.

You were told a month ago that there are conflicts
and you were also told that the branch shouldn't be rebased,
yet you ignored it.

> Christian Brauner (3):
>       kernfs: remove iattr_mutex
>       Merge patch series "Introduce bpf_cgroup_read_xattr"
>       selftests/kernfs: test xattr retrieval
> 
> Song Liu (3):
>       bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node
>       bpf: Mark cgroup_subsys_state->cgroup RCU safe
>       selftests/bpf: Add tests for bpf_cgroup_read_xattr
> 
>  fs/bpf_fs_kfuncs.c                                 |  34 +++++
>  fs/kernfs/inode.c                                  |  70 ++++-----
>  kernel/bpf/helpers.c                               |   3 +
>  kernel/bpf/verifier.c                              |   5 +
>  tools/testing/selftests/bpf/bpf_experimental.h     |   3 +
>  .../selftests/bpf/prog_tests/cgroup_xattr.c        | 145 +++++++++++++++++++
>  .../selftests/bpf/progs/cgroup_read_xattr.c        | 158 +++++++++++++++++++++
>  .../selftests/bpf/progs/read_cgroupfs_xattr.c      |  60 ++++++++

Now Linus needs to resolve the conflicts again.
More details in bpf-next PR:
https://lore.kernel.org/bpf/20250729180626.35057-1-alexei.starovoitov@gmail.com/


* Re: [GIT PULL 11/14 for v6.17] vfs integrity
  2025-07-29  8:39         ` Linus Torvalds
@ 2025-07-31  8:00           ` Christian Brauner
  0 siblings, 0 replies; 44+ messages in thread
From: Christian Brauner @ 2025-07-31  8:00 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Christoph Hellwig, Hugh Dickins, Klara Modin, Arnd Bergmann,
	Anuj Gupta, linux-fsdevel, linux-kernel

> Right. Which is why I put it in the default: branch.

Thanks for fixing that up!

* Re: [GIT PULL 09/14 for v6.17] vfs bpf
  2025-07-29 18:15   ` Alexei Starovoitov
@ 2025-07-31  8:27     ` Christian Brauner
  2025-07-31 21:57       ` Alexei Starovoitov
  0 siblings, 1 reply; 44+ messages in thread
From: Christian Brauner @ 2025-07-31  8:27 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Linus Torvalds, linux-fsdevel, linux-kernel, bpf

On Tue, Jul 29, 2025 at 11:15:56AM -0700, Alexei Starovoitov wrote:
> On Fri, Jul 25, 2025 at 01:27:15PM +0200, Christian Brauner wrote:
> > Hey Linus,
> > 
> > /* Summary */
> > These changes allow bpf to read extended attributes from cgroupfs.
> > This is useful in redirecting AF_UNIX socket connections based on cgroup
> > membership of the socket. One use-case is the ability to implement log
> > namespaces in systemd so services and containers are redirected to
> > different journals.
> > 
> > Please note that I plan on merging bpf changes related to the vfs
> > exclusively via vfs trees.
> 
> That was not discussed and agreed upon.
> 
> > /* Testing */
> 
> The selftests/bpf had bugs flagged by BPF CI.
> 
> > /* Conflicts */
> > 
> > Merge conflicts with mainline
> > =============================
> > 
> > No known conflicts.
> > 
> > Merge conflicts with other trees
> > ================================
> > 
> > No known conflicts.
> 
> You were told a month ago that there are conflicts
> and you were also told that the branch shouldn't be rebased,
> yet you ignored it.
> 
> > Christian Brauner (3):
> >       kernfs: remove iattr_mutex
> >       Merge patch series "Introduce bpf_cgroup_read_xattr"
> >       selftests/kernfs: test xattr retrieval
> > 
> > Song Liu (3):
> >       bpf: Introduce bpf_cgroup_read_xattr to read xattr of cgroup's node
> >       bpf: Mark cgroup_subsys_state->cgroup RCU safe
> >       selftests/bpf: Add tests for bpf_cgroup_read_xattr
> > 
> >  fs/bpf_fs_kfuncs.c                                 |  34 +++++
> >  fs/kernfs/inode.c                                  |  70 ++++-----
> >  kernel/bpf/helpers.c                               |   3 +
> >  kernel/bpf/verifier.c                              |   5 +
> >  tools/testing/selftests/bpf/bpf_experimental.h     |   3 +
> >  .../selftests/bpf/prog_tests/cgroup_xattr.c        | 145 +++++++++++++++++++
> >  .../selftests/bpf/progs/cgroup_read_xattr.c        | 158 +++++++++++++++++++++
> >  .../selftests/bpf/progs/read_cgroupfs_xattr.c      |  60 ++++++++
> 
> Now Linus needs to resolve the conflicts again.
> More details in bpf-next PR:
> https://lore.kernel.org/bpf/20250729180626.35057-1-alexei.starovoitov@gmail.com/

As many times before you seem to conveniently misremember the facts.

Every tree that has meaningful VFS changes, such as adding new helpers,
uses a shared branch, as in this case, which touched kernfs and the
VFS.

The conflict arises from the fact that somehow you manage to maintain
all of the complexities of bpf but you refuse to make shared branches
work due to a simple merge conflict:

  "imo this shared branch experience wasn't good.
  We should have applied the series to bpf-next only.
  It was more bpf material than vfs. I wouldn't do this again."

  https://lore.kernel.org/r/CAADnVQ+pPt7Zt8gS0aW75WGrwjmcUcn3s37Ahd9bnLyzOfB=3g@mail.gmail.com

Something that we successfully manage with all other subsystems. Is it
perfect? Of course not.

But instead of trying to come to a simple solution you just stop
replying. That's not how this works.

The branch had a bug and I informed you and told you how I would resolve
it in:

  https://lore.kernel.org/r/20250702-hochmoderne-abklatsch-af9c605b57b2@brauner

It's been in -next a few days. Instead of slapping some hotfix on top
that leaves the tree in a broken state the fix was squashed. In other
words you would have to reapply the series anyway.

I also explicitly told you as a reply to the very issue in the same thread:

  "Anything that touches VFS will go through VFS. Shared
  branches work just fine. We manage to do this with everyone else in the
  kernel so bpf is able to do this as well. If you'd just asked this would
  not have been an issue. Merge conflicts are a fact of kernel
  development, we all deal with it you can too."

  https://lore.kernel.org/r/20250702-anhaften-postleitzahl-06a4d4771641@brauner

For the record, I don't have a problem with some stuff going through
other trees. For example, if Jens wanted to do that I'd go "hell yeah,
let's try and make this work."

The reason I'm hesitant to do it here is because of continuous mails
like the one you sent here where you aggressively spin a story and then
try to make someone take the blame.

I mean, your mail is very short of "Linus, I'm subtly telling you what
mean Christian did wrong and that he's rebased, which I know you hate
and you have to resolve merge conflicts so please yell at him.". Come
on.

I work hard to effectively cooperate with you but until there is a
good-faith mutual relationship on-list I don't want meaningful VFS work
going through the bpf tree. You can take it or leave it and I would
kindly ask Linus to respect that if he agrees.

* Re: [GIT PULL 14/14 for v6.17] vfs iomap
  2025-07-28 16:39     ` Joanne Koong
@ 2025-07-31  8:29       ` Christian Brauner
  0 siblings, 0 replies; 44+ messages in thread
From: Christian Brauner @ 2025-07-31  8:29 UTC (permalink / raw)
  To: Joanne Koong; +Cc: Sasha Levin, Linus Torvalds, linux-fsdevel, linux-kernel

> This was reported last week as well in [1]. The fix for this is in
> https://lore.kernel.org/linux-fsdevel/20250723230850.2395561-1-joannelkoong@gmail.com/

Thanks Joanne!

* Re: [GIT PULL 02/14 for v6.17] vfs coredump
  2025-07-28 18:57   ` Linus Torvalds
@ 2025-07-31  9:37     ` Christian Brauner
  0 siblings, 0 replies; 44+ messages in thread
From: Christian Brauner @ 2025-07-31  9:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, linux-kernel

On Mon, Jul 28, 2025 at 11:57:58AM -0700, Linus Torvalds wrote:
> On Fri, 25 Jul 2025 at 04:27, Christian Brauner <brauner@kernel.org> wrote:
> >
> > This will have a merge conflict with mainline that can be resolved as follows:
> 
> Bah. Mine looks very different, but the end result should be the same.
> I just made that final 'read()' match the same failure pattern as the
> other parts of the test.
> 
> Holler if I screwed up.

Thank you. This all looks good to me.

* Re: [GIT PULL 00/14 for v6.17] vfs 6.17
  2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
                   ` (13 preceding siblings ...)
  2025-07-25 11:27 ` [GIT PULL 13/14 for v6.17] vfs super Christian Brauner
@ 2025-07-31  9:40 ` Christian Brauner
  14 siblings, 0 replies; 44+ messages in thread
From: Christian Brauner @ 2025-07-31  9:40 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, linux-kernel

> Lucky for me the v6.17 merge window coincides with me moving. IOW, I'm
> currently getting squashed by moving boxes and disassembled furniture.

Fyi, the move is now mostly over. We're not really done yet setting
everything up and so on but I managed to get back behind a computer for
once. So I'm slowly trying to catch up with everything.

* Re: [GIT PULL 09/14 for v6.17] vfs bpf
  2025-07-31  8:27     ` Christian Brauner
@ 2025-07-31 21:57       ` Alexei Starovoitov
  2025-08-04 14:24         ` Christian Brauner
  0 siblings, 1 reply; 44+ messages in thread
From: Alexei Starovoitov @ 2025-07-31 21:57 UTC (permalink / raw)
  To: Christian Brauner; +Cc: Linus Torvalds, Linux-Fsdevel, LKML, bpf

On Thu, Jul 31, 2025 at 1:28 AM Christian Brauner <brauner@kernel.org> wrote:
>
> It's been in -next a few days. Instead of slapping some hotfix on top
> that leaves the tree in a broken state the fix was squashed. In other
> words you would have to reapply the series anyway.

That's not how stable branches work. The whole point of a stable
branch is that sha-s should not change. You don't squash things
after a branch is created.
That extra fix could have been easily added on top.

> I mean, your mail is very short of "Linus, I'm subtly telling you what
> mean Christian did wrong and that he's rebased, which I know you hate
> and you have to resolve merge conflicts so please yell at him.". Come
> on.

Not subtly. You made a mistake and instead of admitting it
you're doubling down on your wrong git process.

> I work hard to effectively cooperate with you but until there is a
> good-faith mutual relationship on-list I don't want meaningful VFS work
> going through the bpf tree. You can take it or leave it and I would
> kindly ask Linus to respect that if he agrees.

Look, you took bpf patches that BPF CI flagged as broken
and bpf maintainers didn't even ack.
Out of 4 patches that you applied one was yours that
touched VFS and 3 were bpf related.
That was a wtf moment, but we didn't complain,
since the feature is useful, so we were happy to see
it land even in this half broken form.
We applied your "stable" branch to bpf-next and added fixes on top.
Then you squashed "hotfix".
That made all of our fixes in bpf-next to become conflicts.
We cannot reapply your branch. We don't rebase the trees.
That was the policy for years. Started long ago during
net-next era and now in bpf-next too.
This time we were lucky that conflicts were not that bad
and it was easy enough for Linus to deal with them,
but that must not repeat.

Do not touch bpf patches if you refuse to follow
stable branch process that everyone else does.
And it's not VFS. It's really just you, Christian.
Back in August 2024 Al created a true stable branch
vfs/stable-struct_fd. We pulled it into bpf-next
in commit 50470d3899cd ("Merge remote-tracking branch 'vfs/stable-struct_fd'")
While Al sent a PR for it during the merge window:
https://lore.kernel.org/all/20240923034731.GF3413968@ZenIV/
On the kernel/bpf/* side we added more changes on top of Al's work,
and, surprise, there were no conflicts during the merge window.
That's how stable branches are meant to work.

* Re: [GIT PULL 09/14 for v6.17] vfs bpf
  2025-07-31 21:57       ` Alexei Starovoitov
@ 2025-08-04 14:24         ` Christian Brauner
  0 siblings, 0 replies; 44+ messages in thread
From: Christian Brauner @ 2025-08-04 14:24 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: Linus Torvalds, Linux-Fsdevel, LKML, bpf

On Thu, Jul 31, 2025 at 02:57:52PM -0700, Alexei Starovoitov wrote:
> On Thu, Jul 31, 2025 at 1:28 AM Christian Brauner <brauner@kernel.org> wrote:
> >
> > It's been in -next a few days. Instead of slapping some hotfix on top
> > that leaves the tree in a broken state the fix was squashed. In other
> > words you would have to reapply the series anyway.
> 
> That's not how stable branches work. The whole point of a stable
> branch is that sha-s should not change. You don't squash things
> after a branch is created.
> That extra fix could have been easily added on top.
> 
> > I mean, your mail is very short of "Linus, I'm subtly telling you what
> > mean Christian did wrong and that he's rebased, which I know you hate
> > and you have to resolve merge conflicts so please yell at him.". Come
> > on.
> 
> Not subtly. You made a mistake and instead of admitting it
> you're doubling down on your wrong git process.
> 
> > I work hard to effectively cooperate with you but until there is a
> > good-faith mutual relationship on-list I don't want meaningful VFS work
> > going through the bpf tree. You can take it or leave it and I would
> > kindly ask Linus to respect that if he agrees.
> 
> Look, you took bpf patches that BPF CI flagged as broken
> and bpf maintainers didn't even ack.
> Out of 4 patches that you applied one was yours that
> touched VFS and 3 were bpf related.
> That was a wtf moment, but we didn't complain,
> since the feature is useful, so we were happy to see
> it land even in this half broken form.
> We applied your "stable" branch to bpf-next and added fixes on top.
> Then you squashed "hotfix".
> That made all of our fixes in bpf-next to become conflicts.
> We cannot reapply your branch. We don't rebase the trees.
> That was the policy for years. Started long ago during
> net-next era and now in bpf-next too.
> This time we were lucky that conflicts were not that bad
> and it was easy enough for Linus to deal with them,
> but that must not repeat.

Ah, I see what you're complaining about now. But I'm still not happy
that we didn't manage to resolve this confusion earlier.

It was not clear to me in what way you relied on that branch, or that
you relied on me not folding in the mutex fix, especially because you
didn't reply when I said I would fold it and you said upthread that
putting fixes on top wouldn't work.

If I'm aware that a branch is shared and relied upon then I won't change it.
I would've immediately rolled it back had I known that this causes
issues for you but to me everything looked fine when I didn't hear back.

Thread overview: 44+ messages (newest: 2025-08-04 14:24 UTC)
2025-07-25 11:27 [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
2025-07-25 11:27 ` [GIT PULL 05/14 for v6.17] vfs async dir Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 09/14 for v6.17] vfs bpf Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-29 18:15   ` Alexei Starovoitov
2025-07-31  8:27     ` Christian Brauner
2025-07-31 21:57       ` Alexei Starovoitov
2025-08-04 14:24         ` Christian Brauner
2025-07-25 11:27 ` [GIT PULL 02/14 for v6.17] vfs coredump Christian Brauner
2025-07-28 18:57   ` Linus Torvalds
2025-07-31  9:37     ` Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 06/14 for v6.17] vfs fallocate Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 12/14 for v6.17] vfs fileattr Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 11/14 for v6.17] vfs integrity Christian Brauner
2025-07-28  1:29   ` Hugh Dickins
2025-07-28 22:21     ` Linus Torvalds
2025-07-29  7:49       ` Christoph Hellwig
2025-07-29  8:39         ` Linus Torvalds
2025-07-31  8:00           ` Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 14/14 for v6.17] vfs iomap Christian Brauner
2025-07-27 13:10   ` Sasha Levin
2025-07-28 16:39     ` Joanne Koong
2025-07-31  8:29       ` Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 01/14 for v6.17] vfs misc Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 07/14 for v6.17] vfs mmap Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 04/14 for v6.17] namespace updates Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 03/14 for v6.17] overlayfs Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 08/14 for v6.17] vfs pidfs Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 10/14 for v6.17] vfs rust Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-25 11:27 ` [GIT PULL 13/14 for v6.17] vfs super Christian Brauner
2025-07-28 23:40   ` pr-tracker-bot
2025-07-31  9:40 ` [GIT PULL 00/14 for v6.17] vfs 6.17 Christian Brauner
