public inbox for linux-man@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 00/10] man2: document "new" mount API
@ 2025-09-19  1:59 Aleksa Sarai
  2025-09-19  1:59 ` [PATCH v4 01/10] man/man2/mount_setattr.2: move mount_attr struct to mount_attr(2type) Aleksa Sarai
                   ` (10 more replies)
  0 siblings, 11 replies; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

Back in 2019, the new mount API was merged[1]. David Howells then set
about writing man pages for these new APIs, and sent some patches back
in 2020[2].

Unfortunately, these patches were never merged, which meant that these
APIs were practically undocumented for many years -- arguably this has
been a contributing factor to the relatively slow adoption of these new
(far better) APIs. For instance, I have often discovered that many folks
are unaware of the read(2)-based message retrieval interface provided by
filesystem context file descriptors.

In 2024, Christian Brauner adapted David Howell's original man pages
into the easier-to-edit Markdown format and published them on GitHub[3].
These have been maintained since, including updated information on new
features added since David Howells's 2020 draft pages (such as
MOVE_MOUNT_BENEATH).

While this was a welcome improvement to the previous status quo (that
had lasted over 6 years), speaking personally my experience is that not
having access to these man pages from the terminal has been a fairly
common painpoint.

So, this is a modern version of the man pages for these APIs, in the
hopes that we can finally (6 years later) get proper documentation for
these APIs in the man-pages project.

One important thing to note is that most of these were re-written by me,
with very minimal copying from the versions available from Christian[2].
The reasons for this are two-fold:

 * Both Howells's original version and Christian's maintained versions
   contain crucial mistakes that I have been bitten by in the past (the
   most obvious being that all of these APIs were merged in Linux 5.2,
   but the man pages all claim they were merged in different versions.)

 * As the man pages appear to have been written from Howells's
   perspective while implementing them, some of the wording is a little
   too tied to the implementation (or appears to describe features that
   don't really exist in the merged versions of these APIs).

 * The original versions of the man-pages lacked bigger-picture
   explanations of the reasoning behind the API, which would make it
   easier for readers to understand what operations are doing.

I decided that the best way to resolve these issues is to rewrite them
from the perspective of an actual user of these APIs (me), and check
that we do not repeat the mistakes I found in the originals. I have also
done my best to resolve the issues raised by Michael Kerrisk on the
original patchset sent by Howells[1].

In addition, I have also included a man page for open_tree_attr(2) (as a
subsection of the new open_tree(2) man page), which was merged in Linux
6.15.

[1]: https://lore.kernel.org/all/20190507204921.GL23075@ZenIV.linux.org.uk/
[2]: https://lore.kernel.org/linux-man/159680892602.29015.6551860260436544999.stgit@warthog.procyon.org.uk/
[3]: https://github.com/brauner/man-pages-md

Co-authored-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Co-authored-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
Changes in v4:
- `sed -i s|\\% |\\%|g`.
- Remove unneeded quotes in SYNOPSIS. [Alejandro Colomar]
- open_tree(2): fix leftover confusing usages of "attach" when referring
  to file descriptors being associated with mount objects.
- open_tree(2): rename "Anonymous mount namespaces" NOTES subsection to
  the far more informative "Mount propagation" and clean up the wording
  a little.
- open_tree_attr(2): add a code comment about
  <https://lore.kernel.org/all/20250808-open_tree_attr-bugfix-idmap-v1-0-0ec7bc05646c@cyphar.com/>
- {fsconfig,open_tree_attr}(2): use _Nullable.
- {fsmount,open_tree}(2): mention the the unmount-on-close behaviour is
  actually lazy (a-la MNT_DETACH).
- {fsconfig,mount_setattr}(2): improve "mount attributes and filesystem
  parameters" wording to make it clearer that superblock and mount flags
  are sibling properties, not the same thing.
- open_tree(2): mention that any mount propagation events while the mount
  object is detached are completely lost -- i.e., they don't get replayed once
  you attach the mount somewhere.
- fsconfig(2): fix minor grammatical / missing joining word issues.
- fsconfig(2): fix final leftover `.IR A " and " B` cases.
- fsconfig(2): explain that failed fsconfig(FSCONFIG_CMD_*) operations render
  the filesystem context invalid.
- fsconfig(2): rework the description of superblock reuse, as the previous text
  was very wrong. (Though there has been discussion about changing this
  behaviour...)
- fsconfig(2): remove misleading wording in FSCONFIG_CMD_CREATE_EXCL about how
  we are requesting a new filesystem instance -- in theory filesystems could
  take this request into account but in practice none do (and it seems unlikely
  any ever will).
- fsconfig(2): mention that key, value, and aux must be 0 or NULL for
  FSCONFIG_CMD_RECONF.
- fsmount(2): fix usage of "filesystem instance" in relation to fsmount() and
  open_tree() comparison. [Askar Safin]
- move_mount(2): "as attached" -> "as a detached" [Askar Safin]
- fspick(2): add note about filesystem parameter list being copied rather than
  reset with FSCONFIG_CMD_RECONFIGURE. [Askar Safin]
- v3: <https://lore.kernel.org/r/20250809-new-mount-api-v3-0-f61405c80f34@cyphar.com>

Changes in v3:
- `sed -i s|Co-developed-by|Co-authored-by|g`. [Alejandro Colomar]
  - Add Signed-off-by for co-authors. [Christian Brauner]
- `sed -i s|needs-mount|awaiting-mount|g`, to match the kernel parlance.
- Fix VERSIONS/HISTORY mixup in mount_attr(2type) that was copied from
  open_how(2type). [Alejandro Colomar]
- Fix incorrect .BR usage in SYNOPSIS.
- Some more semantic newlines fixes. [Alejandro Colomar]
- Minor fixes suggested by Alejandro. [Alejandro Colomar]
- open_tree_attr(2): heavily reword everything to be better formatted
  and more explicit about its behaviour.
- open_tree(2): write proper explanatory paragraphs for the EXAMPLES.
- mount_setattr(2): fix stray doublequote in SYNOPSIS. [Askar Safin]
- fsopen(2): rework structure of the DESCRIPTION introduction.
- fsopen(2): explicitly say that read(2) errors in the message retrieval
  interface are actual errors, not return 0. [Askar Safin]
- fsopen(2): add BUGS section to describe the unfortunate -ENODATA
  message dropping behaviour that should be fixed by
  <https://lore.kernel.org/r/20250807-fscontext-log-cleanups-v3-0-8d91d6242dc3@cyphar.com/>.
- fsconfig(2): add a NOTES subsection about generic filesystem
  parameters.
- fsconfig(2): add comment about the weirdness surrounding
  FSCONFIG_SET_PATH.
- {fspick,open_tree}(2): Correct AT_NO_AUTOMOUNT description (copied
  from David, who probably copied it from statx(2)) -- AT_NO_AUTOMOUNT
  applies to all path components, not just the final one. [Christian
  Brauner]
- statx(2): fix AT_NO_AUTOMOUNT documentation.
- open_tree(2): swap open(2) reference for openat(2) when saying that
  the result is identical. [Askar Safin]
- fsmount(2): fix DESCRIPTION introduction, and rework attr_flags
  description to better reference mount_setattr(2).
- {fsopen,fspick,fsmount,open_tree}(2): don't use "attach" when talking
  about the file descriptors we return that reference in-kernel objects,
  to avoid confusing readers with mount object attachment status.
- fsconfig(2): remove pidns argument example, as it was kind of unclear
  and referenced kernel features not yet merged.
- fsconfig(2): remove rambling FSCONFIG_SET_PATH_EMPTY text (which
  mostly describes an academic issue that doesn't apply to any existing
  filesystem), and instead add a CAVEATS section which touches on the
  weird type behaviour of fsconfig(2).
- v2: <https://lore.kernel.org/r/20250807-new-mount-api-v2-0-558a27b8068c@cyphar.com>

Changes in v2:
- `make -R lint-man`. [Alejandro Colomar]
- `sed -i s|Glibc|glibc|g`. [Alejandro Colomar]
- `sed -i s|pathname|path|g` [Alejandro Colomar]
- Clean up macro usage, example code, and synopsis. [Alejandro Colomar]
- Try to use semantic newlines. [Alejandro Colomar]
- Make sure the usage of "filesystem context", "filesystem instance",
  and "mount object" are consistent. [Askar Safin]
- Avoid referring to these syscalls without an "at" suffix as "*at()
  syscalls". [Askar Safin]
- Use \% to avoid hyphenation of constants. [Askar Safin, G. Branden Robinson]
- Add a new subsection to mount_setattr(2) to describe the distinction
  between mount attributes and filesystem parameters.
- (Under protest) double-space-after-period formatted commit messages.
- v1: <https://lore.kernel.org/r/20250806-new-mount-api-v1-0-8678f56c6ee0@cyphar.com>

---
Aleksa Sarai (10):
      man/man2/mount_setattr.2: move mount_attr struct to mount_attr(2type)
      man/man2/fsopen.2: document "new" mount API
      man/man2/fspick.2: document "new" mount API
      man/man2/fsconfig.2: document "new" mount API
      man/man2/fsmount.2: document "new" mount API
      man/man2/move_mount.2: document "new" mount API
      man/man2/open_tree.2: document "new" mount API
      man/man2/mount_setattr.2: mirror opening sentence from fsopen(2)
      man/man2/open_tree{,_attr}.2: document new open_tree_attr() API
      man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction

 man/man2/fsconfig.2           | 739 ++++++++++++++++++++++++++++++++++++++++++
 man/man2/fsmount.2            | 231 +++++++++++++
 man/man2/fsopen.2             | 384 ++++++++++++++++++++++
 man/man2/fspick.2             | 342 +++++++++++++++++++
 man/man2/mount_setattr.2      |  63 +++-
 man/man2/move_mount.2         | 646 ++++++++++++++++++++++++++++++++++++
 man/man2/open_tree.2          | 638 ++++++++++++++++++++++++++++++++++++
 man/man2/open_tree_attr.2     |   1 +
 man/man2type/mount_attr.2type |  61 ++++
 9 files changed, 3092 insertions(+), 13 deletions(-)
---
base-commit: e86f9fd0c279f593242969a2fbb5ef379272d89d
change-id: 20250802-new-mount-api-436db984f432


Kind regards,
-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/


^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v4 01/10] man/man2/mount_setattr.2: move mount_attr struct to mount_attr(2type)
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-19 16:04   ` Alejandro Colomar
  2025-09-19  1:59 ` [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API Aleksa Sarai
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

As with open_how(2type), it makes sense to move this to a separate man
page.  In addition, future man pages added in this patchset will want to
reference mount_attr(2type).

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/mount_setattr.2      | 17 ++++--------
 man/man2type/mount_attr.2type | 61 +++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 66 insertions(+), 12 deletions(-)

diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2
index 586633f48e894bf8f2823aa7755c96adcddea6a6..4b55f6d2e09d00d9bc4b3a085f310b1b459f34e8 100644
--- a/man/man2/mount_setattr.2
+++ b/man/man2/mount_setattr.2
@@ -114,18 +114,11 @@ .SH DESCRIPTION
 .I attr
 argument of
 .BR mount_setattr ()
-is a structure of the following form:
-.P
-.in +4n
-.EX
-struct mount_attr {
-    __u64 attr_set;     /* Mount properties to set */
-    __u64 attr_clr;     /* Mount properties to clear */
-    __u64 propagation;  /* Mount propagation type */
-    __u64 userns_fd;    /* User namespace file descriptor */
-};
-.EE
-.in
+is a pointer to a
+.I mount_attr
+structure,
+described in
+.BR mount_attr (2type).
 .P
 The
 .I attr_set
diff --git a/man/man2type/mount_attr.2type b/man/man2type/mount_attr.2type
new file mode 100644
index 0000000000000000000000000000000000000000..f5c4f48be46ec1e6c0d3a211b6724a1e95311a41
--- /dev/null
+++ b/man/man2type/mount_attr.2type
@@ -0,0 +1,61 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH mount_attr 2type (date) "Linux man-pages (unreleased)"
+.SH NAME
+mount_attr \- what mount properties to set and clear
+.SH LIBRARY
+Linux kernel headers
+.SH SYNOPSIS
+.EX
+.B #include <sys/mount.h>
+.P
+.B struct mount_attr {
+.BR "    u64 attr_set;" "     /* Mount properties to set */"
+.BR "    u64 attr_clr;" "     /* Mount properties to clear */"
+.BR "    u64 propagation;" "  /* Mount propagation type */"
+.BR "    u64 userns_fd;" "    /* User namespace file descriptor */"
+    /* ... */
+.B };
+.EE
+.SH DESCRIPTION
+Specifies which mount properties should be changed with
+.BR mount_setattr (2).
+.P
+The fields are as follows:
+.TP
+.I .attr_set
+This field specifies which
+.BI MOUNT_ATTR_ *
+attribute flags to set.
+.TP
+.I .attr_clr
+This field specifies which
+.BI MOUNT_ATTR_ *
+attribute flags to clear.
+.TP
+.I .propagation
+This field specifies what mount propagation will be applied.
+The valid values of this field are the same propagation types described in
+.BR mount_namespaces (7).
+.TP
+.I .userns_fd
+This field specifies a file descriptor that indicates which user namespace to
+use as a reference for ID-mapped mounts with
+.BR MOUNT_ATTR_IDMAP .
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.12.
+.\" commit 2a1867219c7b27f928e2545782b86daaf9ad50bd
+glibc 2.36.
+.P
+Extra fields may be appended to the structure,
+with a zero value in a new field resulting in
+the kernel behaving as though that extension field was not present.
+Therefore, a user
+.I must
+zero-fill this structure on initialization.
+.SH SEE ALSO
+.BR mount_setattr (2)

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
  2025-09-19  1:59 ` [PATCH v4 01/10] man/man2/mount_setattr.2: move mount_attr struct to mount_attr(2type) Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-19 16:24   ` Alejandro Colomar
  2025-09-19  1:59 ` [PATCH v4 03/10] man/man2/fspick.2: " Aleksa Sarai
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

This is loosely based on the original documentation written by David
Howells and later maintained by Christian Brauner, but has been
rewritten to be more from a user perspective (as well as fixing a few
critical mistakes).

Co-authored-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Co-authored-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/fsopen.2 | 384 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 384 insertions(+)

diff --git a/man/man2/fsopen.2 b/man/man2/fsopen.2
new file mode 100644
index 0000000000000000000000000000000000000000..7cdbeac7d64b7e5c969dee619a039ec947d1e981
--- /dev/null
+++ b/man/man2/fsopen.2
@@ -0,0 +1,384 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH fsopen 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+fsopen \- create a new filesystem context
+.SH LIBRARY
+Standard C library
+.RI ( libc ,\~ \-lc )
+.SH SYNOPSIS
+.nf
+.B #include <sys/mount.h>
+.P
+.BI "int fsopen(const char *" fsname ", unsigned int " flags );
+.fi
+.SH DESCRIPTION
+The
+.BR fsopen ()
+system call is part of
+the suite of file descriptor based mount facilities in Linux.
+.P
+.BR fsopen ()
+creates a blank filesystem configuration context within the kernel
+for the filesystem named by
+.I fsname
+and places it into creation mode.
+A new file descriptor
+associated with the filesystem configuration context
+is then returned.
+The calling process must have the
+.B \%CAP_SYS_ADMIN
+capability in order to create a new filesystem configuration context.
+.P
+A filesystem configuration context is
+an in-kernel representation of a pending transaction,
+containing a set of configuration parameters that are to be applied
+when creating a new instance of a filesystem
+(or modifying the configuration of an existing filesystem instance,
+such as when using
+.BR fspick (2)).
+.P
+After obtaining a filesystem configuration context with
+.BR fsopen (),
+the general workflow for operating on the context looks like the following:
+.IP (1) 5
+Pass the filesystem context file descriptor to
+.BR fsconfig (2)
+to specify any desired filesystem parameters.
+This may be done as many times as necessary.
+.IP (2)
+Pass the same filesystem context file descriptor to
+.BR fsconfig (2)
+with
+.B \%FSCONFIG_CMD_CREATE
+to create an instance of the configured filesystem.
+.IP (3)
+Pass the same filesystem context file descriptor to
+.BR fsmount (2)
+to create a new detached mount object for
+the root of the filesystem instance,
+which is then attached to a new file descriptor.
+(This also places the filesystem context file descriptor into
+reconfiguration mode,
+similar to the mode produced by
+.BR fspick (2).)
+Once a mount object has been created with
+.BR fsmount (2),
+the filesystem context file descriptor can be safely closed.
+.IP (4)
+Now that a mount object has been created,
+you may
+.RS
+.IP (4.1) 7
+use the detached mount object file descriptor as a
+.I dirfd
+argument to "*at()" system calls; and/or
+.IP (4.2) 7
+attach the mount object to a mount point
+by passing the mount object file descriptor to
+.BR move_mount (2).
+This will also prevent the mount object from
+being unmounted and destroyed when
+the mount object file descriptor is closed.
+.RE
+.IP
+The mount object file descriptor will
+remain associated with the mount object
+even after doing the above operations,
+so you may repeatedly use the mount object file descriptor with
+.BR move_mount (2)
+and/or "*at()" system calls
+as many times as necessary.
+.P
+A filesystem context will move between different modes
+throughout its lifecycle
+(such as the creation phase
+when created with
+.BR fsopen (),
+the reconfiguration phase
+when an existing filesystem instance is selected with
+.BR fspick (2),
+and the intermediate "awaiting-mount" phase
+.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this.
+between
+.BR \%FSCONFIG_CMD_CREATE
+and
+.BR fsmount (2)),
+which has an impact on
+what operations are permitted on the filesystem context.
+.P
+The file descriptor returned by
+.BR fsopen ()
+also acts as a channel for filesystem drivers to
+provide more comprehensive diagnostic information
+than is normally provided through the standard
+.BR errno (3)
+interface for system calls.
+If an error occurs at any time during the workflow mentioned above,
+calling
+.BR read (2)
+on the filesystem context file descriptor
+will retrieve any ancillary information about the encountered errors.
+(See the "Message retrieval interface" section
+for more details on the message format.)
+.P
+.I flags
+can be used to control aspects of
+the creation of the filesystem configuration context file descriptor.
+A value for
+.I flags
+is constructed by bitwise ORing
+zero or more of the following constants:
+.RS
+.TP
+.B FSOPEN_CLOEXEC
+Set the close-on-exec
+.RB ( FD_CLOEXEC )
+flag on the new file descriptor.
+See the description of the
+.B O_CLOEXEC
+flag in
+.BR open (2)
+for reasons why this may be useful.
+.RE
+.P
+A list of filesystems supported by the running kernel
+(and thus a list of valid values for
+.IR fsname )
+can be obtained from
+.IR /proc/filesystems .
+(See also
+.BR proc_filesystems (5).)
+.SS Message retrieval interface
+When doing operations on a filesystem configuration context,
+the filesystem driver may choose to provide
+ancillary information to userspace
+in the form of message strings.
+.P
+The filesystem context file descriptors returned by
+.BR fsopen ()
+and
+.BR fspick (2)
+may be queried for message strings at any time by calling
+.BR read (2)
+on the file descriptor.
+Each call to
+.BR read (2)
+will return a single message,
+prefixed to indicate its class:
+.RS
+.TP
+\fBe\fP <\fImessage\fP>
+An error message was logged.
+This is usually associated with an error being returned
+from the corresponding system call which triggered this message.
+.TP
+\fBw\fP <\fImessage\fP>
+A warning message was logged.
+.TP
+\fBi\fP <\fImessage\fP>
+An informational message was logged.
+.RE
+.P
+Messages are removed from the queue as they are read.
+Note that the message queue has limited depth,
+so it is possible for messages to get lost.
+If there are no messages in the message queue,
+.B read(2)
+will return \-1 and
+.I errno
+will be set to
+.BR \%ENODATA .
+If the
+.I buf
+argument to
+.BR read (2)
+is not large enough to contain the entire message,
+.BR read (2)
+will return \-1 and
+.I errno
+will be set to
+.BR \%EMSGSIZE .
+(See BUGS.)
+.P
+If there are multiple filesystem contexts
+referencing the same filesystem instance
+(such as if you call
+.BR fspick (2)
+multiple times for the same mount),
+each one gets its own independent message queue.
+This does not apply to multiple file descriptors that are
+tied to the same underlying open file description
+(such as those created with
+.BR dup (2)).
+.P
+Message strings will usually be prefixed by
+the name of the filesystem or kernel subsystem
+that logged the message,
+though this may not always be the case.
+See the Linux kernel source code for details.
+.SH RETURN VALUE
+On success, a new file descriptor is returned.
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+.TP
+.B EFAULT
+.I fsname
+is NULL
+or a pointer to a location
+outside the calling process's accessible address space.
+.TP
+.B EINVAL
+.I flags
+had an invalid flag set.
+.TP
+.B EMFILE
+The calling process has too many open files to create more.
+.TP
+.B ENFILE
+The system has too many open files to create more.
+.TP
+.B ENODEV
+The filesystem named by
+.I fsname
+is not supported by the kernel.
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the operation.
+.TP
+.B EPERM
+The calling process does not have the required
+.B \%CAP_SYS_ADMIN
+capability.
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.2.
+.\" commit 24dcb3d90a1f67fe08c68a004af37df059d74005
+.\" commit 400913252d09f9cfb8cce33daee43167921fc343
+glibc 2.36.
+.SH BUGS
+.SS Message retrieval interface and \fB\%EMSGSIZE\fP
+As described in the "Message retrieval interface" subsection above,
+calling
+.BR read (2)
+with too small a buffer to contain
+the next pending message in the message queue
+for the filesystem configuration context
+will cause
+.BR read (2)
+to return \-1 and set
+.BR errno (3)
+to
+.BR \%EMSGSIZE .
+.P
+However,
+this failed operation still
+consumes the message from the message queue.
+This effectively discards the message silently,
+as no data is copied into the
+.BR read (2)
+buffer.
+.P
+Programs should take care to ensure that
+their buffers are sufficiently large
+to contain any reasonable message string,
+in order to avoid silently losing valuable diagnostic information.
+.\" Aleksa Sarai
+.\"   This unfortunate behaviour has existed since this feature was merged, but
+.\"   I have sent a patchset which will finally fix it.
+.\"   <https://lore.kernel.org/r/20250807-fscontext-log-cleanups-v3-1-8d91d6242dc3@cyphar.com/>
+.SH EXAMPLES
+To illustrate the workflow for creating a new mount,
+the following is an example of how to mount an
+.BR ext4 (5)
+filesystem stored on
+.I /dev/sdb1
+onto
+.IR /mnt .
+.P
+.in +4n
+.EX
+int fsfd, mntfd;
+\&
+fsfd = fsopen("ext4", FSOPEN_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_PATH, "source", "/dev/sdb1", AT_FDCWD);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "iversion", NULL, 0)
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_RELATIME);
+move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+First,
+an ext4 configuration context is created and attached to the file descriptor
+.IR fsfd .
+Then, a series of parameters
+(such as the source of the filesystem)
+are provided using
+.BR fsconfig (2),
+followed by the filesystem instance being created with
+.BR \%FSCONFIG_CMD_CREATE .
+.BR fsmount (2)
+is then used to create a new mount object attached to the file descriptor
+.IR mntfd ,
+which is then attached to the intended mount point using
+.BR move_mount (2).
+.P
+The above procedure is functionally equivalent to
+the following mount operation using
+.BR mount (2):
+.P
+.in +4n
+.EX
+mount("/dev/sdb1", "/mnt", "ext4", MS_RELATIME,
+      "ro,noatime,acl,user_xattr,iversion");
+.EE
+.in
+.P
+And here's an example of creating a mount object
+of an NFS server share
+and setting a Smack security module label.
+However, instead of attaching it to a mount point,
+the program uses the mount object directly
+to open a file from the NFS share.
+.P
+.in +4n
+.EX
+int fsfd, mntfd, fd;
+\&
+fsfd = fsopen("nfs", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "example.com/pub/linux", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "nfsvers", "3", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "rsize", "65536", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "wsize", "65536", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "smackfsdef", "foolabel", 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "rdma", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, 0, MOUNT_ATTR_NODEV);
+fd = openat(mntfd, "src/linux-5.2.tar.xz", O_RDONLY);
+.EE
+.in
+.P
+Unlike the previous example,
+this operation has no trivial equivalent with
+.BR mount (2),
+as it was not previously possible to create a mount object
+that is not attached to any mount point.
+.SH SEE ALSO
+.BR fsconfig (2),
+.BR fsmount (2),
+.BR fspick (2),
+.BR mount (2),
+.BR mount_setattr (2),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR mount_namespaces (7)

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 03/10] man/man2/fspick.2: document "new" mount API
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
  2025-09-19  1:59 ` [PATCH v4 01/10] man/man2/mount_setattr.2: move mount_attr struct to mount_attr(2type) Aleksa Sarai
  2025-09-19  1:59 ` [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-21  8:53   ` Alejandro Colomar
  2025-09-22  0:25   ` Askar Safin
  2025-09-19  1:59 ` [PATCH v4 04/10] man/man2/fsconfig.2: " Aleksa Sarai
                   ` (7 subsequent siblings)
  10 siblings, 2 replies; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

This is loosely based on the original documentation written by David
Howells and later maintained by Christian Brauner, but has been
rewritten to be more from a user perspective (as well as fixing a few
critical mistakes).

Co-authored-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Co-authored-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/fspick.2 | 342 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 342 insertions(+)

diff --git a/man/man2/fspick.2 b/man/man2/fspick.2
new file mode 100644
index 0000000000000000000000000000000000000000..1f87293f44658adeb7ab7cffebcac3174888f040
--- /dev/null
+++ b/man/man2/fspick.2
@@ -0,0 +1,342 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH fspick 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+fspick \- select filesystem for reconfiguration
+.SH LIBRARY
+Standard C library
+.RI ( libc ,\~ \-lc )
+.SH SYNOPSIS
+.nf
+.BR "#include <fcntl.h>" "          /* Definition of " AT_* " constants */"
+.B #include <sys/mount.h>
+.P
+.BI "int fspick(int " dirfd ", const char *" path ", unsigned int " flags );
+.fi
+.SH DESCRIPTION
+The
+.BR fspick ()
+system call is part of
+the suite of file descriptor based mount facilities in Linux.
+.P
+.BR fspick()
+creates a new filesystem configuration context
+for the extant filesystem instance
+associated with the path described by
+.IR dirfd
+and
+.IR path ,
+places it into reconfiguration mode
+(similar to
+.BR mount (8)
+with the
+.I -o remount
+option).
+A new file descriptor
+associated with the filesystem configuration context
+is then returned.
+The calling process must have the
+.BR CAP_SYS_ADMIN
+capability in order to create a new filesystem configuration context.
+.P
+The resultant file descriptor can be used with
+.BR fsconfig (2)
+to specify the desired set of changes to
+filesystem parameters of the filesystem instance.
+Once the desired set of changes have been configured,
+the changes can be effectuated by calling
+.BR fsconfig (2)
+with the
+.B \%FSCONFIG_CMD_RECONFIGURE
+command.
+Please note that\[em]in contrast to
+the behaviour of
+.B MS_REMOUNT
+with
+.BR mount (2)\[em] fspick ()
+instantiates the filesystem configuration context
+with a copy of
+the extant filesystem's filesystem parameters,
+meaning that a subsequent
+.B \%FSCONFIG_CMD_RECONFIGURE
+operation
+will only update filesystem parameters
+explicitly modified with
+.BR fsconfig (2).
+.P
+As with "*at()" system calls,
+.BR fspick ()
+uses the
+.I dirfd
+argument in conjunction with the
+.I path
+argument to determine the path to operate on, as follows:
+.IP \[bu] 3
+If the pathname given in
+.I path
+is absolute, then
+.I dirfd
+is ignored.
+.IP \[bu]
+If the pathname given in
+.I path
+is relative and
+.I dirfd
+is the special value
+.BR \%AT_FDCWD ,
+then
+.I path
+is interpreted relative to
+the current working directory
+of the calling process (like
+.BR open (2)).
+.IP \[bu]
+If the pathname given in
+.I path
+is relative,
+then it is interpreted relative to
+the directory referred to by the file descriptor
+.I dirfd
+(rather than relative to
+the current working directory
+of the calling process,
+as is done by
+.BR open (2)
+for a relative pathname).
+In this case,
+.I dirfd
+must be a directory
+that was opened for reading
+.RB ( O_RDONLY )
+or using the
+.B O_PATH
+flag.
+.IP \[bu]
+If
+.I path
+is an empty string,
+and
+.I flags
+contains
+.BR \%FSPICK_EMPTY_PATH ,
+then the file descriptor
+.I dirfd
+is operated on directly.
+In this case,
+.I dirfd
+may refer to any type of file,
+not just a directory.
+.P
+See
+.BR openat (2)
+for an explanation of why the
+.I dirfd
+argument is useful.
+.P
+.I flags
+can be used to control aspects of how
+.I path
+is resolved and
+properties of the returned file descriptor.
+A value for
+.I flags
+is constructed by bitwise ORing
+zero or more of the following constants:
+.RS
+.TP
+.B FSPICK_CLOEXEC
+Set the close-on-exec
+.RB ( FD_CLOEXEC )
+flag on the new file descriptor.
+See the description of the
+.B O_CLOEXEC
+flag in
+.BR open (2)
+for reasons why this may be useful.
+.TP
+.B FSPICK_EMPTY_PATH
+If
+.I path
+is an empty string,
+operate on the file referred to by
+.I dirfd
+(which may have been obtained from
+.BR open (2),
+.BR fsmount (2),
+or
+.BR open_tree (2)).
+In this case,
+.I dirfd
+may refer to any type of file,
+not just a directory.
+If
+.I dirfd
+is
+.BR \%AT_FDCWD ,
+.BR fspick ()
+will operate on the current working directory
+of the calling process.
+.TP
+.B FSPICK_SYMLINK_NOFOLLOW
+Do not follow symbolic links
+in the terminal component of
+.IR path .
+If
+.I path
+references a symbolic link,
+the returned filesystem context will reference
+the filesystem that the symbolic link itself resides on.
+.TP
+.B FSPICK_NO_AUTOMOUNT
+Do not automount any automount points encountered
+while resolving
+.IR path .
+This allows you to reconfigure an automount point,
+rather than the location that would be mounted.
+This flag has no effect if
+the automount point has already been mounted over.
+.RE
+.P
+As with filesystem contexts created with
+.BR fsopen (2),
+the file descriptor returned by
+.BR fspick ()
+may be queried for message strings at any time by calling
+.BR read (2)
+on the file descriptor.
+(See the "Message retrieval interface" subsection in
+.BR fsopen (2)
+for more details on the message format.)
+.SH RETURN VALUE
+On success, a new file descriptor is returned.
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+.TP
+.B EACCES
+Search permission is denied
+for one of the directories
+in the path prefix of
+.IR path .
+(See also
+.BR path_resolution (7).)
+.TP
+.B EBADF
+.I path
+is relative but
+.I dirfd
+is neither
+.B \%AT_FDCWD
+nor a valid file descriptor.
+.TP
+.B EFAULT
+.I path
+is NULL
+or a pointer to a location
+outside the calling process's accessible address space.
+.TP
+.B EINVAL
+Invalid flag specified in
+.IR flags .
+.TP
+.B ELOOP
+Too many symbolic links encountered when resolving
+.IR path .
+.TP
+.B EMFILE
+The calling process has too many open files to create more.
+.TP
+.B ENAMETOOLONG
+.I path
+is longer than
+.BR PATH_MAX .
+.TP
+.B ENFILE
+The system has too many open files to create more.
+.TP
+.B ENOENT
+A component of
+.I path
+does not exist,
+or is a dangling symbolic link.
+.TP
+.B ENOENT
+.I path
+is an empty string, but
+.B \%FSPICK_EMPTY_PATH
+is not specified in
+.IR flags .
+.TP
+.B ENOTDIR
+A component of the path prefix of
+.I path
+is not a directory;
+or
+.I path
+is relative and
+.I dirfd
+is a file descriptor referring to a file other than a directory.
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the operation.
+.TP
+.B EPERM
+The calling process does not have the required
+.B \%CAP_SYS_ADMIN
+capability.
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.2.
+.\" commit cf3cba4a429be43e5527a3f78859b1bfd9ebc5fb
+.\" commit 400913252d09f9cfb8cce33daee43167921fc343
+glibc 2.36.
+.SH EXAMPLES
+The following example sets the read-only flag
+on the filesystem instance referenced by
+the mount object attached at
+.IR /tmp .
+.P
+.in +4n
+.EX
+int fsfd = fspick(AT_FDCWD, "/tmp", FSPICK_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
+.EE
+.in
+.P
+The above procedure is roughly equivalent to
+the following mount operation using
+.BR mount (2):
+.P
+.in +4n
+.EX
+mount(NULL, "/tmp", NULL, MS_REMOUNT | MS_RDONLY, NULL);
+.EE
+.in
+.P
+With the notable caveat that
+in this example,
+.BR mount (2)
+will clear all other filesystem parameters
+(such as
+.B MS_NOSUID
+or
+.BR MS_NOEXEC );
+.BR fsconfig (2)
+will only modify the
+.I ro
+parameter.
+.SH SEE ALSO
+.BR fsconfig (2),
+.BR fsmount (2),
+.BR fsopen (2),
+.BR mount (2),
+.BR mount_setattr (2),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR mount_namespaces (7)
+

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 04/10] man/man2/fsconfig.2: document "new" mount API
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
                   ` (2 preceding siblings ...)
  2025-09-19  1:59 ` [PATCH v4 03/10] man/man2/fspick.2: " Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-21  9:26   ` Alejandro Colomar
  2025-09-19  1:59 ` [PATCH v4 05/10] man/man2/fsmount.2: " Aleksa Sarai
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

This is loosely based on the original documentation written by David
Howells and later maintained by Christian Brauner, but has been
rewritten to be more from a user perspective (as well as fixing a few
critical mistakes).

Co-authored-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Co-authored-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/fsconfig.2 | 727 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 727 insertions(+)

diff --git a/man/man2/fsconfig.2 b/man/man2/fsconfig.2
new file mode 100644
index 0000000000000000000000000000000000000000..5a18e08c700ac93aa22c341b4134944ee3c38d0b
--- /dev/null
+++ b/man/man2/fsconfig.2
@@ -0,0 +1,727 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH fsconfig 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+fsconfig \- configure new or existing filesystem context
+.SH LIBRARY
+Standard C library
+.RI ( libc ,\~ \-lc )
+.SH SYNOPSIS
+.nf
+.B #include <sys/mount.h>
+.P
+.BI "int fsconfig(int " fd ", unsigned int " cmd ,
+.BI "             const char *_Nullable " key ,
+.BI "             const void *_Nullable " value ", int " aux );
+.fi
+.SH DESCRIPTION
+The
+.BR fsconfig ()
+system call is part of
+the suite of file descriptor based mount facilities in Linux.
+.P
+.BR fsconfig ()
+is used to supply parameters to
+and issue commands against
+the filesystem configuration context
+associated with the file descriptor
+.IR fd .
+Filesystem configuration contexts can be created with
+.BR fsopen (2)
+or be instantiated from an extant filesystem instance with
+.BR fspick (2).
+.P
+The
+.I cmd
+argument indicates the command to be issued.
+Some commands supply parameters to the context
+(equivalent to mount options specified with
+.BR mount (8)),
+while others are meta-operations on the filesystem context.
+The list of valid
+.I cmd
+values are:
+.RS
+.TP
+.B FSCONFIG_SET_FLAG
+Set the flag parameter named by
+.IR key .
+.I value
+must be NULL,
+and
+.I aux
+must be 0.
+.TP
+.B FSCONFIG_SET_STRING
+Set the string parameter named by
+.I key
+to the value specified by
+.IR value .
+.I value
+points to a null-terminated string,
+and
+.I aux
+must be 0.
+.TP
+.B FSCONFIG_SET_BINARY
+Set the blob parameter named by
+.I key
+to the contents of the binary blob
+specified by
+.IR value .
+.I value
+points to
+the start of a buffer
+that is
+.I aux
+bytes in length.
+.TP
+.B FSCONFIG_SET_FD
+Set the file parameter named by
+.I key
+to the open file description
+referenced by the file descriptor
+.IR aux .
+.I value
+must be NULL.
+.IP
+You may also use
+.B \%FSCONFIG_SET_STRING
+for file parameters,
+with
+.I value
+set to a null-terminated string
+containing a base-10 representation
+of the file descriptor number.
+This mechanism is primarily intended for compatibility
+with older
+.BR mount (2)-based
+programs,
+and only works for parameters
+that
+.I only
+accept file descriptor arguments.
+.TP
+.B FSCONFIG_SET_PATH
+Set the path parameter named by
+.I key
+to the object at a provided path,
+resolved in a similar manner to
+.BR openat (2).
+.I value
+points to a null-terminated pathname string,
+and
+.I aux
+is equivalent to the
+.I dirfd
+argument to
+.BR openat (2).
+See
+.BR openat (2)
+for an explanation of the need for
+.BR \%FSCONFIG_SET_PATH .
+.IP
+You may also use
+.B \%FSCONFIG_SET_STRING
+for path parameters,
+the behaviour of which is equivalent to
+.B \%FSCONFIG_SET_PATH
+with
+.I aux
+set to
+.BR \%AT_FDCWD .
+.TP
+.B FSCONFIG_SET_PATH_EMPTY
+As with
+.BR \%FSCONFIG_SET_PATH ,
+except that if
+.I value
+is an empty string,
+the file descriptor specified by
+.I aux
+is operated on directly
+and may be any type of file
+(not just a directory).
+This is equivalent to the behaviour of
+.B \%AT_EMPTY_PATH
+with most "*at()" system calls.
+If
+.I aux
+is
+.BR \%AT_FDCWD ,
+the parameter will be set to
+the current working directory
+of the calling process.
+.TP
+.B FSCONFIG_CMD_CREATE
+This command instructs the filesystem driver
+to instantiate an instance of the filesystem in the kernel
+with the parameters specified in the filesystem configuration context.
+.I key
+and
+.I value
+must be NULL,
+and
+.I aux
+must be 0.
+.IP
+This command can only be issued once
+in the lifetime of a filesystem context.
+If the operation succeeds,
+the filesystem context
+associated with file descriptor
+.I fd
+now references the created filesystem instance,
+and is placed into a special "awaiting-mount" mode
+that allows you to use
+.BR fsmount (2)
+to create a mount object from the filesystem instance.
+.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this.
+If the operation fails,
+in most cases
+the filesystem context is placed in a failed mode
+and cannot be used for any further
+.BR fsconfig ()
+operations
+(though you may still retrieve diagnostic messages
+through the message retrieval interface,
+as described in
+the corresponding subsection of
+.BR fsopen (2)).
+.IP
+This command can only be issued against
+filesystem configuration contexts
+that were created with
+.BR fsopen (2).
+In order to create a filesystem instance,
+the calling process must have the
+.B \%CAP_SYS_ADMIN
+capability.
+.IP
+An important thing to be aware of is that
+the Linux kernel will
+.I silently
+reuse extant filesystem instances
+depending on the filesystem type
+and the configured parameters
+(each filesystem driver has
+its own policy for
+how filesystem instances are reused).
+This means that
+the filesystem instance "created" by
+.B \%FSCONFIG_CMD_CREATE
+may, in fact, be a reference
+to an extant filesystem instance in the kernel.
+(For reference,
+this behaviour also applies to
+.BR mount (2).)
+.IP
+One side-effect of this behaviour is that
+if an extant filesystem instance is reused,
+.I all
+parameters configured
+for this filesystem configuration context
+are
+.I silently ignored
+(with the exception of the
+.I ro
+and
+.I rw
+flag parameters;
+if the state of the read-only flag in the
+extant filesystem instance and the filesystem configuration context
+do not match, this operation will return
+.BR EBUSY ).
+This also means that
+.BR \%FSCONFIG_CMD_RECONFIGURE
+commands issued against
+the "created" filesystem instance
+will also affect any mount objects associated with
+the extant filesystem instance.
+.IP
+Programs that need to ensure
+that they create a new filesystem instance
+with specific parameters
+(notably, security-related parameters
+such as
+.I acl
+to enable POSIX ACLs\[em]as described in
+.BR acl (5))
+should use
+.B \%FSCONFIG_CMD_CREATE_EXCL
+instead.
+.TP
+.BR FSCONFIG_CMD_CREATE_EXCL " (since Linux 6.6)"
+.\" commit 22ed7ecdaefe0cac0c6e6295e83048af60435b13
+.\" commit 84ab1277ce5a90a8d1f377707d662ac43cc0918a
+As with
+.BR \%FSCONFIG_CMD_CREATE ,
+except that the kernel is instructed
+to not reuse extant filesystem instances.
+If the operation
+would be forced to
+reuse an extant filesystem instance,
+this operation will return
+.B EBUSY
+instead.
+.IP
+As a result (unlike
+.BR \%FSCONFIG_CMD_CREATE ),
+if this operation succeeds
+then the calling process can be sure that
+all of the parameters successfully configured with
+.BR fsconfig ()
+will actually be applied
+to the created filesystem instance.
+.TP
+.B FSCONFIG_CMD_RECONFIGURE
+This command instructs the filesystem driver
+to apply the parameters specified in the filesystem configuration context
+to the extant filesystem instance
+referenced by the filesystem configuration context.
+.I key
+and
+.I value
+must be NULL,
+and
+.I aux
+must be 0.
+.IP
+This is primarily intended for use with
+.BR fspick (2),
+but may also be used to modify
+the parameters of a filesystem instance
+after
+.BR \%FSCONFIG_CMD_CREATE
+was used to create it
+and a mount object was created using
+.BR fsmount (2).
+In order to reconfigure an extant filesystem instance,
+the calling process must have the
+.B CAP_SYS_ADMIN
+capability.
+.IP
+If the operation succeeds,
+the filesystem context is reset
+but remains in reconfiguration mode
+and thus can be reused for subsequent
+.B \%FSCONFIG_CMD_RECONFIGURE
+commands.
+If the operation fails,
+in most cases
+the filesystem context is placed in a failed mode
+and cannot be used for any further
+.BR fsconfig ()
+operations
+(though you may still retrieve diagnostic messages
+through the message retrieval interface,
+as described in
+the corresponding subsection of
+.BR fsopen (2)).
+.RE
+.P
+Parameters specified with
+.BI FSCONFIG_SET_ *
+do not take effect
+until a corresponding
+.B \%FSCONFIG_CMD_CREATE
+or
+.B \%FSCONFIG_CMD_RECONFIGURE
+command is issued.
+.SH RETURN VALUE
+On success,
+.BR fsconfig ()
+returns 0.
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+If an error occurs, the filesystem driver may provide
+additional information about the error
+through the message retrieval interface for filesystem configuration contexts.
+This additional information can be retrieved at any time by calling
+.BR read (2)
+on the filesystem instance or filesystem configuration context
+referenced by the file descriptor
+.IR fd .
+(See the "Message retrieval interface" subsection in
+.BR fsopen (2)
+for more details on the message format.)
+.P
+Even after an error occurs,
+the filesystem configuration context is
+.I not
+invalidated,
+and thus can still be used with other
+.BR fsconfig ()
+commands.
+This means that users can probe support for filesystem parameters
+on a per-parameter basis,
+and adjust which parameters they wish to set.
+.P
+The error values given below result from
+filesystem type independent errors.
+Each filesystem type may have its own special errors
+and its own special behavior.
+See the Linux kernel source code for details.
+.TP
+.B EACCES
+A component of a path
+provided as a path parameter
+was not searchable.
+(See also
+.BR path_resolution (7).)
+.TP
+.B EACCES
+.B \%FSCONFIG_CMD_CREATE
+was attempted
+for a read-only filesystem
+without specifying the
+.RB ' ro '
+flag parameter.
+.TP
+.B EACCES
+A specified block device parameter
+is located on a filesystem
+mounted with the
+.B \%MS_NODEV
+option.
+.TP
+.B EBADF
+The file descriptor given by
+.I fd
+(or possibly by
+.IR aux ,
+depending on the command)
+is invalid.
+.TP
+.B EBUSY
+The filesystem context associated with
+.I fd
+is in the wrong state
+for the given command.
+.TP
+.B EBUSY
+The filesystem instance cannot be reconfigured as read-only
+with
+.B \%FSCONFIG_CMD_RECONFIGURE
+because some programs
+still hold files open for writing.
+.TP
+.B EBUSY
+A new filesystem instance was requested with
+.B \%FSCONFIG_CMD_CREATE_EXCL
+but a matching superblock already existed.
+.TP
+.B EFAULT
+One of the pointer arguments
+points to a location
+outside the calling process's accessible address space.
+.TP
+.B EINVAL
+.I fd
+does not refer to
+a filesystem configuration context
+or filesystem instance.
+.TP
+.B EINVAL
+One of the values of
+.IR name ,
+.IR value ,
+and/or
+.I aux
+were set to a non-zero value when
+.I cmd
+required that they be zero
+(or NULL).
+.TP
+.B EINVAL
+The parameter named by
+.I name
+cannot be set
+using the type specified with
+.IR cmd .
+.TP
+.B EINVAL
+One of the source parameters
+referred to
+an invalid superblock.
+.TP
+.B ELOOP
+Too many links encountered
+during pathname resolution
+of a path argument.
+.TP
+.B ENAMETOOLONG
+A path argument was longer than
+.BR PATH_MAX .
+.TP
+.B ENOENT
+A path argument had a non-existent component.
+.TP
+.B ENOENT
+A path argument is an empty string,
+but
+.I cmd
+is not
+.BR \%FSCONFIG_SET_PATH_EMPTY .
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the operation.
+.TP
+.B ENOTBLK
+The parameter named by
+.I name
+must be a block device,
+but the provided parameter value was not a block device.
+.TP
+.B ENOTDIR
+A component of the path prefix
+of a path argument
+was not a directory.
+.TP
+.B EOPNOTSUPP
+The command given by
+.I cmd
+is not valid.
+.TP
+.B ENXIO
+The major number
+of a block device parameter
+is out of range.
+.TP
+.B EPERM
+The command given by
+.I cmd
+was
+.BR \%FSCONFIG_CMD_CREATE ,
+.BR \%FSCONFIG_CMD_CREATE_EXCL ,
+or
+.BR \%FSCONFIG_CMD_RECONFIGURE ,
+but the calling process does not have the required
+.B \%CAP_SYS_ADMIN
+capability.
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.2.
+.\" commit ecdab150fddb42fe6a739335257949220033b782
+.\" commit 400913252d09f9cfb8cce33daee43167921fc343
+glibc 2.36.
+.SH NOTES
+.SS Generic filesystem parameters
+Each filesystem driver is responsible for
+parsing most parameters specified with
+.BR fsconfig (),
+meaning that individual filesystems
+may have very different behaviour
+when encountering parameters with the same name.
+In general,
+you should not assume that the behaviour of
+.BR fsconfig ()
+when specifying a parameter to one filesystem type
+will match the behaviour of the same parameter
+with a different filesystem type.
+.P
+However,
+the following generic parameters
+apply to all filesystems and have unified behaviour.
+They are set using the listed
+.BI \%FSCONFIG_SET_ *
+command.
+.TP
+\fIro\fP and \fIrw\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether the filesystem instance is read-only.
+.TP
+\fIdirsync\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Make directory changes on this filesystem instance synchronous.
+.TP
+\fIsync\fP and \fIasync\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether writes on this filesystem instance
+will be made synchronous
+(as though the
+.B O_SYNC
+flag to
+.BR open (2)
+was specified for
+all file opens in this filesystem instance).
+.TP
+\fIlazytime\fP and \fInolazytime\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether to reduce on-disk updates
+of inode timestamps on this filesystem instance
+(as described in the
+.B \%MS_LAZYTIME
+section of
+.BR mount (2)).
+.TP
+\fImand\fP and \fInomand\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether the filesystem instance should permit mandatory locking.
+Since Linux 5.15,
+.\" commit f7e33bdbd6d1bdf9c3df8bba5abcf3399f957ac3
+mandatory locking has been deprecated
+and setting this flag is a no-op.
+.TP
+\fIsource\fP (\fB\%FSCONFIG_SET_STRING\fP)
+This parameter is equivalent to the
+.I source
+parameter passed to
+.BR mount (2)
+for the same filesystem type,
+and is usually the pathname of a block device
+containing the filesystem.
+This parameter may only be set once
+per filesystem configuration context transaction.
+.P
+In addition,
+any filesystem parameters associated with
+Linux Security Modules (LSMs)
+are also generic with respect to the underlying filesystem.
+See the documentation for the LSM you wish to configure for more details.
+.SH CAVEATS
+.SS Filesystem parameter types
+As a result of
+each filesystem driver being responsible for
+parsing most parameters specified with
+.BR fsconfig (),
+some filesystem drivers
+may have unintuitive behaviour
+with regards to which
+.BI \%FSCONFIG_SET_ *
+commands are permitted
+to configure a given parameter.
+.P
+In order for
+filesystem parameters to be backwards compatible with
+.BR mount (2),
+they must be parseable as strings;
+this almost universally means that
+.B \%FSCONFIG_SET_STRING
+can also be used to configure them.
+.\" Aleksa Sarai
+.\"   Theoretically, a filesystem could check fc->oldapi and refuse
+.\"   FSCONFIG_SET_STRING if the operation is coming from the new API, but no
+.\"   filesystems do this (and probably never will).
+However, other
+.BI \%FSCONFIG_SET_ *
+commands need to be opted into
+by each filesystem driver's parameter parser.
+.P
+One of the most user-visible instances of
+this inconsistency is that
+many filesystems do not support
+configuring path parameters with
+.B \%FSCONFIG_SET_PATH
+(despite the name),
+which can lead to somewhat confusing
+.B EINVAL
+errors.
+(For example, the generic
+.I source
+parameter\[em]which is usually a path\[em]can only be configured
+with
+.BR \%FSCONFIG_SET_STRING .)
+.P
+When writing programs that use
+.BR fsconfig ()
+to configure parameters
+with commands other than
+.BR \%FSCONFIG_SET_STRING ,
+users should verify
+that the
+.BI \%FSCONFIG_SET_ *
+commands used to configure each parameter
+are supported by the corresponding filesystem driver.
+.\" Aleksa Sarai
+.\"   While this (quite confusing) inconsistency in behaviour is true today
+.\"   (and has been true since this was merged), this appears to mostly be an
+.\"   unintended consequence of filesystem drivers hand-coding fsparam parsing.
+.\"   Path parameters are the most eggregious causes of confusion. Hopefully we
+.\"   can make this no longer the case in a future kernel.
+.SH EXAMPLES
+To illustrate the different kinds of flags that can be configured with
+.BR fsconfig (),
+here are a few examples of some different filesystems being created:
+.P
+.in +4n
+.EX
+int fsfd, mntfd;
+\&
+fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "inode64", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "uid", "1234", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "huge", "never", 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "casefold", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOEXEC);
+move_mount(mntfd, "", AT_FDCWD, "/tmp", MOVE_MOUNT_F_EMPTY_PATH);
+\&
+fsfd = fsopen("erofs", FSOPEN_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/loop0", 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE_EXCL, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOSUID);
+move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+Usually,
+specifying the same parameter named by
+.I key
+multiple times with
+.BR fsconfig ()
+causes the parameter value to be replaced.
+However, some filesystems may have unique behaviour:
+.P
+.in +4n
+.EX
+\&
+int fsfd, mntfd;
+int lowerdirfd = open("/o/ctr/lower1", O_DIRECTORY | O_CLOEXEC);
+\&
+fsfd = fsopen("overlay", FSOPEN_CLOEXEC);
+/* "lowerdir+" appends to the lower dir stack each time. */
+fsconfig(fsfd, FSCONFIG_SET_FD, "lowerdir+", NULL, lowerdirfd);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower2", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower3", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower4", 0);
+.\" fsconfig(fsfd, FSCONFIG_SET_PATH, "lowerdir+", "/o/ctr/lower5", AT_FDCWD);
+.\" fsconfig(fsfd, FSCONFIG_SET_PATH_EMPTY, "lowerdir+", "", lowerdirfd);
+.\" Aleksa Sarai: Hopefully these will also be supported in the future.
+fsconfig(fsfd, FSCONFIG_SET_STRING, "xino", "auto", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "nfs_export", "off", 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, 0);
+move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+And here is an example of how
+.BR fspick (2)
+can be used with
+.BR fsconfig ()
+to reconfigure the parameters
+of an extant filesystem instance
+attached to
+.IR /proc :
+.P
+.in +4n
+.EX
+int fsfd = fspick(AT_FDCWD, "/proc", FSPICK_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "hidepid", "ptraceable", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "subset", "pid", 0);
+fsconfig(fsfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
+.EE
+.in
+.SH SEE ALSO
+.BR fsmount (2),
+.BR fsopen (2),
+.BR fspick (2),
+.BR mount (2),
+.BR mount_setattr (2),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR mount_namespaces (7)
+

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 05/10] man/man2/fsmount.2: document "new" mount API
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
                   ` (3 preceding siblings ...)
  2025-09-19  1:59 ` [PATCH v4 04/10] man/man2/fsconfig.2: " Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-21 23:08   ` Askar Safin
  2025-09-19  1:59 ` [PATCH v4 06/10] man/man2/move_mount.2: " Aleksa Sarai
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

This is loosely based on the original documentation written by David
Howells and later maintained by Christian Brauner, but has been
rewritten to be more from a user perspective (as well as fixing a few
critical mistakes).

Co-authored-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Co-authored-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/fsmount.2 | 231 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 231 insertions(+)

diff --git a/man/man2/fsmount.2 b/man/man2/fsmount.2
new file mode 100644
index 0000000000000000000000000000000000000000..c054c04376975c620aec08b76ad5151d8b6ae2ed
--- /dev/null
+++ b/man/man2/fsmount.2
@@ -0,0 +1,231 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH fsmount 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+fsmount \- instantiate mount object from filesystem context
+.SH LIBRARY
+Standard C library
+.RI ( libc ,\~ \-lc )
+.SH SYNOPSIS
+.nf
+.B #include <sys/mount.h>
+.P
+.BI "int fsmount(int " fsfd ", unsigned int " flags ", \
+unsigned int " attr_flags );
+.fi
+.SH DESCRIPTION
+The
+.BR fsmount ()
+system call is part of
+the suite of file descriptor based mount facilities in Linux.
+.P
+.BR fsmount ()
+creates a new detached mount object
+for the root of the new filesystem instance
+referenced by the filesystem context file descriptor
+.IR fsfd .
+A new file descriptor
+associated with the detached mount object
+is then returned.
+In order to create a mount object with
+.BR fsmount (),
+the calling process must have the
+.BR \%CAP_SYS_ADMIN
+capability.
+.P
+The filesystem context must have been created with a call to
+.BR fsopen (2)
+and then had a filesystem instance instantiated with a call to
+.BR fsconfig (2)
+with
+.B \%FSCONFIG_CMD_CREATE
+or
+.B \%FSCONFIG_CMD_CREATE_EXCL
+in order to be in the correct state
+for this operation
+(the "awaiting-mount" mode in kernel-developer parlance).
+.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this.
+Unlike
+.BR open_tree (2)
+with
+.BR \%OPEN_TREE_CLONE,
+.BR fsmount ()
+can only be called once
+in the lifetime of a filesystem context
+to produce a mount object.
+.P
+As with file descriptors returned from
+.BR open_tree (2)
+called with
+.BR OPEN_TREE_CLONE ,
+the returned file descriptor
+can then be used with
+.BR move_mount (2),
+.BR mount_setattr (2),
+or other such system calls to do further mount operations.
+This mount object will be unmounted and destroyed
+when the file descriptor is closed
+if it was not otherwise attached to a mount point
+by calling
+.BR move_mount (2).
+(Note that the unmount operation on
+.BR close (2)
+is lazy\[em]akin to calling
+.BR umount2 (2)
+with
+.BR MOUNT_DETACH ;
+any existing open references to files
+from the mount object
+will continue to work,
+and the mount object will only be completely destroyed
+once it ceases to be busy.)
+The returned file descriptor
+also acts the same as one produced by
+.BR open (2)
+with
+.BR O_PATH ,
+meaning it can also be used as a
+.I dirfd
+argument
+to "*at()" system calls.
+.P
+.I flags
+controls the creation of the returned file descriptor.
+A value for
+.I flags
+is constructed by bitwise ORing
+zero or more of the following constants:
+.RS
+.TP
+.B FSMOUNT_CLOEXEC
+Set the close-on-exec
+.RB ( FD_CLOEXEC )
+flag on the new file descriptor.
+See the description of the
+.B O_CLOEXEC
+flag in
+.BR open (2)
+for reasons why this may be useful.
+.RE
+.P
+.I attr_flags
+specifies mount attributes
+which will be applied to the created mount object,
+in the form of
+.BI \%MOUNT_ATTR_ *
+flags.
+The flags are interpreted as though
+.BR mount_setattr (2)
+was called with
+.I attr.attr_set
+set to the same value as
+.IR attr_flags .
+.BI \%MOUNT_ATTR_ *
+flags which would require
+specifying additional fields in
+.BR mount_attr (2type)
+(such as
+.BR \%MOUNT_ATTR_IDMAP )
+are not valid flag values for
+.IR attr_flags .
+.P
+If the
+.BR fsmount ()
+operation is successful,
+the filesystem context
+associated with the file descriptor
+.I fsfd
+is reset
+and placed into reconfiguration mode,
+as if it were just returned by
+.BR fspick (2).
+You may continue to use
+.BR fsconfig (2)
+with the now-reset filesystem context,
+including issuing the
+.B \%FSCONFIG_CMD_RECONFIGURE
+command
+to reconfigure the filesystem instance.
+.SH RETURN VALUE
+On success, a new file descriptor is returned.
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+.TP
+.B EBUSY
+The filesystem context associated with
+.I fsfd
+is not in the right state
+to be used by
+.BR fsmount ().
+.TP
+.B EINVAL
+.I flags
+had an invalid flag set.
+.TP
+.B EINVAL
+.I attr_flags
+had an invalid
+.BI MOUNT_ATTR_ *
+flag set.
+.TP
+.B EMFILE
+The calling process has too many open files to create more.
+.TP
+.B ENFILE
+The system has too many open files to create more.
+.TP
+.B ENOSPC
+The "anonymous" mount namespace
+necessary to contain the new mount object
+could not be allocated,
+as doing so would exceed
+the configured per-user limit on
+the number of mount namespaces in the current user namespace.
+(See also
+.BR namespaces (7).)
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the operation.
+.TP
+.B EPERM
+The calling process does not have the required
+.B CAP_SYS_ADMIN
+capability.
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.2.
+.\" commit 93766fbd2696c2c4453dd8e1070977e9cd4e6b6d
+.\" commit 400913252d09f9cfb8cce33daee43167921fc343
+glibc 2.36.
+.SH EXAMPLES
+.in +4n
+.EX
+int fsfd, mntfd, tmpfd;
+\&
+fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NODEV | MOUNT_ATTR_NOEXEC);
+\&
+/* Create a new file without attaching the mount object. */
+int tmpfd = openat(mntfd, "tmpfile", O_CREAT | O_EXCL | O_RDWR, 0600);
+unlinkat(mntfd, "tmpfile", 0);
+\&
+/* Attach the mount object to "/tmp". */
+move_mount(mntfd, "", AT_FDCWD, "/tmp", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.SH SEE ALSO
+.BR fsconfig (2),
+.BR fsopen (2),
+.BR fspick (2),
+.BR mount (2),
+.BR mount_setattr (2),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR mount_namespaces (7)
+

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 06/10] man/man2/move_mount.2: document "new" mount API
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
                   ` (4 preceding siblings ...)
  2025-09-19  1:59 ` [PATCH v4 05/10] man/man2/fsmount.2: " Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-19  1:59 ` [PATCH v4 07/10] man/man2/open_tree.2: " Aleksa Sarai
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

This is loosely based on the original documentation written by David
Howells and later maintained by Christian Brauner, but has been
rewritten to be more from a user perspective (as well as fixing a few
critical mistakes).

Co-authored-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Co-authored-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/move_mount.2 | 646 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 646 insertions(+)

diff --git a/man/man2/move_mount.2 b/man/man2/move_mount.2
new file mode 100644
index 0000000000000000000000000000000000000000..13801d61ba0e99e45c693bb83b22cd24b4c04f28
--- /dev/null
+++ b/man/man2/move_mount.2
@@ -0,0 +1,646 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH move_mount 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+move_mount \- move or attach mount object to filesystem
+.SH LIBRARY
+Standard C library
+.RI ( libc ,\~ \-lc )
+.SH SYNOPSIS
+.nf
+.BR "#include <fcntl.h>" "          /* Definition of " AT_* " constants */"
+.B #include <sys/mount.h>
+.P
+.BI "int move_mount(int " from_dirfd ", const char *" from_path ,
+.BI "               int " to_dirfd ", const char *" to_path ,
+.BI "               unsigned int " flags );
+.fi
+.SH DESCRIPTION
+The
+.BR move_mount ()
+system call is part of
+the suite of file descriptor based mount facilities in Linux.
+.P
+.BR move_mount ()
+moves the mount object indicated by
+.I from_dirfd
+and
+.I from_path
+to the path indicated by
+.I to_dirfd
+and
+.IR to_path .
+The mount object being moved
+can be an existing mount point in the current mount namespace,
+or a detached mount object created by
+.BR fsmount (2)
+or
+.BR open_tree (2)
+with
+.BR \%OPEN_TREE_CLONE .
+.P
+To access the source mount object
+or the destination mount point,
+no permissions are required on the object itself,
+but if either pathname is supplied,
+execute (search) permission is required
+on all of the directories specified in
+.I from_path
+or
+.IR to_path .
+.P
+The calling process must have the
+.BR \%CAP_SYS_ADMIN
+capability in order to move or attach a mount object.
+.P
+As with "*at()" system calls,
+.BR move_mount ()
+uses the
+.I from_dirfd
+and
+.I to_dirfd
+arguments
+in conjunction with the
+.I from_path
+and
+.I to_path
+arguments to determine the source and destination objects to operate on
+(respectively), as follows:
+.IP \[bu] 3
+If the pathname given in
+.I *_path
+is absolute, then
+the corresponding
+.I *_dirfd
+is ignored.
+.IP \[bu]
+If the pathname given in
+.I *_path
+is relative and
+the corresponding
+.I *_dirfd
+is the special value
+.BR \%AT_FDCWD ,
+then
+.I *_path
+is interpreted relative to
+the current working directory
+of the calling process (like
+.BR open (2)).
+.IP \[bu]
+If the pathname given in
+.I *_path
+is relative,
+then it is interpreted relative to
+the directory referred to by
+the corresponding file descriptor
+.I *_dirfd
+(rather than relative to
+the current working directory
+of the calling process,
+as is done by
+.BR open (2)
+for a relative pathname).
+In this case,
+the corresponding
+.I *_dirfd
+must be a directory
+that was opened for reading
+.RB ( O_RDONLY )
+or using the
+.B O_PATH
+flag.
+.IP \[bu]
+If
+.I *_path
+is an empty string,
+and
+.I flags
+contains the appropriate
+.BI \%MOVE_MOUNT_ * _EMPTY_PATH
+flag,
+then the corresponding file descriptor
+.I *_dirfd
+is operated on directly.
+In this case,
+the corresponding
+.I *_dirfd
+may refer to any type of file,
+not just a directory.
+.P
+See
+.BR openat (2)
+for an explanation of why the
+.I *_dirfd
+arguments are useful.
+.P
+.I flags
+can be used to control aspects of the path lookup
+for both the source and destination objects,
+as well as other properties of the mount operation.
+A value for
+.I flags
+is constructed by bitwise ORing
+zero or more of the following constants:
+.RS
+.TP
+.B MOVE_MOUNT_F_EMPTY_PATH
+If
+.I from_path
+is an empty string, operate on the file referred to by
+.I from_dirfd
+(which may have been obtained from
+.BR open (2),
+.BR fsmount (2),
+or
+.BR open_tree (2)).
+In this case,
+.I from_dirfd
+may refer to any type of file,
+not just a directory.
+If
+.I from_dirfd
+is
+.BR \%AT_FDCWD ,
+.BR move_mount ()
+will operate on the current working directory
+of the calling process.
+.IP
+This is the most common mechanism
+used to attach detached mount objects
+produced by
+.BR fsmount (2)
+and
+.BR open_tree (2)
+to a mount point.
+.TP
+.B MOVE_MOUNT_T_EMPTY_PATH
+As with
+.BR \%MOVE_MOUNT_F_EMPTY_PATH ,
+except operating on
+.I to_dirfd
+and
+.IR to_path .
+.TP
+.B MOVE_MOUNT_F_SYMLINKS
+If
+.IR from_path
+references a symbolic link,
+then dereference it.
+The default behaviour for
+.BR move_mount ()
+is to
+.I not follow
+symbolic links.
+.TP
+.B MOVE_MOUNT_T_SYMLINKS
+As with
+.BR \%MOVE_MOUNT_F_SYMLINKS ,
+except operating on
+.I to_dirfd
+and
+.IR to_path .
+.TP
+.B MOVE_MOUNT_F_NO_AUTOMOUNT
+Do not automount any automount points encountered
+while resolving
+.IR from_path .
+This allows a mount object
+that has an automount point at its root
+to be moved
+and prevents unintended triggering of an automount point.
+This flag has no effect
+if the automount point has already been mounted over.
+.TP
+.B MOVE_MOUNT_T_NO_AUTOMOUNT
+As with
+.BR \%MOVE_MOUNT_F_NO_AUTOMOUNT ,
+except operating on
+.I to_dirfd
+and
+.IR to_path .
+This allows an automount point to be manually mounted over.
+.TP
+.BR MOVE_MOUNT_SET_GROUP " (since Linux 5.15)"
+Add the attached private-propagation mount object indicated by
+.I to_dirfd
+and
+.I to_path
+into the mount propagation "peer group"
+of the attached non-private-propagation mount object indicated by
+.I from_dirfd
+and
+.IR from_path .
+.IP
+Unlike other
+.BR move_mount ()
+operations,
+this operation does not move or attach any mount objects.
+Instead, it only updates the metadata
+of attached mount objects.
+(Also, take careful note of
+the argument order\[em]the mount object being modified
+by this operation is the one specified by
+.I to_dirfd
+and
+.IR to_path .)
+.IP
+This makes it possible to first create a mount tree
+consisting only of private mounts
+and then configure the desired propagation layout afterwards.
+(See the "SHARED SUBTREES" section of
+.BR mount_namespaces (7)
+for more information about mount propagation and peer groups.)
+.TP
+.BR MOVE_MOUNT_BENEATH " (since Linux 6.5)"
+If the path indicated by
+.I to_dirfd
+and
+.I to_path
+is an existing mount object,
+rather than attaching or moving the mount object
+indicated by
+.I from_dirfd
+and
+.I from_path
+on top of the mount stack,
+attach or move it beneath the current top mount
+on the mount stack.
+.IP
+After using
+.BR \%MOVE_MOUNT_BENEATH ,
+it is possible to
+.BR umount (2)
+the top mount
+in order to reveal the mount object
+which was attached beneath it earlier.
+This allows for the seamless (and atomic) replacement
+of intricate mount trees,
+which can further be used
+to "upgrade" a mount tree with a newer version.
+.IP
+This operation has several restrictions:
+.RS
+.IP \[bu] 3
+Mount objects cannot be attached beneath the filesystem root,
+including cases where
+the filesystem root was configured by
+.BR chroot (2)
+or
+.BR pivot_root (2).
+To mount beneath the filesystem root,
+.BR pivot_root (2)
+must be used.
+.IP \[bu]
+The target path indicated by
+.I to_dirfd
+and
+.I to_path
+must not be a detached mount object,
+such as those produced by
+.BR open_tree (2)
+with
+.B \%OPEN_TREE_CLONE
+or
+.BR fsmount (2).
+.IP \[bu]
+The current top mount
+of the target path's mount stack
+and its parent mount
+must be in the calling process's mount namespace.
+.IP \[bu]
+The caller must have sufficient privileges
+to unmount the top mount
+of the target path's mount stack,
+to prove they have privileges
+to reveal the underlying mount.
+.IP \[bu]
+Mount propagation events triggered by this
+.BR move_mount ()
+operation
+(as described in
+.BR mount_namespaces (7))
+are calculated based on the parent mount
+of the current top mount
+of the target path's mount stack.
+.IP \[bu]
+The target path's mount
+cannot be an ancestor in the mount tree of
+the source mount object.
+.IP \[bu]
+The source mount object
+must not have any overmounts,
+otherwise it would be possible to create "shadow mounts"
+(i.e., two mounts mounted on the same parent mount at the same mount point).
+.IP \[bu]
+It is not possible to move a mount
+beneath a top mount
+if the parent mount
+of the current top mount
+propagates to the top mount itself.
+Otherwise,
+.B \%MOVE_MOUNT_BENEATH
+would cause the mount object
+to be propagated
+to the top mount
+from the parent mount,
+defeating the purpose of using
+.BR \%MOVE_MOUNT_BENEATH .
+.IP \[bu]
+It is not possible to move a mount
+beneath a top mount
+if the parent mount
+of the current top mount
+propagates to the mount object
+being mounted beneath.
+Otherwise, this would cause a similar propagation issue
+to the previous point,
+also defeating the purpose of using
+.BR \%MOVE_MOUNT_BENEATH .
+.RE
+.RE
+.P
+If
+.I from_dirfd
+is a mount object file descriptor and
+.BR move_mount ()
+is operating on it directly,
+.I from_dirfd
+will remain associated with the mount object after
+.BR move_mount ()
+succeeds,
+so you may repeatedly use
+.I from_dirfd
+with
+.BR move_mount (2)
+and/or "*at()" system calls
+as many times as necessary.
+.SH RETURN VALUE
+On success,
+.BR move_mount ()
+returns 0.
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+.TP
+.B EACCES
+Search permission is denied
+for one of the directories
+in the path prefix of one of
+.I from_path
+or
+.IR to_path .
+(See also
+.BR path_resolution (7).)
+.TP
+.B EBADF
+One of
+.I from_dirfd
+or
+.I to_dirfd
+is not a valid file descriptor.
+.TP
+.B EFAULT
+One of
+.I from_path
+or
+.I to_path
+is NULL
+or a pointer to a location
+outside the calling process's accessible address space.
+.TP
+.B EINVAL
+Invalid flag specified in
+.IR flags .
+.TP
+.B EINVAL
+The path indicated by
+.I from_dirfd
+and
+.I from_path
+is not a mount object.
+.TP
+.B EINVAL
+The mount object type
+of the source mount object and target inode
+are not compatible
+(i.e., the source is a file but the target is a directory, or vice-versa).
+.TP
+.B EINVAL
+The source mount object or target path
+are not in the calling process's mount namespace
+(or an anonymous mount namespace of the calling process).
+.TP
+.B EINVAL
+The source mount object's parent mount
+has shared mount propagation,
+and thus cannot be moved
+(as described in
+.BR mount_namespaces (7)).
+.TP
+.B EINVAL
+The source mount has
+.B MS_UNBINDABLE
+child mounts
+but the target path
+resides on a mount tree with shared mount propagation,
+which would otherwise cause the unbindable mounts to be propagated
+(as described in
+.BR mount_namespaces (7)).
+.TP
+.B EINVAL
+.B \%MOVE_MOUNT_BENEATH
+was attempted,
+but one of the listed restrictions was violated.
+.TP
+.B ELOOP
+Too many symbolic links encountered
+when resolving one of
+.I from_path
+or
+.IR to_path .
+.TP
+.B ENAMETOOLONG
+One of
+.I from_path
+or
+.I to_path
+is longer than
+.BR PATH_MAX .
+.TP
+.B ENOENT
+A component of one of
+.I from_path
+or
+.I to_path
+does not exist.
+.TP
+.B ENOENT
+One of
+.I from_path
+or
+.I to_path
+is an empty string,
+but the corresponding
+.BI MOVE_MOUNT_ * _EMPTY_PATH
+flag is not specified in
+.IR flags .
+.TP
+.B ENOTDIR
+A component of the path prefix of one of
+.I from_path
+or
+.I to_path
+is not a directory,
+or one of
+.I from_path
+or
+.I to_path
+is relative
+and the corresponding
+.I from_dirfd
+or
+.I to_dirfd
+is a file descriptor referring to a file other than a directory.
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the operation.
+.TP
+.B EPERM
+The calling process does not have the required
+.B \%CAP_SYS_ADMIN
+capability.
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.2.
+.\" commit 2db154b3ea8e14b04fee23e3fdfd5e9d17fbc6ae
+.\" commit 400913252d09f9cfb8cce33daee43167921fc343
+glibc 2.36.
+.SH EXAMPLES
+.BR move_mount ()
+can be used to move attached mounts like the following:
+.P
+.in +4n
+.EX
+move_mount(AT_FDCWD, "/a", AT_FDCWD, "/b", 0);
+.EE
+.in
+.P
+This would move the mount object mounted on
+.I /a
+to
+.IR /b .
+The above procedure is functionally equivalent to
+the following mount operation
+using
+.BR mount (2):
+.P
+.in +4n
+.EX
+mount("/a", "/b", NULL, MS_MOVE, NULL);
+.EE
+.in
+.P
+.BR move_mount ()
+can also be used in conjunction with file descriptors returned from
+.BR open_tree (2)
+or
+.BR open (2):
+.P
+.in +4n
+.EX
+int fd = open_tree(AT_FDCWD, "/mnt", 0); /* or open("/mnt", O_PATH); */
+move_mount(fd, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH);
+move_mount(fd, "", AT_FDCWD, "/mnt3", MOVE_MOUNT_F_EMPTY_PATH);
+move_mount(fd, "", AT_FDCWD, "/mnt4", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+This would move the mount object mounted at
+.I /mnt
+to
+.IR /mnt2 ,
+then
+.IR /mnt3 ,
+and then
+.IR /mnt4 .
+.P
+If the source mount object
+indicated by
+.I from_dirfd
+and
+.I from_path
+is a detached mount object,
+.BR move_mount ()
+can be used to attach it to a mount point:
+.P
+.in +4n
+.EX
+int fsfd, mntfd;
+\&
+fsfd = fsopen("ext4", FSOPEN_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/sda1", 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NODEV);
+move_mount(mntfd, "", AT_FDCWD, "/home", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+This would create a new filesystem configuration context for ext4,
+configure it,
+create a detached mount object,
+and then attach it to
+.IR /home .
+The above procedure is functionally equivalent to
+the following mount operation
+using
+.BR mount (2):
+.P
+.in +4n
+.EX
+mount("/dev/sda1", "/home", "ext4", MS_NODEV, "user_xattr");
+.EE
+.in
+.P
+The same operation also works with detached bind-mounts created with
+.BR open_tree (2)
+with
+.BR OPEN_TREE_CLONE :
+.P
+.in +4n
+.EX
+int mntfd = open_tree(AT_FDCWD, "/home/cyphar", OPEN_TREE_CLONE);
+move_mount(mntfd, "", AT_FDCWD, "/root", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+This would create a new bind-mount of
+.I /home/cyphar
+as a detached mount object,
+and then attach it to
+.IR /root .
+The above procedure is functionally equivalent to
+the following mount operation
+using
+.BR mount (2):
+.P
+.in +4n
+.EX
+mount("/home/cyphar", "/root", NULL, MS_BIND, NULL);
+.EE
+.in
+.SH SEE ALSO
+.BR fsconfig (2),
+.BR fsmount (2),
+.BR fsopen (2),
+.BR fspick (2),
+.BR mount (2),
+.BR mount_setattr (2),
+.BR open_tree (2),
+.BR mount_namespaces (7)
+

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 07/10] man/man2/open_tree.2: document "new" mount API
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
                   ` (5 preceding siblings ...)
  2025-09-19  1:59 ` [PATCH v4 06/10] man/man2/move_mount.2: " Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-21  9:47   ` Alejandro Colomar
  2025-09-19  1:59 ` [PATCH v4 08/10] man/man2/mount_setattr.2: mirror opening sentence from fsopen(2) Aleksa Sarai
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

This is loosely based on the original documentation written by David
Howells and later maintained by Christian Brauner, but has been
rewritten to be more from a user perspective (as well as fixing a few
critical mistakes).

Co-authored-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Co-authored-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/open_tree.2 | 498 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 498 insertions(+)

diff --git a/man/man2/open_tree.2 b/man/man2/open_tree.2
new file mode 100644
index 0000000000000000000000000000000000000000..7f85df08b43c7b48a9d021dbbeb2c60092a2b2d4
--- /dev/null
+++ b/man/man2/open_tree.2
@@ -0,0 +1,498 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH open_tree 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+open_tree \- open path or create detached mount object and attach to fd
+.SH LIBRARY
+Standard C library
+.RI ( libc ,\~ \-lc )
+.SH SYNOPSIS
+.nf
+.BR "#define _GNU_SOURCE         " "/* See feature_test_macros(7) */"
+.BR "#include <fcntl.h>" "          /* Definition of " AT_* " constants */"
+.B #include <sys/mount.h>
+.P
+.BI "int open_tree(int " dirfd ", const char *" path ", unsigned int " flags );
+.fi
+.SH DESCRIPTION
+The
+.BR open_tree ()
+system call is part of
+the suite of file descriptor based mount facilities in Linux.
+.IP \[bu] 3
+If
+.I flags
+contains
+.BR \%OPEN_TREE_CLONE ,
+.BR open_tree ()
+creates a detached mount object
+which consists of a bind-mount of
+the path specified by the
+.IR path .
+A new file descriptor
+associated with the detached mount object
+is then returned.
+The mount object is equivalent to a bind-mount
+that would be created by
+.BR mount (2)
+called with
+.BR MS_BIND ,
+except that it is tied to a file descriptor
+and is not mounted onto the filesystem.
+.IP
+As with file descriptors returned from
+.BR fsmount (2),
+the resultant file descriptor can then be used with
+.BR move_mount (2),
+.BR mount_setattr (2),
+or other such system calls to do further mount operations.
+This mount object will be unmounted and destroyed
+when the file descriptor is closed
+if it was not otherwise attached to a mount point
+by calling
+.BR move_mount (2).
+(Note that the unmount operation on
+.BR close (2)
+is lazy\[em]akin to calling
+.BR umount2 (2)
+with
+.BR MOUNT_DETACH ;
+any existing open references to files
+from the mount object
+will continue to work,
+and the mount object will only be completely destroyed
+once it ceases to be busy.)
+.IP \[bu]
+If
+.I flags
+does not contain
+.BR \%OPEN_TREE_CLONE ,
+.BR open_tree ()
+returns a file descriptor
+that is exactly equivalent to
+one produced by
+.BR openat (2)
+when called with the same
+.I dirfd
+and
+.IR path .
+.P
+In either case, the resultant file descriptor
+acts the same as one produced by
+.BR open (2)
+with
+.BR O_PATH ,
+meaning it can also be used as a
+.I dirfd
+argument to
+"*at()" system calls.
+.P
+As with "*at()" system calls,
+.BR open_tree ()
+uses the
+.I dirfd
+argument in conjunction with the
+.I path
+argument to determine the path to operate on, as follows:
+.IP \[bu] 3
+If the pathname given in
+.I path
+is absolute, then
+.I dirfd
+is ignored.
+.IP \[bu]
+If the pathname given in
+.I path
+is relative and
+.I dirfd
+is the special value
+.BR \%AT_FDCWD ,
+then
+.I path
+is interpreted relative to
+the current working directory
+of the calling process (like
+.BR open (2)).
+.IP \[bu]
+If the pathname given in
+.I path
+is relative,
+then it is interpreted relative to
+the directory referred to by the file descriptor
+.I dirfd
+(rather than relative to
+the current working directory
+of the calling process,
+as is done by
+.BR open (2)
+for a relative pathname).
+In this case,
+.I dirfd
+must be a directory
+that was opened for reading
+.RB ( O_RDONLY )
+or using the
+.B O_PATH
+flag.
+.IP \[bu]
+If
+.I path
+is an empty string,
+and
+.I flags
+contains
+.BR \%AT_EMPTY_PATH ,
+then the file descriptor
+.I dirfd
+is operated on directly.
+In this case,
+.I dirfd
+may refer to any type of file,
+not just a directory.
+.P
+See
+.BR openat (2)
+for an explanation of why the
+.I dirfd
+argument is useful.
+.P
+.I flags
+can be used to control aspects of the path lookup
+and properties of the returned file descriptor.
+A value for
+.I flags
+is constructed by bitwise ORing
+zero or more of the following constants:
+.RS
+.TP
+.B \%AT_EMPTY_PATH
+If
+.I path
+is an empty string, operate on the file referred to by
+.I dirfd
+(which may have been obtained from
+.BR open (2),
+.BR fsmount(2),
+or from another
+.BR open_tree ()
+call).
+In this case,
+.I dirfd
+may refer to any type of file, not just a directory.
+If
+.I dirfd
+is
+.BR \%AT_FDCWD ,
+.BR open_tree ()
+will operate on the current working directory
+of the calling process.
+This flag is Linux-specific; define
+.B \%_GNU_SOURCE
+to obtain its definition.
+.TP
+.B \%AT_NO_AUTOMOUNT
+Do not automount the terminal ("basename") component of
+.I path
+if it is a directory that is an automount point.
+This allows you to create a handle to the automount point itself,
+rather than the location it would mount.
+This flag has no effect if the mount point has already been mounted over.
+This flag is Linux-specific; define
+.B \%_GNU_SOURCE
+to obtain its definition.
+.TP
+.B \%AT_SYMLINK_NOFOLLOW
+If
+.I path
+is a symbolic link, do not dereference it; instead,
+create either a handle to the link itself
+or a bind-mount of it.
+The resultant file descriptor is indistinguishable from one produced by
+.BR openat (2)
+with
+.BR \%O_PATH | O_NOFOLLLOW .
+.TP
+.B \%OPEN_TREE_CLOEXEC
+Set the close-on-exec
+.RB ( FD_CLOEXEC )
+flag on the new file descriptor.
+See the description of the
+.B O_CLOEXEC
+flag in
+.BR open (2)
+for reasons why this may be useful.
+.TP
+.B \%OPEN_TREE_CLONE
+Rather than creating an
+.BR openat (2)-style
+.B O_PATH
+file descriptor,
+create a bind-mount of
+.I path
+(akin to
+.IR "mount --bind" )
+as a detached mount object.
+In order to do this operation,
+the calling process must have the
+.BR \%CAP_SYS_ADMIN
+capability.
+.TP
+.B \%AT_RECURSIVE
+Create a recursive bind-mount of the path
+(akin to
+.IR "mount --rbind" )
+as a detached mount object.
+This flag is only permitted in conjunction with
+.BR \%OPEN_TREE_CLONE .
+.SH RETURN VALUE
+On success, a new file descriptor is returned.
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+.TP
+.B EACCES
+Search permission is denied for one of the directories
+in the path prefix of
+.IR path .
+(See also
+.BR path_resolution (7).)
+.TP
+.B EBADF
+.I path
+is relative but
+.I dirfd
+is neither
+.B \%AT_FDCWD
+nor a valid file descriptor.
+.TP
+.B EFAULT
+.I path
+is NULL
+or a pointer to a location
+outside the calling process's accessible address space.
+.TP
+.B EINVAL
+Invalid flag specified in
+.IR flags .
+.TP
+.B ELOOP
+Too many symbolic links encountered when resolving
+.IR path .
+.TP
+.B EMFILE
+The calling process has too many open files to create more.
+.TP
+.B ENAMETOOLONG
+.I path
+is longer than
+.BR PATH_MAX .
+.TP
+.B ENFILE
+The system has too many open files to create more.
+.TP
+.B ENOENT
+A component of
+.I path
+does not exist, or is a dangling symbolic link.
+.TP
+.B ENOENT
+.I path
+is an empty string, but
+.B AT_EMPTY_PATH
+is not specified in
+.IR flags .
+.TP
+.B ENOTDIR
+A component of the path prefix of
+.I path
+is not a directory, or
+.I path
+is relative and
+.I dirfd
+is a file descriptor referring to a file other than a directory.
+.TP
+.B ENOSPC
+The "anonymous" mount namespace
+necessary to contain the
+.B \%OPEN_TREE_CLONE
+detached bind-mount mount object
+could not be allocated,
+as doing so would exceed
+the configured per-user limit on
+the number of mount namespaces in the current user namespace.
+(See also
+.BR namespaces (7).)
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the operation.
+.TP
+.B EPERM
+.I flags
+contains
+.B \%OPEN_TREE_CLONE
+but the calling process does not have the required
+.B CAP_SYS_ADMIN
+capability.
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.2.
+.\" commit a07b20004793d8926f78d63eb5980559f7813404
+.\" commit 400913252d09f9cfb8cce33daee43167921fc343
+glibc 2.36.
+.SH NOTES
+.SS Mount propagation
+The bind-mount mount objects created by
+.BR open_tree ()
+with
+.B \%OPEN_TREE_CLONE
+are not associated with
+the mount namespace of the calling process.
+Instead, each mount object is placed
+in a newly allocated "anonymous" mount namespace
+associated with the calling process.
+.P
+One of the side-effects of this is that
+(unlike bind-mounts created with
+.BR mount (2)),
+mount propagation
+(as described in
+.BR mount_namespaces (7))
+will not be applied to bind-mounts created by
+.BR open_tree ()
+until the bind-mount is attached with
+.BR move_mount (2),
+at which point the mount object
+will be associated with the mount namespace
+where it was attached
+and mount propagation will resume.
+Note that any mount propagation events that occurred
+before the mount object was attached
+will
+.I not
+be propagated to the mount object,
+even after it is attached.
+.SH EXAMPLES
+The following examples show how
+.BR open_tree ()
+can be used in place of more traditional
+.BR mount (2)
+calls with
+.BR MS_BIND .
+.P
+.in +4n
+.EX
+int srcfd = open_tree(AT_FDCWD, "/var", OPEN_TREE_CLONE);
+move_mount(srcfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+First,
+a detached bind-mount mount object of
+.I /var
+is created
+and associated with the file descriptor
+.IR srcfd .
+Then, the mount object is attached to
+.I /mnt
+using
+.BR move_mount (2)
+with
+.B \%MOVE_MOUNT_F_EMPTY_PATH
+to request that the detached mount object
+associated with the file descriptor
+.I srcfd
+be moved (and thus attached) to
+.IR /mnt .
+.P
+The above procedure is functionally equivalent to
+the following mount operation using
+.BR mount (2):
+.P
+.in +4n
+.EX
+mount("/var", "/mnt", NULL, MS_BIND, NULL);
+.EE
+.in
+.P
+.B \%OPEN_TREE_CLONE
+can be combined with
+.B \%AT_RECURSIVE
+to create recursive detached bind-mount mount objects,
+which in turn can be attached to mount points
+to create recursive bind-mounts.
+.P
+.in +4n
+.EX
+int srcfd = open_tree(AT_FDCWD, "/var", OPEN_TREE_CLONE | AT_RECURSIVE);
+move_mount(srcfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+The above procedure is functionally equivalent to
+the following mount operation using
+.BR mount (2):
+.P
+.in +4n
+.EX
+mount("/var", "/mnt", NULL, MS_BIND | MS_REC, NULL);
+.EE
+.in
+.P
+One of the primary benefits of using
+.BR open_tree ()
+and
+.BR move_mount (2)
+over the traditional
+.BR mount (2)
+is that operating with
+.IR dirfd -style
+file descriptors is far easier and more intuitive.
+.P
+.in +4n
+.EX
+int srcfd = open_tree(100, "", AT_EMPTY_PATH | OPEN_TREE_CLONE);
+move_mount(srcfd, "", 200, "foo", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+The above procedure is roughly equivalent to
+the following mount operation using
+.BR mount (2):
+.P
+.in +4n
+.EX
+mount("/proc/self/fd/100", "/proc/self/fd/200/foo", NULL, MS_BIND, NULL);
+.EE
+.in
+.P
+In addition, you can use the file descriptor returned by
+.BR open_tree ()
+as the
+.I dirfd
+argument to any "*at()" system calls:
+.P
+.in +4n
+.EX
+int dirfd, fd;
+\&
+dirfd = open_tree(AT_FDCWD, "/etc", OPEN_TREE_CLONE);
+fd = openat(dirfd, "passwd", O_RDONLY);
+fchmodat(dirfd, "shadow", 0000, 0);
+close(dirfd);
+close(fd);
+/* The bind-mount is now destroyed. */
+.EE
+.in
+.SH SEE ALSO
+.BR fsconfig (2),
+.BR fsmount (2),
+.BR fsopen (2),
+.BR fspick (2),
+.BR mount (2),
+.BR mount_setattr (2),
+.BR move_mount (2),
+.BR mount_namespaces (7)

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 08/10] man/man2/mount_setattr.2: mirror opening sentence from fsopen(2)
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
                   ` (6 preceding siblings ...)
  2025-09-19  1:59 ` [PATCH v4 07/10] man/man2/open_tree.2: " Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-21  9:50   ` Alejandro Colomar
  2025-09-19  1:59 ` [PATCH v4 09/10] man/man2/open_tree{,_attr}.2: document new open_tree_attr() API Aleksa Sarai
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

All of the other new mount API docs have this lead-in sentence in order
to make this set of APIs feel a little bit more cohesive.  Despite being
a bit of a latecomer, mount_setattr(2) is definitely part of this family
of APIs and so deserves the same treatment.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/mount_setattr.2 | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2
index 4b55f6d2e09d00d9bc4b3a085f310b1b459f34e8..b27db5b96665cfb0c387bf5b60776d45e0139956 100644
--- a/man/man2/mount_setattr.2
+++ b/man/man2/mount_setattr.2
@@ -19,7 +19,11 @@ .SH SYNOPSIS
 .SH DESCRIPTION
 The
 .BR mount_setattr ()
-system call changes the mount properties of a mount or an entire mount tree.
+system call is part of
+the suite of file descriptor based mount facilities in Linux.
+.P
+.BR mount_setattr ()
+changes the mount properties of a mount or an entire mount tree.
 If
 .I path
 is relative,

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 09/10] man/man2/open_tree{,_attr}.2: document new open_tree_attr() API
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
                   ` (7 preceding siblings ...)
  2025-09-19  1:59 ` [PATCH v4 08/10] man/man2/mount_setattr.2: mirror opening sentence from fsopen(2) Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-21  9:55   ` Alejandro Colomar
  2025-09-19  1:59 ` [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction Aleksa Sarai
  2025-09-21  2:43 ` [PATCH v4 00/10] man2: document "new" mount API Askar Safin
  10 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

This is a new API added in Linux 6.15, and is effectively just a minor
expansion of open_tree(2) in order to allow for MOUNT_ATTR_IDMAP to be
changed for an existing ID-mapped mount.  glibc does not yet have a
wrapper for this.

While working on this man-page, I discovered a bug in open_tree_attr(2)
that accidentally permitted changing MOUNT_ATTR_IDMAP for extant
detached ID-mapped mount objects.  This is definitely a bug, but there
is no need to add this to BUGS because the patch to fix this has already
been accepted (slated for 6.18, and will be backported to 6.15+).

Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/open_tree.2      | 140 ++++++++++++++++++++++++++++++++++++++++++++++
 man/man2/open_tree_attr.2 |   1 +
 2 files changed, 141 insertions(+)

diff --git a/man/man2/open_tree.2 b/man/man2/open_tree.2
index 7f85df08b43c7b48a9d021dbbeb2c60092a2b2d4..60de4313a9d5be4ef3ff1217051f252506a2ade9 100644
--- a/man/man2/open_tree.2
+++ b/man/man2/open_tree.2
@@ -15,7 +15,19 @@ .SH SYNOPSIS
 .B #include <sys/mount.h>
 .P
 .BI "int open_tree(int " dirfd ", const char *" path ", unsigned int " flags );
+.P
+.BR "#include <sys/syscall.h>" "    /* Definition of " SYS_* " constants */"
+.P
+.BI "int syscall(SYS_open_tree_attr, int " dirfd ", const char *" path ,
+.BI "            unsigned int " flags ", struct mount_attr *_Nullable " attr ", \
+size_t " size );
 .fi
+.P
+.IR Note :
+glibc provides no wrapper for
+.BR open_tree_attr (),
+necessitating the use of
+.BR syscall (2).
 .SH DESCRIPTION
 The
 .BR open_tree ()
@@ -246,6 +258,129 @@ .SH DESCRIPTION
 as a detached mount object.
 This flag is only permitted in conjunction with
 .BR \%OPEN_TREE_CLONE .
+.SS open_tree_attr()
+The
+.BR open_tree_attr ()
+system call operates in exactly the same way as
+.BR open_tree (),
+except for the differences described here.
+.P
+After performing the same operation as with
+.BR open_tree (),
+.BR open_tree_attr ()
+will apply the mount attribute changes described in
+.I attr
+to the file descriptor before it is returned.
+(See
+.BR mount_attr (2type)
+for a description of the
+.I mount_attr
+structure.
+As described in
+.BR mount_setattr (2),
+.I size
+must be set to
+.I sizeof(struct mount_attr)
+in order to support future extensions.)
+If
+.I attr
+is NULL,
+or has
+.IR attr.attr_clr ,
+.IR attr.attr_set ,
+and
+.I attr.propagation
+all set to zero,
+then
+.BR open_tree_attr ()
+has identical behaviour to
+.BR open_tree ().
+.P
+The application of
+.I attr
+to the resultant file descriptor
+has identical semantics to
+.BR mount_setattr (2),
+except for the following extensions and general caveats:
+.IP \[bu] 3
+Unlike
+.BR mount_setattr (2)
+called with a regular
+.B OPEN_TREE_CLONE
+detached mount object from
+.BR open_tree (),
+.BR open_tree_attr ()
+can specify a different setting for
+.B \%MOUNT_ATTR_IDMAP
+to the original mount object cloned with
+.BR OPEN_TREE_CLONE .
+.IP
+Adding
+.B \%MOUNT_ATTR_IDMAP
+to
+.I attr.attr_clr
+will disable ID-mapping for the new mount object;
+adding
+.B \%MOUNT_ATTR_IDMAP
+to
+.I attr.attr_set
+will configure the mount object to have the ID-mapping defined by
+the user namespace referenced by the file descriptor
+.IR attr.userns_fd .
+(The semantics of which are identical to when
+.BR mount_setattr (2)
+is used to configure
+.BR \%MOUNT_ATTR_IDMAP .)
+.IP
+Changing or removing the mapping
+of an ID-mapped mount is only permitted
+if a new detached mount object is being created with
+.I flags
+including
+.BR \%OPEN_TREE_CLONE .
+.\" Aleksa Sarai
+.\"  At time of writing, this is not actually true because of a bug where
+.\"  open_tree_attr() would accidentally permit changing MOUNT_ATTR_IDMAP for
+.\"  existing detached mount objects without setting OPEN_TREE_CLONE, but a
+.\"  patch to fix it has been slated for 6.18 and will be backported to 6.15+.
+.\"  <https://lore.kernel.org/r/20250808-open_tree_attr-bugfix-idmap-v1-0-0ec7bc05646c@cyphar.com/>
+.IP \[bu]
+If
+.I flags
+contains
+.BR \%AT_RECURSIVE ,
+then the attributes described in
+.I attr
+are applied recursively
+(just as when
+.BR mount_setattr (2)
+is called with
+.BR \%AT_RECURSIVE ).
+However, this applies in addition to the
+.BR open_tree ()-specific
+behaviour regarding
+.BR \%AT_RECURSIVE ,
+and thus
+.I flags
+must also contain
+.BR \%OPEN_TREE_CLONE .
+.P
+Note that if
+.I flags
+does not contain
+.BR \%OPEN_TREE_CLONE ,
+.BR open_tree_attr ()
+will attempt to modify the mount attributes of
+the mount object attached at
+the path described by
+.I dirfd
+and
+.IR path .
+As with
+.BR mount_setattr (2),
+if said path is not a mount point,
+.BR open_tree_attr ()
+will return an error.
 .SH RETURN VALUE
 On success, a new file descriptor is returned.
 On error, \-1 is returned, and
@@ -339,10 +474,15 @@ .SH ERRORS
 .SH STANDARDS
 Linux.
 .SH HISTORY
+.SS open_tree()
 Linux 5.2.
 .\" commit a07b20004793d8926f78d63eb5980559f7813404
 .\" commit 400913252d09f9cfb8cce33daee43167921fc343
 glibc 2.36.
+.SS open_tree_attr()
+Linux 6.15.
+.\" commit c4a16820d90199409c9bf01c4f794e1e9e8d8fd8
+.\" commit 7a54947e727b6df840780a66c970395ed9734ebe
 .SH NOTES
 .SS Mount propagation
 The bind-mount mount objects created by
diff --git a/man/man2/open_tree_attr.2 b/man/man2/open_tree_attr.2
new file mode 100644
index 0000000000000000000000000000000000000000..e57269bbd269bcce0b0a974425644ba75e379f2f
--- /dev/null
+++ b/man/man2/open_tree_attr.2
@@ -0,0 +1 @@
+.so man2/open_tree.2

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
                   ` (8 preceding siblings ...)
  2025-09-19  1:59 ` [PATCH v4 09/10] man/man2/open_tree{,_attr}.2: document new open_tree_attr() API Aleksa Sarai
@ 2025-09-19  1:59 ` Aleksa Sarai
  2025-09-19 12:23   ` Aleksa Sarai
                     ` (2 more replies)
  2025-09-21  2:43 ` [PATCH v4 00/10] man2: document "new" mount API Askar Safin
  10 siblings, 3 replies; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19  1:59 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner, Aleksa Sarai

This was not particularly well documented in mount(8) nor mount(2), and
since this is a fairly notable aspect of the new mount API, we should
probably add some words about it.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
---
 man/man2/fsconfig.2      | 12 ++++++++++++
 man/man2/mount_setattr.2 | 40 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/man/man2/fsconfig.2 b/man/man2/fsconfig.2
index 5a18e08c700ac93aa22c341b4134944ee3c38d0b..d827a7b96e08284fb025f94c3348a4acc4571b7d 100644
--- a/man/man2/fsconfig.2
+++ b/man/man2/fsconfig.2
@@ -579,6 +579,18 @@ .SS Generic filesystem parameters
 Linux Security Modules (LSMs)
 are also generic with respect to the underlying filesystem.
 See the documentation for the LSM you wish to configure for more details.
+.SS Mount attributes and filesystem parameters
+Some filesystem parameters
+(traditionally associated with
+.BR mount (8)-style
+options)
+have a sibling mount attribute
+with superficially similar user-facing behaviour.
+.P
+For a description of the distinction between
+mount attributes and filesystem parameters,
+see the "Mount attributes and filesystem parameters" subsection of
+.BR mount_setattr (2).
 .SH CAVEATS
 .SS Filesystem parameter types
 As a result of
diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2
index b27db5b96665cfb0c387bf5b60776d45e0139956..f7d0b96fddf97698e36cab020f1d695783143025 100644
--- a/man/man2/mount_setattr.2
+++ b/man/man2/mount_setattr.2
@@ -790,6 +790,46 @@ .SS ID-mapped mounts
 .BR chown (2)
 system call changes the ownership globally and permanently.
 .\"
+.SS Mount attributes and filesystem parameters
+Some mount attributes
+(traditionally associated with
+.BR mount (8)-style
+options)
+have a sibling mount attribute
+with superficially similar user-facing behaviour.
+For example, the
+.I -o ro
+option to
+.BR mount (8)
+can refer to the
+"read-only" filesystem parameter,
+or the "read-only" mount attribute.
+Both of these result in mount objects becoming read-only,
+but they do have different behaviour.
+.P
+The distinction between these two kinds of option is that
+mount object attributes are applied per-mount-object
+(allowing different mount objects
+derived from a given filesystem instance
+to have different attributes),
+while filesystem instance parameters
+("superblock flags" in kernel-developer parlance)
+apply to all mount objects
+derived from the same filesystem instance.
+.P
+When using
+.BR mount (2),
+the line between these two types of mount options was blurred.
+However, with
+.BR mount_setattr ()
+and
+.BR fsconfig (2),
+the distinction is made much clearer.
+Mount attributes are configured with
+.BR mount_setattr (),
+while filesystem parameters can be configured using
+.BR fsconfig (2).
+.\"
 .SS Extensibility
 In order to allow for future extensibility,
 .BR mount_setattr ()

-- 
2.51.0


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction
  2025-09-19  1:59 ` [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction Aleksa Sarai
@ 2025-09-19 12:23   ` Aleksa Sarai
  2025-09-21  9:59   ` Alejandro Colomar
  2025-09-22  1:06   ` Askar Safin
  2 siblings, 0 replies; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19 12:23 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 3536 bytes --]

On 2025-09-19, Aleksa Sarai <cyphar@cyphar.com> wrote:
> This was not particularly well documented in mount(8) nor mount(2), and
> since this is a fairly notable aspect of the new mount API, we should
> probably add some words about it.
> 
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
>  man/man2/fsconfig.2      | 12 ++++++++++++
>  man/man2/mount_setattr.2 | 40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 52 insertions(+)
> 
> diff --git a/man/man2/fsconfig.2 b/man/man2/fsconfig.2
> index 5a18e08c700ac93aa22c341b4134944ee3c38d0b..d827a7b96e08284fb025f94c3348a4acc4571b7d 100644
> --- a/man/man2/fsconfig.2
> +++ b/man/man2/fsconfig.2
> @@ -579,6 +579,18 @@ .SS Generic filesystem parameters
>  Linux Security Modules (LSMs)
>  are also generic with respect to the underlying filesystem.
>  See the documentation for the LSM you wish to configure for more details.
> +.SS Mount attributes and filesystem parameters
> +Some filesystem parameters
> +(traditionally associated with
> +.BR mount (8)-style
> +options)
> +have a sibling mount attribute
> +with superficially similar user-facing behaviour.
> +.P
> +For a description of the distinction between
> +mount attributes and filesystem parameters,
> +see the "Mount attributes and filesystem parameters" subsection of
> +.BR mount_setattr (2).
>  .SH CAVEATS
>  .SS Filesystem parameter types
>  As a result of
> diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2
> index b27db5b96665cfb0c387bf5b60776d45e0139956..f7d0b96fddf97698e36cab020f1d695783143025 100644
> --- a/man/man2/mount_setattr.2
> +++ b/man/man2/mount_setattr.2
> @@ -790,6 +790,46 @@ .SS ID-mapped mounts
>  .BR chown (2)
>  system call changes the ownership globally and permanently.
>  .\"
> +.SS Mount attributes and filesystem parameters
> +Some mount attributes
> +(traditionally associated with
> +.BR mount (8)-style
> +options)
> +have a sibling mount attribute
> +with superficially similar user-facing behaviour.
> +For example, the
> +.I -o ro
> +option to
> +.BR mount (8)
> +can refer to the
> +"read-only" filesystem parameter,
> +or the "read-only" mount attribute.
> +Both of these result in mount objects becoming read-only,
> +but they do have different behaviour.
> +.P
> +The distinction between these two kinds of option is that
> +mount object attributes are applied per-mount-object
> +(allowing different mount objects
> +derived from a given filesystem instance
> +to have different attributes),
> +while filesystem instance parameters
> +("superblock flags" in kernel-developer parlance)
> +apply to all mount objects
> +derived from the same filesystem instance.
> +.P
> +When using
> +.BR mount (2),
> +the line between these two types of mount options was blurred.
> +However, with
> +.BR mount_setattr ()
> +and
> +.BR fsconfig (2),
> +the distinction is made much clearer.
> +Mount attributes are configured with
> +.BR mount_setattr (),
> +while filesystem parameters can be configured using
                               are configured using

probably reads a bit better here. I'll include it in the next version if
this isn't merged, but I won't resend the whole patchset for a one-word
change.

> +.BR fsconfig (2).
> +.\"
>  .SS Extensibility
>  In order to allow for future extensibility,
>  .BR mount_setattr ()
> 
> -- 
> 2.51.0
> 

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 01/10] man/man2/mount_setattr.2: move mount_attr struct to mount_attr(2type)
  2025-09-19  1:59 ` [PATCH v4 01/10] man/man2/mount_setattr.2: move mount_attr struct to mount_attr(2type) Aleksa Sarai
@ 2025-09-19 16:04   ` Alejandro Colomar
  0 siblings, 0 replies; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-19 16:04 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 3663 bytes --]

On Fri, Sep 19, 2025 at 11:59:42AM +1000, Aleksa Sarai wrote:
> As with open_how(2type), it makes sense to move this to a separate man
> page.  In addition, future man pages added in this patchset will want to
> reference mount_attr(2type).
> 
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

Hi Aleksa,

Thanks!  I've applied this patch.


Have a lovely day!
Alex

> ---
>  man/man2/mount_setattr.2      | 17 ++++--------
>  man/man2type/mount_attr.2type | 61 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 66 insertions(+), 12 deletions(-)
> 
> diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2
> index 586633f48e894bf8f2823aa7755c96adcddea6a6..4b55f6d2e09d00d9bc4b3a085f310b1b459f34e8 100644
> --- a/man/man2/mount_setattr.2
> +++ b/man/man2/mount_setattr.2
> @@ -114,18 +114,11 @@ .SH DESCRIPTION
>  .I attr
>  argument of
>  .BR mount_setattr ()
> -is a structure of the following form:
> -.P
> -.in +4n
> -.EX
> -struct mount_attr {
> -    __u64 attr_set;     /* Mount properties to set */
> -    __u64 attr_clr;     /* Mount properties to clear */
> -    __u64 propagation;  /* Mount propagation type */
> -    __u64 userns_fd;    /* User namespace file descriptor */
> -};
> -.EE
> -.in
> +is a pointer to a
> +.I mount_attr
> +structure,
> +described in
> +.BR mount_attr (2type).
>  .P
>  The
>  .I attr_set
> diff --git a/man/man2type/mount_attr.2type b/man/man2type/mount_attr.2type
> new file mode 100644
> index 0000000000000000000000000000000000000000..f5c4f48be46ec1e6c0d3a211b6724a1e95311a41
> --- /dev/null
> +++ b/man/man2type/mount_attr.2type
> @@ -0,0 +1,61 @@
> +.\" Copyright, the authors of the Linux man-pages project
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH mount_attr 2type (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +mount_attr \- what mount properties to set and clear
> +.SH LIBRARY
> +Linux kernel headers
> +.SH SYNOPSIS
> +.EX
> +.B #include <sys/mount.h>
> +.P
> +.B struct mount_attr {
> +.BR "    u64 attr_set;" "     /* Mount properties to set */"
> +.BR "    u64 attr_clr;" "     /* Mount properties to clear */"
> +.BR "    u64 propagation;" "  /* Mount propagation type */"
> +.BR "    u64 userns_fd;" "    /* User namespace file descriptor */"
> +    /* ... */
> +.B };
> +.EE
> +.SH DESCRIPTION
> +Specifies which mount properties should be changed with
> +.BR mount_setattr (2).
> +.P
> +The fields are as follows:
> +.TP
> +.I .attr_set
> +This field specifies which
> +.BI MOUNT_ATTR_ *
> +attribute flags to set.
> +.TP
> +.I .attr_clr
> +This field specifies which
> +.BI MOUNT_ATTR_ *
> +attribute flags to clear.
> +.TP
> +.I .propagation
> +This field specifies what mount propagation will be applied.
> +The valid values of this field are the same propagation types described in
> +.BR mount_namespaces (7).
> +.TP
> +.I .userns_fd
> +This field specifies a file descriptor that indicates which user namespace to
> +use as a reference for ID-mapped mounts with
> +.BR MOUNT_ATTR_IDMAP .
> +.SH STANDARDS
> +Linux.
> +.SH HISTORY
> +Linux 5.12.
> +.\" commit 2a1867219c7b27f928e2545782b86daaf9ad50bd
> +glibc 2.36.
> +.P
> +Extra fields may be appended to the structure,
> +with a zero value in a new field resulting in
> +the kernel behaving as though that extension field was not present.
> +Therefore, a user
> +.I must
> +zero-fill this structure on initialization.
> +.SH SEE ALSO
> +.BR mount_setattr (2)
> 
> -- 
> 2.51.0
> 
> 

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API
  2025-09-19  1:59 ` [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API Aleksa Sarai
@ 2025-09-19 16:24   ` Alejandro Colomar
  2025-09-19 21:19     ` Aleksa Sarai
  0 siblings, 1 reply; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-19 16:24 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 15237 bytes --]

Hi Aleksa,

On Fri, Sep 19, 2025 at 11:59:43AM +1000, Aleksa Sarai wrote:
> This is loosely based on the original documentation written by David
> Howells and later maintained by Christian Brauner, but has been
> rewritten to be more from a user perspective (as well as fixing a few
> critical mistakes).
> 
> Co-authored-by: David Howells <dhowells@redhat.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Co-authored-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
>  man/man2/fsopen.2 | 384 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 384 insertions(+)
> 
> diff --git a/man/man2/fsopen.2 b/man/man2/fsopen.2
> new file mode 100644
> index 0000000000000000000000000000000000000000..7cdbeac7d64b7e5c969dee619a039ec947d1e981
> --- /dev/null
> +++ b/man/man2/fsopen.2
> @@ -0,0 +1,384 @@
> +.\" Copyright, the authors of the Linux man-pages project
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH fsopen 2 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +fsopen \- create a new filesystem context
> +.SH LIBRARY
> +Standard C library
> +.RI ( libc ,\~ \-lc )
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/mount.h>
> +.P
> +.BI "int fsopen(const char *" fsname ", unsigned int " flags );
> +.fi
> +.SH DESCRIPTION
> +The
> +.BR fsopen ()
> +system call is part of
> +the suite of file descriptor based mount facilities in Linux.

Minor nitpick (I can amend that; no worries):

Because 'file-descriptor-based' works as a single modifier of
facilities, it goes with hyphens.

> +.P
> +.BR fsopen ()
> +creates a blank filesystem configuration context within the kernel
> +for the filesystem named by
> +.I fsname
> +and places it into creation mode.
> +A new file descriptor
> +associated with the filesystem configuration context
> +is then returned.
> +The calling process must have the
> +.B \%CAP_SYS_ADMIN
> +capability in order to create a new filesystem configuration context.
> +.P
> +A filesystem configuration context is
> +an in-kernel representation of a pending transaction,
> +containing a set of configuration parameters that are to be applied
> +when creating a new instance of a filesystem
> +(or modifying the configuration of an existing filesystem instance,
> +such as when using
> +.BR fspick (2)).
> +.P
> +After obtaining a filesystem configuration context with
> +.BR fsopen (),
> +the general workflow for operating on the context looks like the following:
> +.IP (1) 5
> +Pass the filesystem context file descriptor to
> +.BR fsconfig (2)
> +to specify any desired filesystem parameters.
> +This may be done as many times as necessary.
> +.IP (2)
> +Pass the same filesystem context file descriptor to
> +.BR fsconfig (2)
> +with
> +.B \%FSCONFIG_CMD_CREATE
> +to create an instance of the configured filesystem.
> +.IP (3)
> +Pass the same filesystem context file descriptor to
> +.BR fsmount (2)
> +to create a new detached mount object for
> +the root of the filesystem instance,
> +which is then attached to a new file descriptor.
> +(This also places the filesystem context file descriptor into
> +reconfiguration mode,
> +similar to the mode produced by
> +.BR fspick (2).)
> +Once a mount object has been created with
> +.BR fsmount (2),
> +the filesystem context file descriptor can be safely closed.
> +.IP (4)
> +Now that a mount object has been created,
> +you may
> +.RS
> +.IP (4.1) 7
> +use the detached mount object file descriptor as a
> +.I dirfd
> +argument to "*at()" system calls; and/or
> +.IP (4.2) 7

I'll paste here the formatted part of this page:

        (4)  Now that a mount object has been created, you may
 
             (4.1)  use the detached mount object file descrip‐
                    tor as a dirfd argument to "*at()" system
                    calls; and/or
 
             (4.2)  attach the mount object to a mount point by
                    passing the mount object file descriptor to
                    move_mount(2).  This will also prevent the
                    mount object from being unmounted and de‐
                    stroyed when the mount object file descrip‐
                    tor is closed.

             The mount object file descriptor will remain asso‐
             ciated with the mount object even after doing the
             above operations, so you may repeatedly use the
             mount object file descriptor with move_mount(2)
             and/or "*at()" system calls as many times as neces‐
             sary.

That sublist seems to be an unordered one.  I think we should use
a bullet list for those items (the outer list 1,2,3,4 is okay as is).

       Bullet lists
              Elements are preceded by bullet symbols  (\[bu]).
              Anything  that  doesn't  fit elsewhere is usually
              covered by this type of list.

> +attach the mount object to a mount point
> +by passing the mount object file descriptor to
> +.BR move_mount (2).
> +This will also prevent the mount object from
> +being unmounted and destroyed when
> +the mount object file descriptor is closed.
> +.RE
> +.IP
> +The mount object file descriptor will
> +remain associated with the mount object
> +even after doing the above operations,
> +so you may repeatedly use the mount object file descriptor with
> +.BR move_mount (2)
> +and/or "*at()" system calls
> +as many times as necessary.
> +.P
> +A filesystem context will move between different modes
> +throughout its lifecycle
> +(such as the creation phase
> +when created with
> +.BR fsopen (),
> +the reconfiguration phase
> +when an existing filesystem instance is selected with
> +.BR fspick (2),
> +and the intermediate "awaiting-mount" phase
> +.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this.
> +between
> +.BR \%FSCONFIG_CMD_CREATE
> +and
> +.BR fsmount (2)),
> +which has an impact on
> +what operations are permitted on the filesystem context.
> +.P
> +The file descriptor returned by
> +.BR fsopen ()
> +also acts as a channel for filesystem drivers to
> +provide more comprehensive diagnostic information
> +than is normally provided through the standard
> +.BR errno (3)
> +interface for system calls.
> +If an error occurs at any time during the workflow mentioned above,
> +calling
> +.BR read (2)
> +on the filesystem context file descriptor
> +will retrieve any ancillary information about the encountered errors.
> +(See the "Message retrieval interface" section
> +for more details on the message format.)
> +.P
> +.I flags
> +can be used to control aspects of
> +the creation of the filesystem configuration context file descriptor.
> +A value for
> +.I flags
> +is constructed by bitwise ORing
> +zero or more of the following constants:
> +.RS
> +.TP
> +.B FSOPEN_CLOEXEC
> +Set the close-on-exec
> +.RB ( FD_CLOEXEC )
> +flag on the new file descriptor.
> +See the description of the
> +.B O_CLOEXEC
> +flag in
> +.BR open (2)
> +for reasons why this may be useful.
> +.RE
> +.P
> +A list of filesystems supported by the running kernel
> +(and thus a list of valid values for
> +.IR fsname )
> +can be obtained from
> +.IR /proc/filesystems .
> +(See also
> +.BR proc_filesystems (5).)
> +.SS Message retrieval interface
> +When doing operations on a filesystem configuration context,
> +the filesystem driver may choose to provide
> +ancillary information to userspace
> +in the form of message strings.
> +.P
> +The filesystem context file descriptors returned by
> +.BR fsopen ()
> +and
> +.BR fspick (2)
> +may be queried for message strings at any time by calling
> +.BR read (2)
> +on the file descriptor.
> +Each call to
> +.BR read (2)
> +will return a single message,
> +prefixed to indicate its class:
> +.RS
> +.TP
> +\fBe\fP <\fImessage\fP>

We don't use '<' and '>' for indicating variable parts.  We already use
italics for that.  The reason to avoid the '<' and '>' is that it is
confusing: it is often unclear if the '<' are literal or placeholders.

We only use '<' when they're literal.

I suspect your want

	.BI e\~ message

BTW, I'm assuming there's one space between the letter and the message,
and there are no literal '<'/'>', right?


Have a lovely day!
Alex

> +An error message was logged.
> +This is usually associated with an error being returned
> +from the corresponding system call which triggered this message.
> +.TP
> +\fBw\fP <\fImessage\fP>
> +A warning message was logged.
> +.TP
> +\fBi\fP <\fImessage\fP>
> +An informational message was logged.
> +.RE
> +.P
> +Messages are removed from the queue as they are read.
> +Note that the message queue has limited depth,
> +so it is possible for messages to get lost.
> +If there are no messages in the message queue,
> +.B read(2)
> +will return \-1 and
> +.I errno
> +will be set to
> +.BR \%ENODATA .
> +If the
> +.I buf
> +argument to
> +.BR read (2)
> +is not large enough to contain the entire message,
> +.BR read (2)
> +will return \-1 and
> +.I errno
> +will be set to
> +.BR \%EMSGSIZE .
> +(See BUGS.)
> +.P
> +If there are multiple filesystem contexts
> +referencing the same filesystem instance
> +(such as if you call
> +.BR fspick (2)
> +multiple times for the same mount),
> +each one gets its own independent message queue.
> +This does not apply to multiple file descriptors that are
> +tied to the same underlying open file description
> +(such as those created with
> +.BR dup (2)).
> +.P
> +Message strings will usually be prefixed by
> +the name of the filesystem or kernel subsystem
> +that logged the message,
> +though this may not always be the case.
> +See the Linux kernel source code for details.
> +.SH RETURN VALUE
> +On success, a new file descriptor is returned.
> +On error, \-1 is returned, and
> +.I errno
> +is set to indicate the error.
> +.SH ERRORS
> +.TP
> +.B EFAULT
> +.I fsname
> +is NULL
> +or a pointer to a location
> +outside the calling process's accessible address space.
> +.TP
> +.B EINVAL
> +.I flags
> +had an invalid flag set.
> +.TP
> +.B EMFILE
> +The calling process has too many open files to create more.
> +.TP
> +.B ENFILE
> +The system has too many open files to create more.
> +.TP
> +.B ENODEV
> +The filesystem named by
> +.I fsname
> +is not supported by the kernel.
> +.TP
> +.B ENOMEM
> +The kernel could not allocate sufficient memory to complete the operation.
> +.TP
> +.B EPERM
> +The calling process does not have the required
> +.B \%CAP_SYS_ADMIN
> +capability.
> +.SH STANDARDS
> +Linux.
> +.SH HISTORY
> +Linux 5.2.
> +.\" commit 24dcb3d90a1f67fe08c68a004af37df059d74005
> +.\" commit 400913252d09f9cfb8cce33daee43167921fc343
> +glibc 2.36.
> +.SH BUGS
> +.SS Message retrieval interface and \fB\%EMSGSIZE\fP
> +As described in the "Message retrieval interface" subsection above,
> +calling
> +.BR read (2)
> +with too small a buffer to contain
> +the next pending message in the message queue
> +for the filesystem configuration context
> +will cause
> +.BR read (2)
> +to return \-1 and set
> +.BR errno (3)
> +to
> +.BR \%EMSGSIZE .
> +.P
> +However,
> +this failed operation still
> +consumes the message from the message queue.
> +This effectively discards the message silently,
> +as no data is copied into the
> +.BR read (2)
> +buffer.
> +.P
> +Programs should take care to ensure that
> +their buffers are sufficiently large
> +to contain any reasonable message string,
> +in order to avoid silently losing valuable diagnostic information.
> +.\" Aleksa Sarai
> +.\"   This unfortunate behaviour has existed since this feature was merged, but
> +.\"   I have sent a patchset which will finally fix it.
> +.\"   <https://lore.kernel.org/r/20250807-fscontext-log-cleanups-v3-1-8d91d6242dc3@cyphar.com/>
> +.SH EXAMPLES
> +To illustrate the workflow for creating a new mount,
> +the following is an example of how to mount an
> +.BR ext4 (5)
> +filesystem stored on
> +.I /dev/sdb1
> +onto
> +.IR /mnt .
> +.P
> +.in +4n
> +.EX
> +int fsfd, mntfd;
> +\&
> +fsfd = fsopen("ext4", FSOPEN_CLOEXEC);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_SET_PATH, "source", "/dev/sdb1", AT_FDCWD);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "iversion", NULL, 0)
> +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> +mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_RELATIME);
> +move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> +.EE
> +.in
> +.P
> +First,
> +an ext4 configuration context is created and attached to the file descriptor
> +.IR fsfd .
> +Then, a series of parameters
> +(such as the source of the filesystem)
> +are provided using
> +.BR fsconfig (2),
> +followed by the filesystem instance being created with
> +.BR \%FSCONFIG_CMD_CREATE .
> +.BR fsmount (2)
> +is then used to create a new mount object attached to the file descriptor
> +.IR mntfd ,
> +which is then attached to the intended mount point using
> +.BR move_mount (2).
> +.P
> +The above procedure is functionally equivalent to
> +the following mount operation using
> +.BR mount (2):
> +.P
> +.in +4n
> +.EX
> +mount("/dev/sdb1", "/mnt", "ext4", MS_RELATIME,
> +      "ro,noatime,acl,user_xattr,iversion");
> +.EE
> +.in
> +.P
> +And here's an example of creating a mount object
> +of an NFS server share
> +and setting a Smack security module label.
> +However, instead of attaching it to a mount point,
> +the program uses the mount object directly
> +to open a file from the NFS share.
> +.P
> +.in +4n
> +.EX
> +int fsfd, mntfd, fd;
> +\&
> +fsfd = fsopen("nfs", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "example.com/pub/linux", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "nfsvers", "3", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "rsize", "65536", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "wsize", "65536", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "smackfsdef", "foolabel", 0);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "rdma", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> +mntfd = fsmount(fsfd, 0, MOUNT_ATTR_NODEV);
> +fd = openat(mntfd, "src/linux-5.2.tar.xz", O_RDONLY);
> +.EE
> +.in
> +.P
> +Unlike the previous example,
> +this operation has no trivial equivalent with
> +.BR mount (2),
> +as it was not previously possible to create a mount object
> +that is not attached to any mount point.
> +.SH SEE ALSO
> +.BR fsconfig (2),
> +.BR fsmount (2),
> +.BR fspick (2),
> +.BR mount (2),
> +.BR mount_setattr (2),
> +.BR move_mount (2),
> +.BR open_tree (2),
> +.BR mount_namespaces (7)
> 
> -- 
> 2.51.0
> 

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API
  2025-09-19 16:24   ` Alejandro Colomar
@ 2025-09-19 21:19     ` Aleksa Sarai
  2025-09-21  1:33       ` Aleksa Sarai
  0 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-19 21:19 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 16729 bytes --]

On 2025-09-19, Alejandro Colomar <alx@kernel.org> wrote:
> Hi Aleksa,
> 
> On Fri, Sep 19, 2025 at 11:59:43AM +1000, Aleksa Sarai wrote:
> > This is loosely based on the original documentation written by David
> > Howells and later maintained by Christian Brauner, but has been
> > rewritten to be more from a user perspective (as well as fixing a few
> > critical mistakes).
> > 
> > Co-authored-by: David Howells <dhowells@redhat.com>
> > Signed-off-by: David Howells <dhowells@redhat.com>
> > Co-authored-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> > ---
> >  man/man2/fsopen.2 | 384 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 384 insertions(+)
> > 
> > diff --git a/man/man2/fsopen.2 b/man/man2/fsopen.2
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..7cdbeac7d64b7e5c969dee619a039ec947d1e981
> > --- /dev/null
> > +++ b/man/man2/fsopen.2
> > @@ -0,0 +1,384 @@
> > +.\" Copyright, the authors of the Linux man-pages project
> > +.\"
> > +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> > +.\"
> > +.TH fsopen 2 (date) "Linux man-pages (unreleased)"
> > +.SH NAME
> > +fsopen \- create a new filesystem context
> > +.SH LIBRARY
> > +Standard C library
> > +.RI ( libc ,\~ \-lc )
> > +.SH SYNOPSIS
> > +.nf
> > +.B #include <sys/mount.h>
> > +.P
> > +.BI "int fsopen(const char *" fsname ", unsigned int " flags );
> > +.fi
> > +.SH DESCRIPTION
> > +The
> > +.BR fsopen ()
> > +system call is part of
> > +the suite of file descriptor based mount facilities in Linux.
> 
> Minor nitpick (I can amend that; no worries):
> 
> Because 'file-descriptor-based' works as a single modifier of
> facilities, it goes with hyphens.

Will do for all of the new pages.

> > +.P
> > +.BR fsopen ()
> > +creates a blank filesystem configuration context within the kernel
> > +for the filesystem named by
> > +.I fsname
> > +and places it into creation mode.
> > +A new file descriptor
> > +associated with the filesystem configuration context
> > +is then returned.
> > +The calling process must have the
> > +.B \%CAP_SYS_ADMIN
> > +capability in order to create a new filesystem configuration context.
> > +.P
> > +A filesystem configuration context is
> > +an in-kernel representation of a pending transaction,
> > +containing a set of configuration parameters that are to be applied
> > +when creating a new instance of a filesystem
> > +(or modifying the configuration of an existing filesystem instance,
> > +such as when using
> > +.BR fspick (2)).
> > +.P
> > +After obtaining a filesystem configuration context with
> > +.BR fsopen (),
> > +the general workflow for operating on the context looks like the following:
> > +.IP (1) 5
> > +Pass the filesystem context file descriptor to
> > +.BR fsconfig (2)
> > +to specify any desired filesystem parameters.
> > +This may be done as many times as necessary.
> > +.IP (2)
> > +Pass the same filesystem context file descriptor to
> > +.BR fsconfig (2)
> > +with
> > +.B \%FSCONFIG_CMD_CREATE
> > +to create an instance of the configured filesystem.
> > +.IP (3)
> > +Pass the same filesystem context file descriptor to
> > +.BR fsmount (2)
> > +to create a new detached mount object for
> > +the root of the filesystem instance,
> > +which is then attached to a new file descriptor.
> > +(This also places the filesystem context file descriptor into
> > +reconfiguration mode,
> > +similar to the mode produced by
> > +.BR fspick (2).)
> > +Once a mount object has been created with
> > +.BR fsmount (2),
> > +the filesystem context file descriptor can be safely closed.
> > +.IP (4)
> > +Now that a mount object has been created,
> > +you may
> > +.RS
> > +.IP (4.1) 7
> > +use the detached mount object file descriptor as a
> > +.I dirfd
> > +argument to "*at()" system calls; and/or
> > +.IP (4.2) 7
> 
> I'll paste here the formatted part of this page:
> 
>         (4)  Now that a mount object has been created, you may
>  
>              (4.1)  use the detached mount object file descrip‐
>                     tor as a dirfd argument to "*at()" system
>                     calls; and/or
>  
>              (4.2)  attach the mount object to a mount point by
>                     passing the mount object file descriptor to
>                     move_mount(2).  This will also prevent the
>                     mount object from being unmounted and de‐
>                     stroyed when the mount object file descrip‐
>                     tor is closed.
> 
>              The mount object file descriptor will remain asso‐
>              ciated with the mount object even after doing the
>              above operations, so you may repeatedly use the
>              mount object file descriptor with move_mount(2)
>              and/or "*at()" system calls as many times as neces‐
>              sary.
> 
> That sublist seems to be an unordered one.  I think we should use
> a bullet list for those items (the outer list 1,2,3,4 is okay as is).
> 
>        Bullet lists
>               Elements are preceded by bullet symbols  (\[bu]).
>               Anything  that  doesn't  fit elsewhere is usually
>               covered by this type of list.

I think I originally planned to reference the inner list (which would
require non-bullets), but it seems I dropped that text later. I'll fix
this up.

> > +attach the mount object to a mount point
> > +by passing the mount object file descriptor to
> > +.BR move_mount (2).
> > +This will also prevent the mount object from
> > +being unmounted and destroyed when
> > +the mount object file descriptor is closed.
> > +.RE
> > +.IP
> > +The mount object file descriptor will
> > +remain associated with the mount object
> > +even after doing the above operations,
> > +so you may repeatedly use the mount object file descriptor with
> > +.BR move_mount (2)
> > +and/or "*at()" system calls
> > +as many times as necessary.
> > +.P
> > +A filesystem context will move between different modes
> > +throughout its lifecycle
> > +(such as the creation phase
> > +when created with
> > +.BR fsopen (),
> > +the reconfiguration phase
> > +when an existing filesystem instance is selected with
> > +.BR fspick (2),
> > +and the intermediate "awaiting-mount" phase
> > +.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this.
> > +between
> > +.BR \%FSCONFIG_CMD_CREATE
> > +and
> > +.BR fsmount (2)),
> > +which has an impact on
> > +what operations are permitted on the filesystem context.
> > +.P
> > +The file descriptor returned by
> > +.BR fsopen ()
> > +also acts as a channel for filesystem drivers to
> > +provide more comprehensive diagnostic information
> > +than is normally provided through the standard
> > +.BR errno (3)
> > +interface for system calls.
> > +If an error occurs at any time during the workflow mentioned above,
> > +calling
> > +.BR read (2)
> > +on the filesystem context file descriptor
> > +will retrieve any ancillary information about the encountered errors.
> > +(See the "Message retrieval interface" section
> > +for more details on the message format.)
> > +.P
> > +.I flags
> > +can be used to control aspects of
> > +the creation of the filesystem configuration context file descriptor.
> > +A value for
> > +.I flags
> > +is constructed by bitwise ORing
> > +zero or more of the following constants:
> > +.RS
> > +.TP
> > +.B FSOPEN_CLOEXEC
> > +Set the close-on-exec
> > +.RB ( FD_CLOEXEC )
> > +flag on the new file descriptor.
> > +See the description of the
> > +.B O_CLOEXEC
> > +flag in
> > +.BR open (2)
> > +for reasons why this may be useful.
> > +.RE
> > +.P
> > +A list of filesystems supported by the running kernel
> > +(and thus a list of valid values for
> > +.IR fsname )
> > +can be obtained from
> > +.IR /proc/filesystems .
> > +(See also
> > +.BR proc_filesystems (5).)
> > +.SS Message retrieval interface
> > +When doing operations on a filesystem configuration context,
> > +the filesystem driver may choose to provide
> > +ancillary information to userspace
> > +in the form of message strings.
> > +.P
> > +The filesystem context file descriptors returned by
> > +.BR fsopen ()
> > +and
> > +.BR fspick (2)
> > +may be queried for message strings at any time by calling
> > +.BR read (2)
> > +on the file descriptor.
> > +Each call to
> > +.BR read (2)
> > +will return a single message,
> > +prefixed to indicate its class:
> > +.RS
> > +.TP
> > +\fBe\fP <\fImessage\fP>
> 
> We don't use '<' and '>' for indicating variable parts.  We already use
> italics for that.  The reason to avoid the '<' and '>' is that it is
> confusing: it is often unclear if the '<' are literal or placeholders.
> 
> We only use '<' when they're literal.
> 
> I suspect your want
> 
> 	.BI e\~ message
> 
> BTW, I'm assuming there's one space between the letter and the message,
> and there are no literal '<'/'>', right?

No there aren't literal '<' / '>' characters, it's just a regular
string. I guess I was struggling with the (lack of) distinction between
using italics for variable names and things that are variable.

> Have a lovely day!
> Alex
> 
> > +An error message was logged.
> > +This is usually associated with an error being returned
> > +from the corresponding system call which triggered this message.
> > +.TP
> > +\fBw\fP <\fImessage\fP>
> > +A warning message was logged.
> > +.TP
> > +\fBi\fP <\fImessage\fP>
> > +An informational message was logged.
> > +.RE
> > +.P
> > +Messages are removed from the queue as they are read.
> > +Note that the message queue has limited depth,
> > +so it is possible for messages to get lost.
> > +If there are no messages in the message queue,
> > +.B read(2)
> > +will return \-1 and
> > +.I errno
> > +will be set to
> > +.BR \%ENODATA .
> > +If the
> > +.I buf
> > +argument to
> > +.BR read (2)
> > +is not large enough to contain the entire message,
> > +.BR read (2)
> > +will return \-1 and
> > +.I errno
> > +will be set to
> > +.BR \%EMSGSIZE .
> > +(See BUGS.)
> > +.P
> > +If there are multiple filesystem contexts
> > +referencing the same filesystem instance
> > +(such as if you call
> > +.BR fspick (2)
> > +multiple times for the same mount),
> > +each one gets its own independent message queue.
> > +This does not apply to multiple file descriptors that are
> > +tied to the same underlying open file description
> > +(such as those created with
> > +.BR dup (2)).
> > +.P
> > +Message strings will usually be prefixed by
> > +the name of the filesystem or kernel subsystem
> > +that logged the message,
> > +though this may not always be the case.
> > +See the Linux kernel source code for details.
> > +.SH RETURN VALUE
> > +On success, a new file descriptor is returned.
> > +On error, \-1 is returned, and
> > +.I errno
> > +is set to indicate the error.
> > +.SH ERRORS
> > +.TP
> > +.B EFAULT
> > +.I fsname
> > +is NULL
> > +or a pointer to a location
> > +outside the calling process's accessible address space.
> > +.TP
> > +.B EINVAL
> > +.I flags
> > +had an invalid flag set.
> > +.TP
> > +.B EMFILE
> > +The calling process has too many open files to create more.
> > +.TP
> > +.B ENFILE
> > +The system has too many open files to create more.
> > +.TP
> > +.B ENODEV
> > +The filesystem named by
> > +.I fsname
> > +is not supported by the kernel.
> > +.TP
> > +.B ENOMEM
> > +The kernel could not allocate sufficient memory to complete the operation.
> > +.TP
> > +.B EPERM
> > +The calling process does not have the required
> > +.B \%CAP_SYS_ADMIN
> > +capability.
> > +.SH STANDARDS
> > +Linux.
> > +.SH HISTORY
> > +Linux 5.2.
> > +.\" commit 24dcb3d90a1f67fe08c68a004af37df059d74005
> > +.\" commit 400913252d09f9cfb8cce33daee43167921fc343
> > +glibc 2.36.
> > +.SH BUGS
> > +.SS Message retrieval interface and \fB\%EMSGSIZE\fP
> > +As described in the "Message retrieval interface" subsection above,
> > +calling
> > +.BR read (2)
> > +with too small a buffer to contain
> > +the next pending message in the message queue
> > +for the filesystem configuration context
> > +will cause
> > +.BR read (2)
> > +to return \-1 and set
> > +.BR errno (3)
> > +to
> > +.BR \%EMSGSIZE .
> > +.P
> > +However,
> > +this failed operation still
> > +consumes the message from the message queue.
> > +This effectively discards the message silently,
> > +as no data is copied into the
> > +.BR read (2)
> > +buffer.
> > +.P
> > +Programs should take care to ensure that
> > +their buffers are sufficiently large
> > +to contain any reasonable message string,
> > +in order to avoid silently losing valuable diagnostic information.
> > +.\" Aleksa Sarai
> > +.\"   This unfortunate behaviour has existed since this feature was merged, but
> > +.\"   I have sent a patchset which will finally fix it.
> > +.\"   <https://lore.kernel.org/r/20250807-fscontext-log-cleanups-v3-1-8d91d6242dc3@cyphar.com/>
> > +.SH EXAMPLES
> > +To illustrate the workflow for creating a new mount,
> > +the following is an example of how to mount an
> > +.BR ext4 (5)
> > +filesystem stored on
> > +.I /dev/sdb1
> > +onto
> > +.IR /mnt .
> > +.P
> > +.in +4n
> > +.EX
> > +int fsfd, mntfd;
> > +\&
> > +fsfd = fsopen("ext4", FSOPEN_CLOEXEC);
> > +fsconfig(fsfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
> > +fsconfig(fsfd, FSCONFIG_SET_PATH, "source", "/dev/sdb1", AT_FDCWD);
> > +fsconfig(fsfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0);
> > +fsconfig(fsfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
> > +fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
> > +fsconfig(fsfd, FSCONFIG_SET_FLAG, "iversion", NULL, 0)
> > +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> > +mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_RELATIME);
> > +move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> > +.EE
> > +.in
> > +.P
> > +First,
> > +an ext4 configuration context is created and attached to the file descriptor
> > +.IR fsfd .
> > +Then, a series of parameters
> > +(such as the source of the filesystem)
> > +are provided using
> > +.BR fsconfig (2),
> > +followed by the filesystem instance being created with
> > +.BR \%FSCONFIG_CMD_CREATE .
> > +.BR fsmount (2)
> > +is then used to create a new mount object attached to the file descriptor
> > +.IR mntfd ,
> > +which is then attached to the intended mount point using
> > +.BR move_mount (2).
> > +.P
> > +The above procedure is functionally equivalent to
> > +the following mount operation using
> > +.BR mount (2):
> > +.P
> > +.in +4n
> > +.EX
> > +mount("/dev/sdb1", "/mnt", "ext4", MS_RELATIME,
> > +      "ro,noatime,acl,user_xattr,iversion");
> > +.EE
> > +.in
> > +.P
> > +And here's an example of creating a mount object
> > +of an NFS server share
> > +and setting a Smack security module label.
> > +However, instead of attaching it to a mount point,
> > +the program uses the mount object directly
> > +to open a file from the NFS share.
> > +.P
> > +.in +4n
> > +.EX
> > +int fsfd, mntfd, fd;
> > +\&
> > +fsfd = fsopen("nfs", 0);
> > +fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "example.com/pub/linux", 0);
> > +fsconfig(fsfd, FSCONFIG_SET_STRING, "nfsvers", "3", 0);
> > +fsconfig(fsfd, FSCONFIG_SET_STRING, "rsize", "65536", 0);
> > +fsconfig(fsfd, FSCONFIG_SET_STRING, "wsize", "65536", 0);
> > +fsconfig(fsfd, FSCONFIG_SET_STRING, "smackfsdef", "foolabel", 0);
> > +fsconfig(fsfd, FSCONFIG_SET_FLAG, "rdma", NULL, 0);
> > +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> > +mntfd = fsmount(fsfd, 0, MOUNT_ATTR_NODEV);
> > +fd = openat(mntfd, "src/linux-5.2.tar.xz", O_RDONLY);
> > +.EE
> > +.in
> > +.P
> > +Unlike the previous example,
> > +this operation has no trivial equivalent with
> > +.BR mount (2),
> > +as it was not previously possible to create a mount object
> > +that is not attached to any mount point.
> > +.SH SEE ALSO
> > +.BR fsconfig (2),
> > +.BR fsmount (2),
> > +.BR fspick (2),
> > +.BR mount (2),
> > +.BR mount_setattr (2),
> > +.BR move_mount (2),
> > +.BR open_tree (2),
> > +.BR mount_namespaces (7)
> > 
> > -- 
> > 2.51.0
> > 
> 
> -- 
> <https://www.alejandro-colomar.es>
> Use port 80 (that is, <...:80/>).



-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API
  2025-09-19 21:19     ` Aleksa Sarai
@ 2025-09-21  1:33       ` Aleksa Sarai
  2025-09-21  8:39         ` Alejandro Colomar
  0 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-21  1:33 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 2257 bytes --]

On 2025-09-20, Aleksa Sarai <cyphar@cyphar.com> wrote:
> On 2025-09-19, Alejandro Colomar <alx@kernel.org> wrote:
> > Hi Aleksa,
> > 
> > On Fri, Sep 19, 2025 at 11:59:43AM +1000, Aleksa Sarai wrote:
> > > This is loosely based on the original documentation written by David
> > > Howells and later maintained by Christian Brauner, but has been
> > > rewritten to be more from a user perspective (as well as fixing a few
> > > critical mistakes).
> > > 
> > > Co-authored-by: David Howells <dhowells@redhat.com>
> > > Signed-off-by: David Howells <dhowells@redhat.com>
> > > Co-authored-by: Christian Brauner <brauner@kernel.org>
> > > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> > > ---
> > >  man/man2/fsopen.2 | 384 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 384 insertions(+)
> > > 
> > > diff --git a/man/man2/fsopen.2 b/man/man2/fsopen.2
> > > new file mode 100644
> > > index 0000000000000000000000000000000000000000..7cdbeac7d64b7e5c969dee619a039ec947d1e981
> > > --- /dev/null
> > > +++ b/man/man2/fsopen.2
> > > @@ -0,0 +1,384 @@
> > > +.\" Copyright, the authors of the Linux man-pages project
> > > +.\"
> > > +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> > > +.\"
> > > +.TH fsopen 2 (date) "Linux man-pages (unreleased)"
> > > +.SH NAME
> > > +fsopen \- create a new filesystem context
> > > +.SH LIBRARY
> > > +Standard C library
> > > +.RI ( libc ,\~ \-lc )
> > > +.SH SYNOPSIS
> > > +.nf
> > > +.B #include <sys/mount.h>
> > > +.P
> > > +.BI "int fsopen(const char *" fsname ", unsigned int " flags );
> > > +.fi
> > > +.SH DESCRIPTION
> > > +The
> > > +.BR fsopen ()
> > > +system call is part of
> > > +the suite of file descriptor based mount facilities in Linux.
> > 
> > Minor nitpick (I can amend that; no worries):
> > 
> > Because 'file-descriptor-based' works as a single modifier of
> > facilities, it goes with hyphens.
> 
> Will do for all of the new pages.

By the way, I'll wait for your review of all of the remaining man-pages
before sending v5. Thanks!

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 00/10] man2: document "new" mount API
  2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
                   ` (9 preceding siblings ...)
  2025-09-19  1:59 ` [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction Aleksa Sarai
@ 2025-09-21  2:43 ` Askar Safin
  2025-09-21  9:58   ` Aleksa Sarai
  10 siblings, 1 reply; 43+ messages in thread
From: Askar Safin @ 2025-09-21  2:43 UTC (permalink / raw)
  To: cyphar
  Cc: alx, brauner, dhowells, g.branden.robinson, jack, linux-api,
	linux-fsdevel, linux-kernel, linux-man, mtk.manpages, safinaskar,
	viro

Aleksa, thank you! Don't give up. We all need these manpages.

I see you didn't address some my previous notes.

* move_mount(2) still says "Mount objects cannot be attached beneath the filesystem root".
I suggest saying "root directory" or "root" or "root directory of the process" or just "/"
instead. But you may keep this phrase as is, of course.

* Docs for FSPICK_NO_AUTOMOUNT in fspick(2) are still wrong. They say that FSPICK_NO_AUTOMOUNT
affects all components of path. Similar thing applies to mount_setattr(2) and move_mount(2)

* open_tree(2) still says:
> If flags does not contain OPEN_TREE_CLONE, open_tree() returns a file descriptor
> that is exactly equivalent to one produced by openat(2) when called with the same dirfd and path.

This is not true if automounts are involved. I suggest adding "modulo automounts". But you may
keep everything, of course.

-- 
Askar Safin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API
  2025-09-21  1:33       ` Aleksa Sarai
@ 2025-09-21  8:39         ` Alejandro Colomar
  0 siblings, 0 replies; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-21  8:39 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 299 bytes --]

Hi Aleksa,

On Sun, Sep 21, 2025 at 11:33:34AM +1000, Aleksa Sarai wrote:
> By the way, I'll wait for your review of all of the remaining man-pages
> before sending v5. Thanks!

Okay.


Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 03/10] man/man2/fspick.2: document "new" mount API
  2025-09-19  1:59 ` [PATCH v4 03/10] man/man2/fspick.2: " Aleksa Sarai
@ 2025-09-21  8:53   ` Alejandro Colomar
  2025-09-21 14:55     ` Aleksa Sarai
  2025-09-22  0:25   ` Askar Safin
  1 sibling, 1 reply; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-21  8:53 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 10367 bytes --]

Hi Aleksa,

On Fri, Sep 19, 2025 at 11:59:44AM +1000, Aleksa Sarai wrote:
> This is loosely based on the original documentation written by David
> Howells and later maintained by Christian Brauner, but has been
> rewritten to be more from a user perspective (as well as fixing a few
> critical mistakes).
> 
> Co-authored-by: David Howells <dhowells@redhat.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Co-authored-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
>  man/man2/fspick.2 | 342 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 342 insertions(+)
> 
> diff --git a/man/man2/fspick.2 b/man/man2/fspick.2
> new file mode 100644
> index 0000000000000000000000000000000000000000..1f87293f44658adeb7ab7cffebcac3174888f040
> --- /dev/null
> +++ b/man/man2/fspick.2
> @@ -0,0 +1,342 @@
> +.\" Copyright, the authors of the Linux man-pages project
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH fspick 2 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +fspick \- select filesystem for reconfiguration
> +.SH LIBRARY
> +Standard C library
> +.RI ( libc ,\~ \-lc )
> +.SH SYNOPSIS
> +.nf
> +.BR "#include <fcntl.h>" "          /* Definition of " AT_* " constants */"
> +.B #include <sys/mount.h>
> +.P
> +.BI "int fspick(int " dirfd ", const char *" path ", unsigned int " flags );
> +.fi
> +.SH DESCRIPTION
> +The
> +.BR fspick ()
> +system call is part of
> +the suite of file descriptor based mount facilities in Linux.
> +.P
> +.BR fspick()
> +creates a new filesystem configuration context
> +for the extant filesystem instance
> +associated with the path described by
> +.IR dirfd
> +and
> +.IR path ,
> +places it into reconfiguration mode
> +(similar to
> +.BR mount (8)
> +with the
> +.I -o remount
> +option).
> +A new file descriptor
> +associated with the filesystem configuration context
> +is then returned.
> +The calling process must have the
> +.BR CAP_SYS_ADMIN

This should use '.B. (Bold).  BR means alternating Bold and Roman, but
this only has one token, so it can't alternate.

If you run `make -R build-catman-troff`, this will trigger a diagnostic:

	an.tmac: <page>:<line>: style: .BR expects at least 2 arguments, got 1

> +capability in order to create a new filesystem configuration context.
> +.P
> +The resultant file descriptor can be used with
> +.BR fsconfig (2)
> +to specify the desired set of changes to
> +filesystem parameters of the filesystem instance.
> +Once the desired set of changes have been configured,
> +the changes can be effectuated by calling
> +.BR fsconfig (2)
> +with the
> +.B \%FSCONFIG_CMD_RECONFIGURE
> +command.
> +Please note that\[em]in contrast to
> +the behaviour of
> +.B MS_REMOUNT
> +with
> +.BR mount (2)\[em] fspick ()

Only have one important keyword per macro call.  In this case, I prefer
em dashes to only be attached to one side, as if they were parentheses,
so we don't need any tricks:

	Please note that
	\[em]in contrast to
	...
	.BR mount (2)\[em]
	.BR fspick ()

> +instantiates the filesystem configuration context
> +with a copy of
> +the extant filesystem's filesystem parameters,
> +meaning that a subsequent
> +.B \%FSCONFIG_CMD_RECONFIGURE
> +operation
> +will only update filesystem parameters
> +explicitly modified with
> +.BR fsconfig (2).
> +.P
> +As with "*at()" system calls,
> +.BR fspick ()
> +uses the
> +.I dirfd
> +argument in conjunction with the
> +.I path
> +argument to determine the path to operate on, as follows:
> +.IP \[bu] 3
> +If the pathname given in
> +.I path
> +is absolute, then
> +.I dirfd
> +is ignored.
> +.IP \[bu]
> +If the pathname given in
> +.I path
> +is relative and
> +.I dirfd
> +is the special value
> +.BR \%AT_FDCWD ,
> +then
> +.I path
> +is interpreted relative to
> +the current working directory
> +of the calling process (like
> +.BR open (2)).
> +.IP \[bu]
> +If the pathname given in
> +.I path
> +is relative,
> +then it is interpreted relative to
> +the directory referred to by the file descriptor
> +.I dirfd
> +(rather than relative to
> +the current working directory
> +of the calling process,
> +as is done by
> +.BR open (2)
> +for a relative pathname).
> +In this case,
> +.I dirfd
> +must be a directory
> +that was opened for reading
> +.RB ( O_RDONLY )
> +or using the
> +.B O_PATH
> +flag.
> +.IP \[bu]
> +If
> +.I path
> +is an empty string,
> +and
> +.I flags
> +contains
> +.BR \%FSPICK_EMPTY_PATH ,
> +then the file descriptor
> +.I dirfd
> +is operated on directly.
> +In this case,
> +.I dirfd
> +may refer to any type of file,
> +not just a directory.
> +.P
> +See
> +.BR openat (2)
> +for an explanation of why the
> +.I dirfd
> +argument is useful.
> +.P
> +.I flags
> +can be used to control aspects of how
> +.I path
> +is resolved and
> +properties of the returned file descriptor.
> +A value for
> +.I flags
> +is constructed by bitwise ORing
> +zero or more of the following constants:
> +.RS
> +.TP
> +.B FSPICK_CLOEXEC
> +Set the close-on-exec
> +.RB ( FD_CLOEXEC )
> +flag on the new file descriptor.
> +See the description of the
> +.B O_CLOEXEC
> +flag in
> +.BR open (2)
> +for reasons why this may be useful.
> +.TP
> +.B FSPICK_EMPTY_PATH
> +If
> +.I path
> +is an empty string,
> +operate on the file referred to by
> +.I dirfd
> +(which may have been obtained from
> +.BR open (2),
> +.BR fsmount (2),
> +or
> +.BR open_tree (2)).
> +In this case,
> +.I dirfd
> +may refer to any type of file,
> +not just a directory.
> +If
> +.I dirfd
> +is
> +.BR \%AT_FDCWD ,
> +.BR fspick ()
> +will operate on the current working directory
> +of the calling process.
> +.TP
> +.B FSPICK_SYMLINK_NOFOLLOW
> +Do not follow symbolic links
> +in the terminal component of
> +.IR path .
> +If
> +.I path
> +references a symbolic link,
> +the returned filesystem context will reference
> +the filesystem that the symbolic link itself resides on.
> +.TP
> +.B FSPICK_NO_AUTOMOUNT
> +Do not automount any automount points encountered
> +while resolving
> +.IR path .
> +This allows you to reconfigure an automount point,
> +rather than the location that would be mounted.
> +This flag has no effect if
> +the automount point has already been mounted over.

I'll amend other similar issues if I find them, but in general, I'd put
the 'if' in the next line, as it is more tied to that part of the
sentence (think for example that if you reversed the sentence to say
"if ..., then ...", you'd move the 'if' with what follows it.  You don't
need to search for all of these and fix them; just keep it in mind for
next time.  In general I like the break points you used.


Have a lovely day!
Alex

> +.RE
> +.P
> +As with filesystem contexts created with
> +.BR fsopen (2),
> +the file descriptor returned by
> +.BR fspick ()
> +may be queried for message strings at any time by calling
> +.BR read (2)
> +on the file descriptor.
> +(See the "Message retrieval interface" subsection in
> +.BR fsopen (2)
> +for more details on the message format.)
> +.SH RETURN VALUE
> +On success, a new file descriptor is returned.
> +On error, \-1 is returned, and
> +.I errno
> +is set to indicate the error.
> +.SH ERRORS
> +.TP
> +.B EACCES
> +Search permission is denied
> +for one of the directories
> +in the path prefix of
> +.IR path .
> +(See also
> +.BR path_resolution (7).)
> +.TP
> +.B EBADF
> +.I path
> +is relative but
> +.I dirfd
> +is neither
> +.B \%AT_FDCWD
> +nor a valid file descriptor.
> +.TP
> +.B EFAULT
> +.I path
> +is NULL
> +or a pointer to a location
> +outside the calling process's accessible address space.
> +.TP
> +.B EINVAL
> +Invalid flag specified in
> +.IR flags .
> +.TP
> +.B ELOOP
> +Too many symbolic links encountered when resolving
> +.IR path .
> +.TP
> +.B EMFILE
> +The calling process has too many open files to create more.
> +.TP
> +.B ENAMETOOLONG
> +.I path
> +is longer than
> +.BR PATH_MAX .
> +.TP
> +.B ENFILE
> +The system has too many open files to create more.
> +.TP
> +.B ENOENT
> +A component of
> +.I path
> +does not exist,
> +or is a dangling symbolic link.
> +.TP
> +.B ENOENT
> +.I path
> +is an empty string, but
> +.B \%FSPICK_EMPTY_PATH
> +is not specified in
> +.IR flags .
> +.TP
> +.B ENOTDIR
> +A component of the path prefix of
> +.I path
> +is not a directory;
> +or
> +.I path
> +is relative and
> +.I dirfd
> +is a file descriptor referring to a file other than a directory.
> +.TP
> +.B ENOMEM
> +The kernel could not allocate sufficient memory to complete the operation.
> +.TP
> +.B EPERM
> +The calling process does not have the required
> +.B \%CAP_SYS_ADMIN
> +capability.
> +.SH STANDARDS
> +Linux.
> +.SH HISTORY
> +Linux 5.2.
> +.\" commit cf3cba4a429be43e5527a3f78859b1bfd9ebc5fb
> +.\" commit 400913252d09f9cfb8cce33daee43167921fc343
> +glibc 2.36.
> +.SH EXAMPLES
> +The following example sets the read-only flag
> +on the filesystem instance referenced by
> +the mount object attached at
> +.IR /tmp .
> +.P
> +.in +4n
> +.EX
> +int fsfd = fspick(AT_FDCWD, "/tmp", FSPICK_CLOEXEC);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
> +.EE
> +.in
> +.P
> +The above procedure is roughly equivalent to
> +the following mount operation using
> +.BR mount (2):
> +.P
> +.in +4n
> +.EX
> +mount(NULL, "/tmp", NULL, MS_REMOUNT | MS_RDONLY, NULL);
> +.EE
> +.in
> +.P
> +With the notable caveat that
> +in this example,
> +.BR mount (2)
> +will clear all other filesystem parameters
> +(such as
> +.B MS_NOSUID
> +or
> +.BR MS_NOEXEC );
> +.BR fsconfig (2)
> +will only modify the
> +.I ro
> +parameter.
> +.SH SEE ALSO
> +.BR fsconfig (2),
> +.BR fsmount (2),
> +.BR fsopen (2),
> +.BR mount (2),
> +.BR mount_setattr (2),
> +.BR move_mount (2),
> +.BR open_tree (2),
> +.BR mount_namespaces (7)
> +
> 
> -- 
> 2.51.0
> 

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 04/10] man/man2/fsconfig.2: document "new" mount API
  2025-09-19  1:59 ` [PATCH v4 04/10] man/man2/fsconfig.2: " Aleksa Sarai
@ 2025-09-21  9:26   ` Alejandro Colomar
  2025-09-24  6:41     ` Aleksa Sarai
  0 siblings, 1 reply; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-21  9:26 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 22478 bytes --]

Hi Aleksa,

On Fri, Sep 19, 2025 at 11:59:45AM +1000, Aleksa Sarai wrote:
> This is loosely based on the original documentation written by David
> Howells and later maintained by Christian Brauner, but has been
> rewritten to be more from a user perspective (as well as fixing a few
> critical mistakes).
> 
> Co-authored-by: David Howells <dhowells@redhat.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Co-authored-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
>  man/man2/fsconfig.2 | 727 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 727 insertions(+)
> 
> diff --git a/man/man2/fsconfig.2 b/man/man2/fsconfig.2
> new file mode 100644
> index 0000000000000000000000000000000000000000..5a18e08c700ac93aa22c341b4134944ee3c38d0b
> --- /dev/null
> +++ b/man/man2/fsconfig.2
> @@ -0,0 +1,727 @@
> +.\" Copyright, the authors of the Linux man-pages project
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH fsconfig 2 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +fsconfig \- configure new or existing filesystem context
> +.SH LIBRARY
> +Standard C library
> +.RI ( libc ,\~ \-lc )
> +.SH SYNOPSIS
> +.nf
> +.B #include <sys/mount.h>
> +.P
> +.BI "int fsconfig(int " fd ", unsigned int " cmd ,
> +.BI "             const char *_Nullable " key ,
> +.BI "             const void *_Nullable " value ", int " aux );
> +.fi
> +.SH DESCRIPTION
> +The
> +.BR fsconfig ()
> +system call is part of
> +the suite of file descriptor based mount facilities in Linux.
> +.P
> +.BR fsconfig ()
> +is used to supply parameters to
> +and issue commands against
> +the filesystem configuration context
> +associated with the file descriptor
> +.IR fd .
> +Filesystem configuration contexts can be created with
> +.BR fsopen (2)
> +or be instantiated from an extant filesystem instance with
> +.BR fspick (2).
> +.P
> +The
> +.I cmd
> +argument indicates the command to be issued.
> +Some commands supply parameters to the context
> +(equivalent to mount options specified with
> +.BR mount (8)),
> +while others are meta-operations on the filesystem context.
> +The list of valid
> +.I cmd
> +values are:

I think I would have this page split into one page per command.

I would keep an overview in this page, of the main system call, and the
descriptions of each subcommand would go into each separate page.

You could have a look at fcntl(2), which has been the most recent page
split, and let me know what you think.


Have a lovely day!
Alex

> +.RS
> +.TP
> +.B FSCONFIG_SET_FLAG
> +Set the flag parameter named by
> +.IR key .
> +.I value
> +must be NULL,
> +and
> +.I aux
> +must be 0.
> +.TP
> +.B FSCONFIG_SET_STRING
> +Set the string parameter named by
> +.I key
> +to the value specified by
> +.IR value .
> +.I value
> +points to a null-terminated string,
> +and
> +.I aux
> +must be 0.
> +.TP
> +.B FSCONFIG_SET_BINARY
> +Set the blob parameter named by
> +.I key
> +to the contents of the binary blob
> +specified by
> +.IR value .
> +.I value
> +points to
> +the start of a buffer
> +that is
> +.I aux
> +bytes in length.
> +.TP
> +.B FSCONFIG_SET_FD
> +Set the file parameter named by
> +.I key
> +to the open file description
> +referenced by the file descriptor
> +.IR aux .
> +.I value
> +must be NULL.
> +.IP
> +You may also use
> +.B \%FSCONFIG_SET_STRING
> +for file parameters,
> +with
> +.I value
> +set to a null-terminated string
> +containing a base-10 representation
> +of the file descriptor number.
> +This mechanism is primarily intended for compatibility
> +with older
> +.BR mount (2)-based
> +programs,
> +and only works for parameters
> +that
> +.I only
> +accept file descriptor arguments.
> +.TP
> +.B FSCONFIG_SET_PATH
> +Set the path parameter named by
> +.I key
> +to the object at a provided path,
> +resolved in a similar manner to
> +.BR openat (2).
> +.I value
> +points to a null-terminated pathname string,
> +and
> +.I aux
> +is equivalent to the
> +.I dirfd
> +argument to
> +.BR openat (2).
> +See
> +.BR openat (2)
> +for an explanation of the need for
> +.BR \%FSCONFIG_SET_PATH .
> +.IP
> +You may also use
> +.B \%FSCONFIG_SET_STRING
> +for path parameters,
> +the behaviour of which is equivalent to
> +.B \%FSCONFIG_SET_PATH
> +with
> +.I aux
> +set to
> +.BR \%AT_FDCWD .
> +.TP
> +.B FSCONFIG_SET_PATH_EMPTY
> +As with
> +.BR \%FSCONFIG_SET_PATH ,
> +except that if
> +.I value
> +is an empty string,
> +the file descriptor specified by
> +.I aux
> +is operated on directly
> +and may be any type of file
> +(not just a directory).
> +This is equivalent to the behaviour of
> +.B \%AT_EMPTY_PATH
> +with most "*at()" system calls.
> +If
> +.I aux
> +is
> +.BR \%AT_FDCWD ,
> +the parameter will be set to
> +the current working directory
> +of the calling process.
> +.TP
> +.B FSCONFIG_CMD_CREATE
> +This command instructs the filesystem driver
> +to instantiate an instance of the filesystem in the kernel
> +with the parameters specified in the filesystem configuration context.
> +.I key
> +and
> +.I value
> +must be NULL,
> +and
> +.I aux
> +must be 0.
> +.IP
> +This command can only be issued once
> +in the lifetime of a filesystem context.
> +If the operation succeeds,
> +the filesystem context
> +associated with file descriptor
> +.I fd
> +now references the created filesystem instance,
> +and is placed into a special "awaiting-mount" mode
> +that allows you to use
> +.BR fsmount (2)
> +to create a mount object from the filesystem instance.
> +.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this.
> +If the operation fails,
> +in most cases
> +the filesystem context is placed in a failed mode
> +and cannot be used for any further
> +.BR fsconfig ()
> +operations
> +(though you may still retrieve diagnostic messages
> +through the message retrieval interface,
> +as described in
> +the corresponding subsection of
> +.BR fsopen (2)).
> +.IP
> +This command can only be issued against
> +filesystem configuration contexts
> +that were created with
> +.BR fsopen (2).
> +In order to create a filesystem instance,
> +the calling process must have the
> +.B \%CAP_SYS_ADMIN
> +capability.
> +.IP
> +An important thing to be aware of is that
> +the Linux kernel will
> +.I silently
> +reuse extant filesystem instances
> +depending on the filesystem type
> +and the configured parameters
> +(each filesystem driver has
> +its own policy for
> +how filesystem instances are reused).
> +This means that
> +the filesystem instance "created" by
> +.B \%FSCONFIG_CMD_CREATE
> +may, in fact, be a reference
> +to an extant filesystem instance in the kernel.
> +(For reference,
> +this behaviour also applies to
> +.BR mount (2).)
> +.IP
> +One side-effect of this behaviour is that
> +if an extant filesystem instance is reused,
> +.I all
> +parameters configured
> +for this filesystem configuration context
> +are
> +.I silently ignored
> +(with the exception of the
> +.I ro
> +and
> +.I rw
> +flag parameters;
> +if the state of the read-only flag in the
> +extant filesystem instance and the filesystem configuration context
> +do not match, this operation will return
> +.BR EBUSY ).
> +This also means that
> +.BR \%FSCONFIG_CMD_RECONFIGURE
> +commands issued against
> +the "created" filesystem instance
> +will also affect any mount objects associated with
> +the extant filesystem instance.
> +.IP
> +Programs that need to ensure
> +that they create a new filesystem instance
> +with specific parameters
> +(notably, security-related parameters
> +such as
> +.I acl
> +to enable POSIX ACLs\[em]as described in
> +.BR acl (5))
> +should use
> +.B \%FSCONFIG_CMD_CREATE_EXCL
> +instead.
> +.TP
> +.BR FSCONFIG_CMD_CREATE_EXCL " (since Linux 6.6)"
> +.\" commit 22ed7ecdaefe0cac0c6e6295e83048af60435b13
> +.\" commit 84ab1277ce5a90a8d1f377707d662ac43cc0918a
> +As with
> +.BR \%FSCONFIG_CMD_CREATE ,
> +except that the kernel is instructed
> +to not reuse extant filesystem instances.
> +If the operation
> +would be forced to
> +reuse an extant filesystem instance,
> +this operation will return
> +.B EBUSY
> +instead.
> +.IP
> +As a result (unlike
> +.BR \%FSCONFIG_CMD_CREATE ),
> +if this operation succeeds
> +then the calling process can be sure that
> +all of the parameters successfully configured with
> +.BR fsconfig ()
> +will actually be applied
> +to the created filesystem instance.
> +.TP
> +.B FSCONFIG_CMD_RECONFIGURE
> +This command instructs the filesystem driver
> +to apply the parameters specified in the filesystem configuration context
> +to the extant filesystem instance
> +referenced by the filesystem configuration context.
> +.I key
> +and
> +.I value
> +must be NULL,
> +and
> +.I aux
> +must be 0.
> +.IP
> +This is primarily intended for use with
> +.BR fspick (2),
> +but may also be used to modify
> +the parameters of a filesystem instance
> +after
> +.BR \%FSCONFIG_CMD_CREATE
> +was used to create it
> +and a mount object was created using
> +.BR fsmount (2).
> +In order to reconfigure an extant filesystem instance,
> +the calling process must have the
> +.B CAP_SYS_ADMIN
> +capability.
> +.IP
> +If the operation succeeds,
> +the filesystem context is reset
> +but remains in reconfiguration mode
> +and thus can be reused for subsequent
> +.B \%FSCONFIG_CMD_RECONFIGURE
> +commands.
> +If the operation fails,
> +in most cases
> +the filesystem context is placed in a failed mode
> +and cannot be used for any further
> +.BR fsconfig ()
> +operations
> +(though you may still retrieve diagnostic messages
> +through the message retrieval interface,
> +as described in
> +the corresponding subsection of
> +.BR fsopen (2)).
> +.RE
> +.P
> +Parameters specified with
> +.BI FSCONFIG_SET_ *
> +do not take effect
> +until a corresponding
> +.B \%FSCONFIG_CMD_CREATE
> +or
> +.B \%FSCONFIG_CMD_RECONFIGURE
> +command is issued.
> +.SH RETURN VALUE
> +On success,
> +.BR fsconfig ()
> +returns 0.
> +On error, \-1 is returned, and
> +.I errno
> +is set to indicate the error.
> +.SH ERRORS
> +If an error occurs, the filesystem driver may provide
> +additional information about the error
> +through the message retrieval interface for filesystem configuration contexts.
> +This additional information can be retrieved at any time by calling
> +.BR read (2)
> +on the filesystem instance or filesystem configuration context
> +referenced by the file descriptor
> +.IR fd .
> +(See the "Message retrieval interface" subsection in
> +.BR fsopen (2)
> +for more details on the message format.)
> +.P
> +Even after an error occurs,
> +the filesystem configuration context is
> +.I not
> +invalidated,
> +and thus can still be used with other
> +.BR fsconfig ()
> +commands.
> +This means that users can probe support for filesystem parameters
> +on a per-parameter basis,
> +and adjust which parameters they wish to set.
> +.P
> +The error values given below result from
> +filesystem type independent errors.
> +Each filesystem type may have its own special errors
> +and its own special behavior.
> +See the Linux kernel source code for details.
> +.TP
> +.B EACCES
> +A component of a path
> +provided as a path parameter
> +was not searchable.
> +(See also
> +.BR path_resolution (7).)
> +.TP
> +.B EACCES
> +.B \%FSCONFIG_CMD_CREATE
> +was attempted
> +for a read-only filesystem
> +without specifying the
> +.RB ' ro '
> +flag parameter.
> +.TP
> +.B EACCES
> +A specified block device parameter
> +is located on a filesystem
> +mounted with the
> +.B \%MS_NODEV
> +option.
> +.TP
> +.B EBADF
> +The file descriptor given by
> +.I fd
> +(or possibly by
> +.IR aux ,
> +depending on the command)
> +is invalid.
> +.TP
> +.B EBUSY
> +The filesystem context associated with
> +.I fd
> +is in the wrong state
> +for the given command.
> +.TP
> +.B EBUSY
> +The filesystem instance cannot be reconfigured as read-only
> +with
> +.B \%FSCONFIG_CMD_RECONFIGURE
> +because some programs
> +still hold files open for writing.
> +.TP
> +.B EBUSY
> +A new filesystem instance was requested with
> +.B \%FSCONFIG_CMD_CREATE_EXCL
> +but a matching superblock already existed.
> +.TP
> +.B EFAULT
> +One of the pointer arguments
> +points to a location
> +outside the calling process's accessible address space.
> +.TP
> +.B EINVAL
> +.I fd
> +does not refer to
> +a filesystem configuration context
> +or filesystem instance.
> +.TP
> +.B EINVAL
> +One of the values of
> +.IR name ,
> +.IR value ,
> +and/or
> +.I aux
> +were set to a non-zero value when
> +.I cmd
> +required that they be zero
> +(or NULL).
> +.TP
> +.B EINVAL
> +The parameter named by
> +.I name
> +cannot be set
> +using the type specified with
> +.IR cmd .
> +.TP
> +.B EINVAL
> +One of the source parameters
> +referred to
> +an invalid superblock.
> +.TP
> +.B ELOOP
> +Too many links encountered
> +during pathname resolution
> +of a path argument.
> +.TP
> +.B ENAMETOOLONG
> +A path argument was longer than
> +.BR PATH_MAX .
> +.TP
> +.B ENOENT
> +A path argument had a non-existent component.
> +.TP
> +.B ENOENT
> +A path argument is an empty string,
> +but
> +.I cmd
> +is not
> +.BR \%FSCONFIG_SET_PATH_EMPTY .
> +.TP
> +.B ENOMEM
> +The kernel could not allocate sufficient memory to complete the operation.
> +.TP
> +.B ENOTBLK
> +The parameter named by
> +.I name
> +must be a block device,
> +but the provided parameter value was not a block device.
> +.TP
> +.B ENOTDIR
> +A component of the path prefix
> +of a path argument
> +was not a directory.
> +.TP
> +.B EOPNOTSUPP
> +The command given by
> +.I cmd
> +is not valid.
> +.TP
> +.B ENXIO
> +The major number
> +of a block device parameter
> +is out of range.
> +.TP
> +.B EPERM
> +The command given by
> +.I cmd
> +was
> +.BR \%FSCONFIG_CMD_CREATE ,
> +.BR \%FSCONFIG_CMD_CREATE_EXCL ,
> +or
> +.BR \%FSCONFIG_CMD_RECONFIGURE ,
> +but the calling process does not have the required
> +.B \%CAP_SYS_ADMIN
> +capability.
> +.SH STANDARDS
> +Linux.
> +.SH HISTORY
> +Linux 5.2.
> +.\" commit ecdab150fddb42fe6a739335257949220033b782
> +.\" commit 400913252d09f9cfb8cce33daee43167921fc343
> +glibc 2.36.
> +.SH NOTES
> +.SS Generic filesystem parameters
> +Each filesystem driver is responsible for
> +parsing most parameters specified with
> +.BR fsconfig (),
> +meaning that individual filesystems
> +may have very different behaviour
> +when encountering parameters with the same name.
> +In general,
> +you should not assume that the behaviour of
> +.BR fsconfig ()
> +when specifying a parameter to one filesystem type
> +will match the behaviour of the same parameter
> +with a different filesystem type.
> +.P
> +However,
> +the following generic parameters
> +apply to all filesystems and have unified behaviour.
> +They are set using the listed
> +.BI \%FSCONFIG_SET_ *
> +command.
> +.TP
> +\fIro\fP and \fIrw\fP (\fB\%FSCONFIG_SET_FLAG\fP)
> +Configure whether the filesystem instance is read-only.
> +.TP
> +\fIdirsync\fP (\fB\%FSCONFIG_SET_FLAG\fP)
> +Make directory changes on this filesystem instance synchronous.
> +.TP
> +\fIsync\fP and \fIasync\fP (\fB\%FSCONFIG_SET_FLAG\fP)
> +Configure whether writes on this filesystem instance
> +will be made synchronous
> +(as though the
> +.B O_SYNC
> +flag to
> +.BR open (2)
> +was specified for
> +all file opens in this filesystem instance).
> +.TP
> +\fIlazytime\fP and \fInolazytime\fP (\fB\%FSCONFIG_SET_FLAG\fP)
> +Configure whether to reduce on-disk updates
> +of inode timestamps on this filesystem instance
> +(as described in the
> +.B \%MS_LAZYTIME
> +section of
> +.BR mount (2)).
> +.TP
> +\fImand\fP and \fInomand\fP (\fB\%FSCONFIG_SET_FLAG\fP)
> +Configure whether the filesystem instance should permit mandatory locking.
> +Since Linux 5.15,
> +.\" commit f7e33bdbd6d1bdf9c3df8bba5abcf3399f957ac3
> +mandatory locking has been deprecated
> +and setting this flag is a no-op.
> +.TP
> +\fIsource\fP (\fB\%FSCONFIG_SET_STRING\fP)
> +This parameter is equivalent to the
> +.I source
> +parameter passed to
> +.BR mount (2)
> +for the same filesystem type,
> +and is usually the pathname of a block device
> +containing the filesystem.
> +This parameter may only be set once
> +per filesystem configuration context transaction.
> +.P
> +In addition,
> +any filesystem parameters associated with
> +Linux Security Modules (LSMs)
> +are also generic with respect to the underlying filesystem.
> +See the documentation for the LSM you wish to configure for more details.
> +.SH CAVEATS
> +.SS Filesystem parameter types
> +As a result of
> +each filesystem driver being responsible for
> +parsing most parameters specified with
> +.BR fsconfig (),
> +some filesystem drivers
> +may have unintuitive behaviour
> +with regards to which
> +.BI \%FSCONFIG_SET_ *
> +commands are permitted
> +to configure a given parameter.
> +.P
> +In order for
> +filesystem parameters to be backwards compatible with
> +.BR mount (2),
> +they must be parseable as strings;
> +this almost universally means that
> +.B \%FSCONFIG_SET_STRING
> +can also be used to configure them.
> +.\" Aleksa Sarai
> +.\"   Theoretically, a filesystem could check fc->oldapi and refuse
> +.\"   FSCONFIG_SET_STRING if the operation is coming from the new API, but no
> +.\"   filesystems do this (and probably never will).
> +However, other
> +.BI \%FSCONFIG_SET_ *
> +commands need to be opted into
> +by each filesystem driver's parameter parser.
> +.P
> +One of the most user-visible instances of
> +this inconsistency is that
> +many filesystems do not support
> +configuring path parameters with
> +.B \%FSCONFIG_SET_PATH
> +(despite the name),
> +which can lead to somewhat confusing
> +.B EINVAL
> +errors.
> +(For example, the generic
> +.I source
> +parameter\[em]which is usually a path\[em]can only be configured
> +with
> +.BR \%FSCONFIG_SET_STRING .)
> +.P
> +When writing programs that use
> +.BR fsconfig ()
> +to configure parameters
> +with commands other than
> +.BR \%FSCONFIG_SET_STRING ,
> +users should verify
> +that the
> +.BI \%FSCONFIG_SET_ *
> +commands used to configure each parameter
> +are supported by the corresponding filesystem driver.
> +.\" Aleksa Sarai
> +.\"   While this (quite confusing) inconsistency in behaviour is true today
> +.\"   (and has been true since this was merged), this appears to mostly be an
> +.\"   unintended consequence of filesystem drivers hand-coding fsparam parsing.
> +.\"   Path parameters are the most eggregious causes of confusion. Hopefully we
> +.\"   can make this no longer the case in a future kernel.
> +.SH EXAMPLES
> +To illustrate the different kinds of flags that can be configured with
> +.BR fsconfig (),
> +here are a few examples of some different filesystems being created:
> +.P
> +.in +4n
> +.EX
> +int fsfd, mntfd;
> +\&
> +fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "inode64", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "uid", "1234", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "huge", "never", 0);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "casefold", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> +mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOEXEC);
> +move_mount(mntfd, "", AT_FDCWD, "/tmp", MOVE_MOUNT_F_EMPTY_PATH);
> +\&
> +fsfd = fsopen("erofs", FSOPEN_CLOEXEC);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/loop0", 0);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
> +fsconfig(fsfd, FSCONFIG_CMD_CREATE_EXCL, NULL, NULL, 0);
> +mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOSUID);
> +move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> +.EE
> +.in
> +.P
> +Usually,
> +specifying the same parameter named by
> +.I key
> +multiple times with
> +.BR fsconfig ()
> +causes the parameter value to be replaced.
> +However, some filesystems may have unique behaviour:
> +.P
> +.in +4n
> +.EX
> +\&
> +int fsfd, mntfd;
> +int lowerdirfd = open("/o/ctr/lower1", O_DIRECTORY | O_CLOEXEC);
> +\&
> +fsfd = fsopen("overlay", FSOPEN_CLOEXEC);
> +/* "lowerdir+" appends to the lower dir stack each time. */
> +fsconfig(fsfd, FSCONFIG_SET_FD, "lowerdir+", NULL, lowerdirfd);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower2", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower3", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower4", 0);
> +.\" fsconfig(fsfd, FSCONFIG_SET_PATH, "lowerdir+", "/o/ctr/lower5", AT_FDCWD);
> +.\" fsconfig(fsfd, FSCONFIG_SET_PATH_EMPTY, "lowerdir+", "", lowerdirfd);
> +.\" Aleksa Sarai: Hopefully these will also be supported in the future.
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "xino", "auto", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "nfs_export", "off", 0);
> +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
> +mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, 0);
> +move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> +.EE
> +.in
> +.P
> +And here is an example of how
> +.BR fspick (2)
> +can be used with
> +.BR fsconfig ()
> +to reconfigure the parameters
> +of an extant filesystem instance
> +attached to
> +.IR /proc :
> +.P
> +.in +4n
> +.EX
> +int fsfd = fspick(AT_FDCWD, "/proc", FSPICK_CLOEXEC);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "hidepid", "ptraceable", 0);
> +fsconfig(fsfd, FSCONFIG_SET_STRING, "subset", "pid", 0);
> +fsconfig(fsfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
> +.EE
> +.in
> +.SH SEE ALSO
> +.BR fsmount (2),
> +.BR fsopen (2),
> +.BR fspick (2),
> +.BR mount (2),
> +.BR mount_setattr (2),
> +.BR move_mount (2),
> +.BR open_tree (2),
> +.BR mount_namespaces (7)
> +
> 
> -- 
> 2.51.0
> 

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 07/10] man/man2/open_tree.2: document "new" mount API
  2025-09-19  1:59 ` [PATCH v4 07/10] man/man2/open_tree.2: " Aleksa Sarai
@ 2025-09-21  9:47   ` Alejandro Colomar
  2025-09-22 10:09     ` Aleksa Sarai
  2025-09-30 12:32     ` G. Branden Robinson
  0 siblings, 2 replies; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-21  9:47 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 15113 bytes --]

Hi Aleksa,

On Fri, Sep 19, 2025 at 11:59:48AM +1000, Aleksa Sarai wrote:
> This is loosely based on the original documentation written by David
> Howells and later maintained by Christian Brauner, but has been
> rewritten to be more from a user perspective (as well as fixing a few
> critical mistakes).
> 
> Co-authored-by: David Howells <dhowells@redhat.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> Co-authored-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
>  man/man2/open_tree.2 | 498 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 498 insertions(+)
> 
> diff --git a/man/man2/open_tree.2 b/man/man2/open_tree.2
> new file mode 100644
> index 0000000000000000000000000000000000000000..7f85df08b43c7b48a9d021dbbeb2c60092a2b2d4
> --- /dev/null
> +++ b/man/man2/open_tree.2
> @@ -0,0 +1,498 @@
> +.\" Copyright, the authors of the Linux man-pages project
> +.\"
> +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> +.\"
> +.TH open_tree 2 (date) "Linux man-pages (unreleased)"
> +.SH NAME
> +open_tree \- open path or create detached mount object and attach to fd
> +.SH LIBRARY
> +Standard C library
> +.RI ( libc ,\~ \-lc )
> +.SH SYNOPSIS
> +.nf
> +.BR "#define _GNU_SOURCE         " "/* See feature_test_macros(7) */"
> +.BR "#include <fcntl.h>" "          /* Definition of " AT_* " constants */"
> +.B #include <sys/mount.h>
> +.P
> +.BI "int open_tree(int " dirfd ", const char *" path ", unsigned int " flags );
> +.fi
> +.SH DESCRIPTION
> +The
> +.BR open_tree ()
> +system call is part of
> +the suite of file descriptor based mount facilities in Linux.
> +.IP \[bu] 3
> +If
> +.I flags
> +contains
> +.BR \%OPEN_TREE_CLONE ,
> +.BR open_tree ()
> +creates a detached mount object
> +which consists of a bind-mount of
> +the path specified by the
> +.IR path .
> +A new file descriptor
> +associated with the detached mount object
> +is then returned.
> +The mount object is equivalent to a bind-mount
> +that would be created by
> +.BR mount (2)
> +called with
> +.BR MS_BIND ,
> +except that it is tied to a file descriptor
> +and is not mounted onto the filesystem.
> +.IP
> +As with file descriptors returned from
> +.BR fsmount (2),
> +the resultant file descriptor can then be used with
> +.BR move_mount (2),
> +.BR mount_setattr (2),
> +or other such system calls to do further mount operations.
> +This mount object will be unmounted and destroyed
> +when the file descriptor is closed
> +if it was not otherwise attached to a mount point
> +by calling
> +.BR move_mount (2).
> +(Note that the unmount operation on

Maybe I would make this note a paragraph of its own; this would give it
more visibility, I think.  And I'd remove 'Note that', and start
directly with the noted contents (everything in a manual page must be
noteworthy, in general).

> +.BR close (2)

I'm a bit confused by the reference to close(2).  The previous text
mentions closing, but not close(2), so I'm not sure if this refers to
that or if it is comparing it to close(2).  Would you mind having a look
at the wording of this entire paragraph?

> +is lazy\[em]akin to calling

I prefer em dashes in both sides of the parenthetical; it more clearly
denotes where it ends.

	is lazy
	\[em]akin to calling
	.BR umount2 (2)
	with
	.BR MOUNT_DETACH \[em];

(I assume that's where it ends.)

> +.BR umount2 (2)
> +with
> +.BR MOUNT_DETACH ;
> +any existing open references to files
> +from the mount object
> +will continue to work,
> +and the mount object will only be completely destroyed
> +once it ceases to be busy.)
> +.IP \[bu]
> +If
> +.I flags
> +does not contain
> +.BR \%OPEN_TREE_CLONE ,
> +.BR open_tree ()
> +returns a file descriptor
> +that is exactly equivalent to
> +one produced by
> +.BR openat (2)
> +when called with the same
> +.I dirfd
> +and
> +.IR path .
> +.P
> +In either case, the resultant file descriptor
> +acts the same as one produced by
> +.BR open (2)
> +with
> +.BR O_PATH ,
> +meaning it can also be used as a
> +.I dirfd
> +argument to
> +"*at()" system calls.
> +.P
> +As with "*at()" system calls,
> +.BR open_tree ()
> +uses the
> +.I dirfd
> +argument in conjunction with the
> +.I path
> +argument to determine the path to operate on, as follows:
> +.IP \[bu] 3
> +If the pathname given in
> +.I path
> +is absolute, then
> +.I dirfd
> +is ignored.
> +.IP \[bu]
> +If the pathname given in
> +.I path
> +is relative and
> +.I dirfd
> +is the special value
> +.BR \%AT_FDCWD ,
> +then
> +.I path
> +is interpreted relative to
> +the current working directory
> +of the calling process (like
> +.BR open (2)).
> +.IP \[bu]
> +If the pathname given in
> +.I path
> +is relative,
> +then it is interpreted relative to
> +the directory referred to by the file descriptor
> +.I dirfd
> +(rather than relative to
> +the current working directory
> +of the calling process,
> +as is done by
> +.BR open (2)
> +for a relative pathname).
> +In this case,
> +.I dirfd
> +must be a directory
> +that was opened for reading
> +.RB ( O_RDONLY )
> +or using the
> +.B O_PATH
> +flag.
> +.IP \[bu]
> +If
> +.I path
> +is an empty string,
> +and
> +.I flags
> +contains
> +.BR \%AT_EMPTY_PATH ,
> +then the file descriptor
> +.I dirfd
> +is operated on directly.
> +In this case,
> +.I dirfd
> +may refer to any type of file,
> +not just a directory.
> +.P
> +See
> +.BR openat (2)
> +for an explanation of why the
> +.I dirfd
> +argument is useful.
> +.P
> +.I flags
> +can be used to control aspects of the path lookup
> +and properties of the returned file descriptor.
> +A value for
> +.I flags
> +is constructed by bitwise ORing
> +zero or more of the following constants:
> +.RS
> +.TP
> +.B \%AT_EMPTY_PATH
> +If
> +.I path
> +is an empty string, operate on the file referred to by
> +.I dirfd
> +(which may have been obtained from
> +.BR open (2),
> +.BR fsmount(2),
> +or from another
> +.BR open_tree ()
> +call).
> +In this case,
> +.I dirfd
> +may refer to any type of file, not just a directory.
> +If
> +.I dirfd
> +is
> +.BR \%AT_FDCWD ,
> +.BR open_tree ()
> +will operate on the current working directory
> +of the calling process.
> +This flag is Linux-specific; define
> +.B \%_GNU_SOURCE
> +to obtain its definition.
> +.TP
> +.B \%AT_NO_AUTOMOUNT
> +Do not automount the terminal ("basename") component of
> +.I path
> +if it is a directory that is an automount point.
> +This allows you to create a handle to the automount point itself,
> +rather than the location it would mount.
> +This flag has no effect if the mount point has already been mounted over.
> +This flag is Linux-specific; define
> +.B \%_GNU_SOURCE
> +to obtain its definition.
> +.TP
> +.B \%AT_SYMLINK_NOFOLLOW
> +If
> +.I path
> +is a symbolic link, do not dereference it; instead,
> +create either a handle to the link itself
> +or a bind-mount of it.
> +The resultant file descriptor is indistinguishable from one produced by
> +.BR openat (2)
> +with
> +.BR \%O_PATH | O_NOFOLLLOW .
> +.TP
> +.B \%OPEN_TREE_CLOEXEC
> +Set the close-on-exec
> +.RB ( FD_CLOEXEC )
> +flag on the new file descriptor.
> +See the description of the
> +.B O_CLOEXEC
> +flag in
> +.BR open (2)
> +for reasons why this may be useful.
> +.TP
> +.B \%OPEN_TREE_CLONE
> +Rather than creating an
> +.BR openat (2)-style
> +.B O_PATH
> +file descriptor,
> +create a bind-mount of
> +.I path
> +(akin to
> +.IR "mount --bind" )

You need to escape dashes in manual pages.  Otherwise, they're formatted
as hyphens, which can't be pasted into the terminal (and another
consequence is not being able to search for them in the man(1) reader
with literal dashes).

Depending on your system, you might be able to search for them or paste
them to the terminal, because some distros patch this in
/etc/local/an.tmac, at the expense of generating lower quality pages,
but in general don't rely on that.

I've noticed now, but this probably also happens in previous pages in
this patch set.

While at it, you should also use a non-breaking space, to keep the
entire command in the same line.

	.IR \%mount\~\-\-bind )


Cheers,
Alex

> +as a detached mount object.
> +In order to do this operation,
> +the calling process must have the
> +.BR \%CAP_SYS_ADMIN
> +capability.
> +.TP
> +.B \%AT_RECURSIVE
> +Create a recursive bind-mount of the path
> +(akin to
> +.IR "mount --rbind" )
> +as a detached mount object.
> +This flag is only permitted in conjunction with
> +.BR \%OPEN_TREE_CLONE .
> +.SH RETURN VALUE
> +On success, a new file descriptor is returned.
> +On error, \-1 is returned, and
> +.I errno
> +is set to indicate the error.
> +.SH ERRORS
> +.TP
> +.B EACCES
> +Search permission is denied for one of the directories
> +in the path prefix of
> +.IR path .
> +(See also
> +.BR path_resolution (7).)
> +.TP
> +.B EBADF
> +.I path
> +is relative but
> +.I dirfd
> +is neither
> +.B \%AT_FDCWD
> +nor a valid file descriptor.
> +.TP
> +.B EFAULT
> +.I path
> +is NULL
> +or a pointer to a location
> +outside the calling process's accessible address space.
> +.TP
> +.B EINVAL
> +Invalid flag specified in
> +.IR flags .
> +.TP
> +.B ELOOP
> +Too many symbolic links encountered when resolving
> +.IR path .
> +.TP
> +.B EMFILE
> +The calling process has too many open files to create more.
> +.TP
> +.B ENAMETOOLONG
> +.I path
> +is longer than
> +.BR PATH_MAX .
> +.TP
> +.B ENFILE
> +The system has too many open files to create more.
> +.TP
> +.B ENOENT
> +A component of
> +.I path
> +does not exist, or is a dangling symbolic link.
> +.TP
> +.B ENOENT
> +.I path
> +is an empty string, but
> +.B AT_EMPTY_PATH
> +is not specified in
> +.IR flags .
> +.TP
> +.B ENOTDIR
> +A component of the path prefix of
> +.I path
> +is not a directory, or
> +.I path
> +is relative and
> +.I dirfd
> +is a file descriptor referring to a file other than a directory.
> +.TP
> +.B ENOSPC
> +The "anonymous" mount namespace
> +necessary to contain the
> +.B \%OPEN_TREE_CLONE
> +detached bind-mount mount object
> +could not be allocated,
> +as doing so would exceed
> +the configured per-user limit on
> +the number of mount namespaces in the current user namespace.
> +(See also
> +.BR namespaces (7).)
> +.TP
> +.B ENOMEM
> +The kernel could not allocate sufficient memory to complete the operation.
> +.TP
> +.B EPERM
> +.I flags
> +contains
> +.B \%OPEN_TREE_CLONE
> +but the calling process does not have the required
> +.B CAP_SYS_ADMIN
> +capability.
> +.SH STANDARDS
> +Linux.
> +.SH HISTORY
> +Linux 5.2.
> +.\" commit a07b20004793d8926f78d63eb5980559f7813404
> +.\" commit 400913252d09f9cfb8cce33daee43167921fc343
> +glibc 2.36.
> +.SH NOTES
> +.SS Mount propagation
> +The bind-mount mount objects created by
> +.BR open_tree ()
> +with
> +.B \%OPEN_TREE_CLONE
> +are not associated with
> +the mount namespace of the calling process.
> +Instead, each mount object is placed
> +in a newly allocated "anonymous" mount namespace
> +associated with the calling process.
> +.P
> +One of the side-effects of this is that
> +(unlike bind-mounts created with
> +.BR mount (2)),
> +mount propagation
> +(as described in
> +.BR mount_namespaces (7))
> +will not be applied to bind-mounts created by
> +.BR open_tree ()
> +until the bind-mount is attached with
> +.BR move_mount (2),
> +at which point the mount object
> +will be associated with the mount namespace
> +where it was attached
> +and mount propagation will resume.
> +Note that any mount propagation events that occurred
> +before the mount object was attached
> +will
> +.I not
> +be propagated to the mount object,
> +even after it is attached.
> +.SH EXAMPLES
> +The following examples show how
> +.BR open_tree ()
> +can be used in place of more traditional
> +.BR mount (2)
> +calls with
> +.BR MS_BIND .
> +.P
> +.in +4n
> +.EX
> +int srcfd = open_tree(AT_FDCWD, "/var", OPEN_TREE_CLONE);
> +move_mount(srcfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> +.EE
> +.in
> +.P
> +First,
> +a detached bind-mount mount object of
> +.I /var
> +is created
> +and associated with the file descriptor
> +.IR srcfd .
> +Then, the mount object is attached to
> +.I /mnt
> +using
> +.BR move_mount (2)
> +with
> +.B \%MOVE_MOUNT_F_EMPTY_PATH
> +to request that the detached mount object
> +associated with the file descriptor
> +.I srcfd
> +be moved (and thus attached) to
> +.IR /mnt .
> +.P
> +The above procedure is functionally equivalent to
> +the following mount operation using
> +.BR mount (2):
> +.P
> +.in +4n
> +.EX
> +mount("/var", "/mnt", NULL, MS_BIND, NULL);
> +.EE
> +.in
> +.P
> +.B \%OPEN_TREE_CLONE
> +can be combined with
> +.B \%AT_RECURSIVE
> +to create recursive detached bind-mount mount objects,
> +which in turn can be attached to mount points
> +to create recursive bind-mounts.
> +.P
> +.in +4n
> +.EX
> +int srcfd = open_tree(AT_FDCWD, "/var", OPEN_TREE_CLONE | AT_RECURSIVE);
> +move_mount(srcfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> +.EE
> +.in
> +.P
> +The above procedure is functionally equivalent to
> +the following mount operation using
> +.BR mount (2):
> +.P
> +.in +4n
> +.EX
> +mount("/var", "/mnt", NULL, MS_BIND | MS_REC, NULL);
> +.EE
> +.in
> +.P
> +One of the primary benefits of using
> +.BR open_tree ()
> +and
> +.BR move_mount (2)
> +over the traditional
> +.BR mount (2)
> +is that operating with
> +.IR dirfd -style
> +file descriptors is far easier and more intuitive.
> +.P
> +.in +4n
> +.EX
> +int srcfd = open_tree(100, "", AT_EMPTY_PATH | OPEN_TREE_CLONE);
> +move_mount(srcfd, "", 200, "foo", MOVE_MOUNT_F_EMPTY_PATH);
> +.EE
> +.in
> +.P
> +The above procedure is roughly equivalent to
> +the following mount operation using
> +.BR mount (2):
> +.P
> +.in +4n
> +.EX
> +mount("/proc/self/fd/100", "/proc/self/fd/200/foo", NULL, MS_BIND, NULL);
> +.EE
> +.in
> +.P
> +In addition, you can use the file descriptor returned by
> +.BR open_tree ()
> +as the
> +.I dirfd
> +argument to any "*at()" system calls:
> +.P
> +.in +4n
> +.EX
> +int dirfd, fd;
> +\&
> +dirfd = open_tree(AT_FDCWD, "/etc", OPEN_TREE_CLONE);
> +fd = openat(dirfd, "passwd", O_RDONLY);
> +fchmodat(dirfd, "shadow", 0000, 0);
> +close(dirfd);
> +close(fd);
> +/* The bind-mount is now destroyed. */
> +.EE
> +.in
> +.SH SEE ALSO
> +.BR fsconfig (2),
> +.BR fsmount (2),
> +.BR fsopen (2),
> +.BR fspick (2),
> +.BR mount (2),
> +.BR mount_setattr (2),
> +.BR move_mount (2),
> +.BR mount_namespaces (7)
> 
> -- 
> 2.51.0
> 
> 

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 08/10] man/man2/mount_setattr.2: mirror opening sentence from fsopen(2)
  2025-09-19  1:59 ` [PATCH v4 08/10] man/man2/mount_setattr.2: mirror opening sentence from fsopen(2) Aleksa Sarai
@ 2025-09-21  9:50   ` Alejandro Colomar
  2025-09-21 10:03     ` Aleksa Sarai
  0 siblings, 1 reply; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-21  9:50 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 1456 bytes --]

Hi Aleksa,

On Fri, Sep 19, 2025 at 11:59:49AM +1000, Aleksa Sarai wrote:
> All of the other new mount API docs have this lead-in sentence in order
> to make this set of APIs feel a little bit more cohesive.  Despite being
> a bit of a latecomer, mount_setattr(2) is definitely part of this family
> of APIs and so deserves the same treatment.
> 
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>

Thanks!  I've applied this patch.
<https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=7022531182883ed1db5d4c926506cd373e0795ee>
(Use port :80/)


Cheers,
Alex

> ---
>  man/man2/mount_setattr.2 | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2
> index 4b55f6d2e09d00d9bc4b3a085f310b1b459f34e8..b27db5b96665cfb0c387bf5b60776d45e0139956 100644
> --- a/man/man2/mount_setattr.2
> +++ b/man/man2/mount_setattr.2
> @@ -19,7 +19,11 @@ .SH SYNOPSIS
>  .SH DESCRIPTION
>  The
>  .BR mount_setattr ()
> -system call changes the mount properties of a mount or an entire mount tree.
> +system call is part of
> +the suite of file descriptor based mount facilities in Linux.
> +.P
> +.BR mount_setattr ()
> +changes the mount properties of a mount or an entire mount tree.
>  If
>  .I path
>  is relative,
> 
> -- 
> 2.51.0
> 

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 09/10] man/man2/open_tree{,_attr}.2: document new open_tree_attr() API
  2025-09-19  1:59 ` [PATCH v4 09/10] man/man2/open_tree{,_attr}.2: document new open_tree_attr() API Aleksa Sarai
@ 2025-09-21  9:55   ` Alejandro Colomar
  2025-09-24  6:31     ` Aleksa Sarai
  0 siblings, 1 reply; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-21  9:55 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 6625 bytes --]

Hi Aleksa,

On Fri, Sep 19, 2025 at 11:59:50AM +1000, Aleksa Sarai wrote:
> This is a new API added in Linux 6.15, and is effectively just a minor
> expansion of open_tree(2) in order to allow for MOUNT_ATTR_IDMAP to be
> changed for an existing ID-mapped mount.  glibc does not yet have a
> wrapper for this.
> 
> While working on this man-page, I discovered a bug in open_tree_attr(2)
> that accidentally permitted changing MOUNT_ATTR_IDMAP for extant
> detached ID-mapped mount objects.  This is definitely a bug, but there
> is no need to add this to BUGS because the patch to fix this has already
> been accepted (slated for 6.18, and will be backported to 6.15+).

Okay.

> 
> Cc: Christian Brauner <brauner@kernel.org>
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
>  man/man2/open_tree.2      | 140 ++++++++++++++++++++++++++++++++++++++++++++++
>  man/man2/open_tree_attr.2 |   1 +
>  2 files changed, 141 insertions(+)
> 
> diff --git a/man/man2/open_tree.2 b/man/man2/open_tree.2
> index 7f85df08b43c7b48a9d021dbbeb2c60092a2b2d4..60de4313a9d5be4ef3ff1217051f252506a2ade9 100644
> --- a/man/man2/open_tree.2
> +++ b/man/man2/open_tree.2
> @@ -15,7 +15,19 @@ .SH SYNOPSIS
>  .B #include <sys/mount.h>
>  .P
>  .BI "int open_tree(int " dirfd ", const char *" path ", unsigned int " flags );
> +.P
> +.BR "#include <sys/syscall.h>" "    /* Definition of " SYS_* " constants */"
> +.P
> +.BI "int syscall(SYS_open_tree_attr, int " dirfd ", const char *" path ,
> +.BI "            unsigned int " flags ", struct mount_attr *_Nullable " attr ", \
> +size_t " size );

Do we maybe want to move this to its own separate page?

The separate page could perfectly contain the same exact text you're
adding here; you don't need to repeat open_tree() descriptions.

In general, I feel that while this improves discoverability of related
functions, it produces more complex pages.


Cheers,
Alex

>  .fi
> +.P
> +.IR Note :
> +glibc provides no wrapper for
> +.BR open_tree_attr (),
> +necessitating the use of
> +.BR syscall (2).
>  .SH DESCRIPTION
>  The
>  .BR open_tree ()
> @@ -246,6 +258,129 @@ .SH DESCRIPTION
>  as a detached mount object.
>  This flag is only permitted in conjunction with
>  .BR \%OPEN_TREE_CLONE .
> +.SS open_tree_attr()
> +The
> +.BR open_tree_attr ()
> +system call operates in exactly the same way as
> +.BR open_tree (),
> +except for the differences described here.
> +.P
> +After performing the same operation as with
> +.BR open_tree (),
> +.BR open_tree_attr ()
> +will apply the mount attribute changes described in
> +.I attr
> +to the file descriptor before it is returned.
> +(See
> +.BR mount_attr (2type)
> +for a description of the
> +.I mount_attr
> +structure.
> +As described in
> +.BR mount_setattr (2),
> +.I size
> +must be set to
> +.I sizeof(struct mount_attr)
> +in order to support future extensions.)
> +If
> +.I attr
> +is NULL,
> +or has
> +.IR attr.attr_clr ,
> +.IR attr.attr_set ,
> +and
> +.I attr.propagation
> +all set to zero,
> +then
> +.BR open_tree_attr ()
> +has identical behaviour to
> +.BR open_tree ().
> +.P
> +The application of
> +.I attr
> +to the resultant file descriptor
> +has identical semantics to
> +.BR mount_setattr (2),
> +except for the following extensions and general caveats:
> +.IP \[bu] 3
> +Unlike
> +.BR mount_setattr (2)
> +called with a regular
> +.B OPEN_TREE_CLONE
> +detached mount object from
> +.BR open_tree (),
> +.BR open_tree_attr ()
> +can specify a different setting for
> +.B \%MOUNT_ATTR_IDMAP
> +to the original mount object cloned with
> +.BR OPEN_TREE_CLONE .
> +.IP
> +Adding
> +.B \%MOUNT_ATTR_IDMAP
> +to
> +.I attr.attr_clr
> +will disable ID-mapping for the new mount object;
> +adding
> +.B \%MOUNT_ATTR_IDMAP
> +to
> +.I attr.attr_set
> +will configure the mount object to have the ID-mapping defined by
> +the user namespace referenced by the file descriptor
> +.IR attr.userns_fd .
> +(The semantics of which are identical to when
> +.BR mount_setattr (2)
> +is used to configure
> +.BR \%MOUNT_ATTR_IDMAP .)
> +.IP
> +Changing or removing the mapping
> +of an ID-mapped mount is only permitted
> +if a new detached mount object is being created with
> +.I flags
> +including
> +.BR \%OPEN_TREE_CLONE .
> +.\" Aleksa Sarai
> +.\"  At time of writing, this is not actually true because of a bug where
> +.\"  open_tree_attr() would accidentally permit changing MOUNT_ATTR_IDMAP for
> +.\"  existing detached mount objects without setting OPEN_TREE_CLONE, but a
> +.\"  patch to fix it has been slated for 6.18 and will be backported to 6.15+.
> +.\"  <https://lore.kernel.org/r/20250808-open_tree_attr-bugfix-idmap-v1-0-0ec7bc05646c@cyphar.com/>
> +.IP \[bu]
> +If
> +.I flags
> +contains
> +.BR \%AT_RECURSIVE ,
> +then the attributes described in
> +.I attr
> +are applied recursively
> +(just as when
> +.BR mount_setattr (2)
> +is called with
> +.BR \%AT_RECURSIVE ).
> +However, this applies in addition to the
> +.BR open_tree ()-specific
> +behaviour regarding
> +.BR \%AT_RECURSIVE ,
> +and thus
> +.I flags
> +must also contain
> +.BR \%OPEN_TREE_CLONE .
> +.P
> +Note that if
> +.I flags
> +does not contain
> +.BR \%OPEN_TREE_CLONE ,
> +.BR open_tree_attr ()
> +will attempt to modify the mount attributes of
> +the mount object attached at
> +the path described by
> +.I dirfd
> +and
> +.IR path .
> +As with
> +.BR mount_setattr (2),
> +if said path is not a mount point,
> +.BR open_tree_attr ()
> +will return an error.
>  .SH RETURN VALUE
>  On success, a new file descriptor is returned.
>  On error, \-1 is returned, and
> @@ -339,10 +474,15 @@ .SH ERRORS
>  .SH STANDARDS
>  Linux.
>  .SH HISTORY
> +.SS open_tree()
>  Linux 5.2.
>  .\" commit a07b20004793d8926f78d63eb5980559f7813404
>  .\" commit 400913252d09f9cfb8cce33daee43167921fc343
>  glibc 2.36.
> +.SS open_tree_attr()
> +Linux 6.15.
> +.\" commit c4a16820d90199409c9bf01c4f794e1e9e8d8fd8
> +.\" commit 7a54947e727b6df840780a66c970395ed9734ebe
>  .SH NOTES
>  .SS Mount propagation
>  The bind-mount mount objects created by
> diff --git a/man/man2/open_tree_attr.2 b/man/man2/open_tree_attr.2
> new file mode 100644
> index 0000000000000000000000000000000000000000..e57269bbd269bcce0b0a974425644ba75e379f2f
> --- /dev/null
> +++ b/man/man2/open_tree_attr.2
> @@ -0,0 +1 @@
> +.so man2/open_tree.2
> 
> -- 
> 2.51.0
> 
> 

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 00/10] man2: document "new" mount API
  2025-09-21  2:43 ` [PATCH v4 00/10] man2: document "new" mount API Askar Safin
@ 2025-09-21  9:58   ` Aleksa Sarai
  2025-09-24 11:11     ` Aleksa Sarai
  0 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-21  9:58 UTC (permalink / raw)
  To: Askar Safin
  Cc: alx, brauner, dhowells, g.branden.robinson, jack, linux-api,
	linux-fsdevel, linux-kernel, linux-man, mtk.manpages, safinaskar,
	viro

[-- Attachment #1: Type: text/plain, Size: 1821 bytes --]

On 2025-09-21, Askar Safin <safinaskar@gmail.com> wrote:
> Aleksa, thank you! Don't give up. We all need these manpages.
> 
> I see you didn't address some my previous notes.
> 
> * move_mount(2) still says "Mount objects cannot be attached beneath the filesystem root".
> I suggest saying "root directory" or "root" or "root directory of the process" or just "/"
> instead. But you may keep this phrase as is, of course.
> 
> * Docs for FSPICK_NO_AUTOMOUNT in fspick(2) are still wrong. They say that FSPICK_NO_AUTOMOUNT
> affects all components of path. Similar thing applies to mount_setattr(2) and move_mount(2)

Sorry, I last read through your review comments a month ago, I must've
forgotten to make these changes back then. I'll include them in v5.
(It seems I remembered to change the open_tree(2) automount one but
forgot to do it for the others, oops!)

> * open_tree(2) still says:
> > If flags does not contain OPEN_TREE_CLONE, open_tree() returns a file descriptor
> > that is exactly equivalent to one produced by openat(2) when called with the same dirfd and path.
> 
> This is not true if automounts are involved. I suggest adding "modulo automounts". But you may
> keep everything, of course.

Hmmm. As we discussed last time, this sentence is more intended to
indicate that the file descriptor is just a regular open file (with no
dissolve_on_fput() + FMODE_NEED_UNMOUNT magic) rather than the exact
behaviour you get with regards to path lookup.

I would honestly prefer to remove "when called with the same dirfd and
path" rather than add caveats, but I think it makes the sentence less
readable... I'll think about it and try to fix this wording up somehow
for v5.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction
  2025-09-19  1:59 ` [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction Aleksa Sarai
  2025-09-19 12:23   ` Aleksa Sarai
@ 2025-09-21  9:59   ` Alejandro Colomar
  2025-09-22  1:06   ` Askar Safin
  2 siblings, 0 replies; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-21  9:59 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 3498 bytes --]

Hi Aleksa,

On Fri, Sep 19, 2025 at 11:59:51AM +1000, Aleksa Sarai wrote:
> This was not particularly well documented in mount(8) nor mount(2), and
> since this is a fairly notable aspect of the new mount API, we should
> probably add some words about it.
> 
> Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> ---
>  man/man2/fsconfig.2      | 12 ++++++++++++
>  man/man2/mount_setattr.2 | 40 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 52 insertions(+)
> 
> diff --git a/man/man2/fsconfig.2 b/man/man2/fsconfig.2
> index 5a18e08c700ac93aa22c341b4134944ee3c38d0b..d827a7b96e08284fb025f94c3348a4acc4571b7d 100644
> --- a/man/man2/fsconfig.2
> +++ b/man/man2/fsconfig.2
> @@ -579,6 +579,18 @@ .SS Generic filesystem parameters
>  Linux Security Modules (LSMs)
>  are also generic with respect to the underlying filesystem.
>  See the documentation for the LSM you wish to configure for more details.
> +.SS Mount attributes and filesystem parameters
> +Some filesystem parameters
> +(traditionally associated with
> +.BR mount (8)-style
> +options)
> +have a sibling mount attribute
> +with superficially similar user-facing behaviour.
> +.P
> +For a description of the distinction between
> +mount attributes and filesystem parameters,
> +see the "Mount attributes and filesystem parameters" subsection of
> +.BR mount_setattr (2).
>  .SH CAVEATS
>  .SS Filesystem parameter types
>  As a result of
> diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2
> index b27db5b96665cfb0c387bf5b60776d45e0139956..f7d0b96fddf97698e36cab020f1d695783143025 100644
> --- a/man/man2/mount_setattr.2
> +++ b/man/man2/mount_setattr.2
> @@ -790,6 +790,46 @@ .SS ID-mapped mounts
>  .BR chown (2)
>  system call changes the ownership globally and permanently.
>  .\"
> +.SS Mount attributes and filesystem parameters
> +Some mount attributes
> +(traditionally associated with
> +.BR mount (8)-style
> +options)
> +have a sibling mount attribute
> +with superficially similar user-facing behaviour.
> +For example, the
> +.I -o ro

As said in another page, this should be

	.I \-o\~ro

> +option to
> +.BR mount (8)
> +can refer to the
> +"read-only" filesystem parameter,
> +or the "read-only" mount attribute.
> +Both of these result in mount objects becoming read-only,
> +but they do have different behaviour.
> +.P
> +The distinction between these two kinds of option is that
> +mount object attributes are applied per-mount-object
> +(allowing different mount objects
> +derived from a given filesystem instance
> +to have different attributes),
> +while filesystem instance parameters
> +("superblock flags" in kernel-developer parlance)
> +apply to all mount objects
> +derived from the same filesystem instance.
> +.P
> +When using
> +.BR mount (2),
> +the line between these two types of mount options was blurred.
> +However, with
> +.BR mount_setattr ()
> +and
> +.BR fsconfig (2),
> +the distinction is made much clearer.
> +Mount attributes are configured with
> +.BR mount_setattr (),
> +while filesystem parameters can be configured using
> +.BR fsconfig (2).
> +.\"

LGTM.

I've finished with the review of the patch set.  It's quite good.  :)


Have a lovely day!
Alex

>  .SS Extensibility
>  In order to allow for future extensibility,
>  .BR mount_setattr ()
> 
> -- 
> 2.51.0
> 
> 

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 08/10] man/man2/mount_setattr.2: mirror opening sentence from fsopen(2)
  2025-09-21  9:50   ` Alejandro Colomar
@ 2025-09-21 10:03     ` Aleksa Sarai
  2025-09-21 10:19       ` Alejandro Colomar
  0 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-21 10:03 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 1802 bytes --]

On 2025-09-21, Alejandro Colomar <alx@kernel.org> wrote:
> Hi Aleksa,
> 
> On Fri, Sep 19, 2025 at 11:59:49AM +1000, Aleksa Sarai wrote:
> > All of the other new mount API docs have this lead-in sentence in order
> > to make this set of APIs feel a little bit more cohesive.  Despite being
> > a bit of a latecomer, mount_setattr(2) is definitely part of this family
> > of APIs and so deserves the same treatment.
> > 
> > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> 
> Thanks!  I've applied this patch.
> <https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=7022531182883ed1db5d4c926506cd373e0795ee>
> (Use port :80/)

Ah, you forgot to switch to "file-descriptor-based" like you suggested
in patch 1. ;)

> 
> Cheers,
> Alex
> 
> > ---
> >  man/man2/mount_setattr.2 | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2
> > index 4b55f6d2e09d00d9bc4b3a085f310b1b459f34e8..b27db5b96665cfb0c387bf5b60776d45e0139956 100644
> > --- a/man/man2/mount_setattr.2
> > +++ b/man/man2/mount_setattr.2
> > @@ -19,7 +19,11 @@ .SH SYNOPSIS
> >  .SH DESCRIPTION
> >  The
> >  .BR mount_setattr ()
> > -system call changes the mount properties of a mount or an entire mount tree.
> > +system call is part of
> > +the suite of file descriptor based mount facilities in Linux.
> > +.P
> > +.BR mount_setattr ()
> > +changes the mount properties of a mount or an entire mount tree.
> >  If
> >  .I path
> >  is relative,
> > 
> > -- 
> > 2.51.0
> > 
> 
> -- 
> <https://www.alejandro-colomar.es>
> Use port 80 (that is, <...:80/>).



-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 08/10] man/man2/mount_setattr.2: mirror opening sentence from fsopen(2)
  2025-09-21 10:03     ` Aleksa Sarai
@ 2025-09-21 10:19       ` Alejandro Colomar
  0 siblings, 0 replies; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-21 10:19 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 2175 bytes --]

On Sun, Sep 21, 2025 at 08:03:08PM +1000, Aleksa Sarai wrote:
> On 2025-09-21, Alejandro Colomar <alx@kernel.org> wrote:
> > Hi Aleksa,
> > 
> > On Fri, Sep 19, 2025 at 11:59:49AM +1000, Aleksa Sarai wrote:
> > > All of the other new mount API docs have this lead-in sentence in order
> > > to make this set of APIs feel a little bit more cohesive.  Despite being
> > > a bit of a latecomer, mount_setattr(2) is definitely part of this family
> > > of APIs and so deserves the same treatment.
> > > 
> > > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> > 
> > Thanks!  I've applied this patch.
> > <https://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git/commit/?h=contrib&id=7022531182883ed1db5d4c926506cd373e0795ee>
> > (Use port :80/)
> 
> Ah, you forgot to switch to "file-descriptor-based" like you suggested
> in patch 1. ;)

Oh, thanks for the reminder!  :-)

I've amended it, and now pushed to <kernel.org>.


Cheers,
Alex

> 
> > 
> > Cheers,
> > Alex
> > 
> > > ---
> > >  man/man2/mount_setattr.2 | 6 +++++-
> > >  1 file changed, 5 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2
> > > index 4b55f6d2e09d00d9bc4b3a085f310b1b459f34e8..b27db5b96665cfb0c387bf5b60776d45e0139956 100644
> > > --- a/man/man2/mount_setattr.2
> > > +++ b/man/man2/mount_setattr.2
> > > @@ -19,7 +19,11 @@ .SH SYNOPSIS
> > >  .SH DESCRIPTION
> > >  The
> > >  .BR mount_setattr ()
> > > -system call changes the mount properties of a mount or an entire mount tree.
> > > +system call is part of
> > > +the suite of file descriptor based mount facilities in Linux.
> > > +.P
> > > +.BR mount_setattr ()
> > > +changes the mount properties of a mount or an entire mount tree.
> > >  If
> > >  .I path
> > >  is relative,
> > > 
> > > -- 
> > > 2.51.0
> > > 
> > 
> > -- 
> > <https://www.alejandro-colomar.es>
> > Use port 80 (that is, <...:80/>).
> 
> 
> 
> -- 
> Aleksa Sarai
> Senior Software Engineer (Containers)
> SUSE Linux GmbH
> https://www.cyphar.com/



-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 03/10] man/man2/fspick.2: document "new" mount API
  2025-09-21  8:53   ` Alejandro Colomar
@ 2025-09-21 14:55     ` Aleksa Sarai
  2025-09-21 15:17       ` Alejandro Colomar
  0 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-21 14:55 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 11709 bytes --]

On 2025-09-21, Alejandro Colomar <alx@kernel.org> wrote:
> Hi Aleksa,
> 
> On Fri, Sep 19, 2025 at 11:59:44AM +1000, Aleksa Sarai wrote:
> > This is loosely based on the original documentation written by David
> > Howells and later maintained by Christian Brauner, but has been
> > rewritten to be more from a user perspective (as well as fixing a few
> > critical mistakes).
> > 
> > Co-authored-by: David Howells <dhowells@redhat.com>
> > Signed-off-by: David Howells <dhowells@redhat.com>
> > Co-authored-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> > ---
> >  man/man2/fspick.2 | 342 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 342 insertions(+)
> > 
> > diff --git a/man/man2/fspick.2 b/man/man2/fspick.2
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..1f87293f44658adeb7ab7cffebcac3174888f040
> > --- /dev/null
> > +++ b/man/man2/fspick.2
> > @@ -0,0 +1,342 @@
> > +.\" Copyright, the authors of the Linux man-pages project
> > +.\"
> > +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> > +.\"
> > +.TH fspick 2 (date) "Linux man-pages (unreleased)"
> > +.SH NAME
> > +fspick \- select filesystem for reconfiguration
> > +.SH LIBRARY
> > +Standard C library
> > +.RI ( libc ,\~ \-lc )
> > +.SH SYNOPSIS
> > +.nf
> > +.BR "#include <fcntl.h>" "          /* Definition of " AT_* " constants */"
> > +.B #include <sys/mount.h>
> > +.P
> > +.BI "int fspick(int " dirfd ", const char *" path ", unsigned int " flags );
> > +.fi
> > +.SH DESCRIPTION
> > +The
> > +.BR fspick ()
> > +system call is part of
> > +the suite of file descriptor based mount facilities in Linux.
> > +.P
> > +.BR fspick()
> > +creates a new filesystem configuration context
> > +for the extant filesystem instance
> > +associated with the path described by
> > +.IR dirfd
> > +and
> > +.IR path ,
> > +places it into reconfiguration mode
> > +(similar to
> > +.BR mount (8)
> > +with the
> > +.I -o remount
> > +option).
> > +A new file descriptor
> > +associated with the filesystem configuration context
> > +is then returned.
> > +The calling process must have the
> > +.BR CAP_SYS_ADMIN
> 
> This should use '.B. (Bold).  BR means alternating Bold and Roman, but
> this only has one token, so it can't alternate.
> 
> If you run `make -R build-catman-troff`, this will trigger a diagnostic:
> 
> 	an.tmac: <page>:<line>: style: .BR expects at least 2 arguments, got 1

Grr, I thought I fixed all of these. I must've changed it in a rework
and forgot to fix it.

> > +capability in order to create a new filesystem configuration context.
> > +.P
> > +The resultant file descriptor can be used with
> > +.BR fsconfig (2)
> > +to specify the desired set of changes to
> > +filesystem parameters of the filesystem instance.
> > +Once the desired set of changes have been configured,
> > +the changes can be effectuated by calling
> > +.BR fsconfig (2)
> > +with the
> > +.B \%FSCONFIG_CMD_RECONFIGURE
> > +command.
> > +Please note that\[em]in contrast to
> > +the behaviour of
> > +.B MS_REMOUNT
> > +with
> > +.BR mount (2)\[em] fspick ()
> 
> Only have one important keyword per macro call.  In this case, I prefer
> em dashes to only be attached to one side, as if they were parentheses,
> so we don't need any tricks:
> 
> 	Please note that
> 	\[em]in contrast to
> 	...
> 	.BR mount (2)\[em]
> 	.BR fspick ()

Based on my testing, doing it that way adds whitespace to one side of
the em dash and typographically em dashes should not have whitespace on
either side AFAIK. If there is a way to get the layout right without
breaking the "one macro per line" rule, I'd love to know! :D

> > +instantiates the filesystem configuration context
> > +with a copy of
> > +the extant filesystem's filesystem parameters,
> > +meaning that a subsequent
> > +.B \%FSCONFIG_CMD_RECONFIGURE
> > +operation
> > +will only update filesystem parameters
> > +explicitly modified with
> > +.BR fsconfig (2).
> > +.P
> > +As with "*at()" system calls,
> > +.BR fspick ()
> > +uses the
> > +.I dirfd
> > +argument in conjunction with the
> > +.I path
> > +argument to determine the path to operate on, as follows:
> > +.IP \[bu] 3
> > +If the pathname given in
> > +.I path
> > +is absolute, then
> > +.I dirfd
> > +is ignored.
> > +.IP \[bu]
> > +If the pathname given in
> > +.I path
> > +is relative and
> > +.I dirfd
> > +is the special value
> > +.BR \%AT_FDCWD ,
> > +then
> > +.I path
> > +is interpreted relative to
> > +the current working directory
> > +of the calling process (like
> > +.BR open (2)).
> > +.IP \[bu]
> > +If the pathname given in
> > +.I path
> > +is relative,
> > +then it is interpreted relative to
> > +the directory referred to by the file descriptor
> > +.I dirfd
> > +(rather than relative to
> > +the current working directory
> > +of the calling process,
> > +as is done by
> > +.BR open (2)
> > +for a relative pathname).
> > +In this case,
> > +.I dirfd
> > +must be a directory
> > +that was opened for reading
> > +.RB ( O_RDONLY )
> > +or using the
> > +.B O_PATH
> > +flag.
> > +.IP \[bu]
> > +If
> > +.I path
> > +is an empty string,
> > +and
> > +.I flags
> > +contains
> > +.BR \%FSPICK_EMPTY_PATH ,
> > +then the file descriptor
> > +.I dirfd
> > +is operated on directly.
> > +In this case,
> > +.I dirfd
> > +may refer to any type of file,
> > +not just a directory.
> > +.P
> > +See
> > +.BR openat (2)
> > +for an explanation of why the
> > +.I dirfd
> > +argument is useful.
> > +.P
> > +.I flags
> > +can be used to control aspects of how
> > +.I path
> > +is resolved and
> > +properties of the returned file descriptor.
> > +A value for
> > +.I flags
> > +is constructed by bitwise ORing
> > +zero or more of the following constants:
> > +.RS
> > +.TP
> > +.B FSPICK_CLOEXEC
> > +Set the close-on-exec
> > +.RB ( FD_CLOEXEC )
> > +flag on the new file descriptor.
> > +See the description of the
> > +.B O_CLOEXEC
> > +flag in
> > +.BR open (2)
> > +for reasons why this may be useful.
> > +.TP
> > +.B FSPICK_EMPTY_PATH
> > +If
> > +.I path
> > +is an empty string,
> > +operate on the file referred to by
> > +.I dirfd
> > +(which may have been obtained from
> > +.BR open (2),
> > +.BR fsmount (2),
> > +or
> > +.BR open_tree (2)).
> > +In this case,
> > +.I dirfd
> > +may refer to any type of file,
> > +not just a directory.
> > +If
> > +.I dirfd
> > +is
> > +.BR \%AT_FDCWD ,
> > +.BR fspick ()
> > +will operate on the current working directory
> > +of the calling process.
> > +.TP
> > +.B FSPICK_SYMLINK_NOFOLLOW
> > +Do not follow symbolic links
> > +in the terminal component of
> > +.IR path .
> > +If
> > +.I path
> > +references a symbolic link,
> > +the returned filesystem context will reference
> > +the filesystem that the symbolic link itself resides on.
> > +.TP
> > +.B FSPICK_NO_AUTOMOUNT
> > +Do not automount any automount points encountered
> > +while resolving
> > +.IR path .
> > +This allows you to reconfigure an automount point,
> > +rather than the location that would be mounted.
> > +This flag has no effect if
> > +the automount point has already been mounted over.
> 
> I'll amend other similar issues if I find them, but in general, I'd put
> the 'if' in the next line, as it is more tied to that part of the
> sentence (think for example that if you reversed the sentence to say
> "if ..., then ...", you'd move the 'if' with what follows it.  You don't
> need to search for all of these and fix them; just keep it in mind for
> next time.  In general I like the break points you used.
> 
> 
> Have a lovely day!
> Alex
> 
> > +.RE
> > +.P
> > +As with filesystem contexts created with
> > +.BR fsopen (2),
> > +the file descriptor returned by
> > +.BR fspick ()
> > +may be queried for message strings at any time by calling
> > +.BR read (2)
> > +on the file descriptor.
> > +(See the "Message retrieval interface" subsection in
> > +.BR fsopen (2)
> > +for more details on the message format.)
> > +.SH RETURN VALUE
> > +On success, a new file descriptor is returned.
> > +On error, \-1 is returned, and
> > +.I errno
> > +is set to indicate the error.
> > +.SH ERRORS
> > +.TP
> > +.B EACCES
> > +Search permission is denied
> > +for one of the directories
> > +in the path prefix of
> > +.IR path .
> > +(See also
> > +.BR path_resolution (7).)
> > +.TP
> > +.B EBADF
> > +.I path
> > +is relative but
> > +.I dirfd
> > +is neither
> > +.B \%AT_FDCWD
> > +nor a valid file descriptor.
> > +.TP
> > +.B EFAULT
> > +.I path
> > +is NULL
> > +or a pointer to a location
> > +outside the calling process's accessible address space.
> > +.TP
> > +.B EINVAL
> > +Invalid flag specified in
> > +.IR flags .
> > +.TP
> > +.B ELOOP
> > +Too many symbolic links encountered when resolving
> > +.IR path .
> > +.TP
> > +.B EMFILE
> > +The calling process has too many open files to create more.
> > +.TP
> > +.B ENAMETOOLONG
> > +.I path
> > +is longer than
> > +.BR PATH_MAX .
> > +.TP
> > +.B ENFILE
> > +The system has too many open files to create more.
> > +.TP
> > +.B ENOENT
> > +A component of
> > +.I path
> > +does not exist,
> > +or is a dangling symbolic link.
> > +.TP
> > +.B ENOENT
> > +.I path
> > +is an empty string, but
> > +.B \%FSPICK_EMPTY_PATH
> > +is not specified in
> > +.IR flags .
> > +.TP
> > +.B ENOTDIR
> > +A component of the path prefix of
> > +.I path
> > +is not a directory;
> > +or
> > +.I path
> > +is relative and
> > +.I dirfd
> > +is a file descriptor referring to a file other than a directory.
> > +.TP
> > +.B ENOMEM
> > +The kernel could not allocate sufficient memory to complete the operation.
> > +.TP
> > +.B EPERM
> > +The calling process does not have the required
> > +.B \%CAP_SYS_ADMIN
> > +capability.
> > +.SH STANDARDS
> > +Linux.
> > +.SH HISTORY
> > +Linux 5.2.
> > +.\" commit cf3cba4a429be43e5527a3f78859b1bfd9ebc5fb
> > +.\" commit 400913252d09f9cfb8cce33daee43167921fc343
> > +glibc 2.36.
> > +.SH EXAMPLES
> > +The following example sets the read-only flag
> > +on the filesystem instance referenced by
> > +the mount object attached at
> > +.IR /tmp .
> > +.P
> > +.in +4n
> > +.EX
> > +int fsfd = fspick(AT_FDCWD, "/tmp", FSPICK_CLOEXEC);
> > +fsconfig(fsfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
> > +fsconfig(fsfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
> > +.EE
> > +.in
> > +.P
> > +The above procedure is roughly equivalent to
> > +the following mount operation using
> > +.BR mount (2):
> > +.P
> > +.in +4n
> > +.EX
> > +mount(NULL, "/tmp", NULL, MS_REMOUNT | MS_RDONLY, NULL);
> > +.EE
> > +.in
> > +.P
> > +With the notable caveat that
> > +in this example,
> > +.BR mount (2)
> > +will clear all other filesystem parameters
> > +(such as
> > +.B MS_NOSUID
> > +or
> > +.BR MS_NOEXEC );
> > +.BR fsconfig (2)
> > +will only modify the
> > +.I ro
> > +parameter.
> > +.SH SEE ALSO
> > +.BR fsconfig (2),
> > +.BR fsmount (2),
> > +.BR fsopen (2),
> > +.BR mount (2),
> > +.BR mount_setattr (2),
> > +.BR move_mount (2),
> > +.BR open_tree (2),
> > +.BR mount_namespaces (7)
> > +
> > 
> > -- 
> > 2.51.0
> > 
> 
> -- 
> <https://www.alejandro-colomar.es>
> Use port 80 (that is, <...:80/>).



-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 03/10] man/man2/fspick.2: document "new" mount API
  2025-09-21 14:55     ` Aleksa Sarai
@ 2025-09-21 15:17       ` Alejandro Colomar
  0 siblings, 0 replies; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-21 15:17 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 2071 bytes --]

Hi Aleksa,

On Mon, Sep 22, 2025 at 12:55:13AM +1000, Aleksa Sarai wrote:
> > This should use '.B. (Bold).  BR means alternating Bold and Roman, but
> > this only has one token, so it can't alternate.
> > 
> > If you run `make -R build-catman-troff`, this will trigger a diagnostic:
> > 
> > 	an.tmac: <page>:<line>: style: .BR expects at least 2 arguments, got 1
> 
> Grr, I thought I fixed all of these. I must've changed it in a rework
> and forgot to fix it.

No problem; mistakes happen.  :)

> > > +Please note that\[em]in contrast to
> > > +the behaviour of
> > > +.B MS_REMOUNT
> > > +with
> > > +.BR mount (2)\[em] fspick ()
> > 
> > Only have one important keyword per macro call.  In this case, I prefer
> > em dashes to only be attached to one side, as if they were parentheses,
> > so we don't need any tricks:
> > 
> > 	Please note that
> > 	\[em]in contrast to
> > 	...
> > 	.BR mount (2)\[em]
> > 	.BR fspick ()
> 
> Based on my testing, doing it that way adds whitespace to one side of
> the em dash

You're correct; this adds whitespace on one side of the em dash.

> and typographically em dashes should not have whitespace on
> either side AFAIK.

This rule differs for different style guides, and different languages.
In Spanish, the most common style is having spaces as if the dashes
were parentheses; very much in a logical style, like quotes not having
extraneous punctuation inside them.

I very much prefer the Spanish conventions, and dislike the more common
used conventions for English.  I don't know if Branden can illustrate us
with some history about em dashes.

> If there is a way to get the layout right without
> breaking the "one macro per line" rule, I'd love to know! :D

There's a way.  I'll show it just for your curiosity.  :D


	.BR mount (2)\[em]\c
	.BR fspick ()

(I hope it works, because I haven't tested it.  Accidental typos might
 break my untested examples.)  :)


Cheers,
Alex

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 05/10] man/man2/fsmount.2: document "new" mount API
  2025-09-19  1:59 ` [PATCH v4 05/10] man/man2/fsmount.2: " Aleksa Sarai
@ 2025-09-21 23:08   ` Askar Safin
  2025-09-22  1:10     ` Askar Safin
  0 siblings, 1 reply; 43+ messages in thread
From: Askar Safin @ 2025-09-21 23:08 UTC (permalink / raw)
  To: cyphar
  Cc: alx, brauner, dhowells, g.branden.robinson, jack, linux-api,
	linux-fsdevel, linux-kernel, linux-man, mtk.manpages, safinaskar,
	viro

> Note that the unmount operation on close(2) is lazy—akin to calling umount2(2) with MOUNT_DETACH

MNT_DETACH, not MOUNT_DETACH

-- 
Askar Safin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 03/10] man/man2/fspick.2: document "new" mount API
  2025-09-19  1:59 ` [PATCH v4 03/10] man/man2/fspick.2: " Aleksa Sarai
  2025-09-21  8:53   ` Alejandro Colomar
@ 2025-09-22  0:25   ` Askar Safin
  1 sibling, 0 replies; 43+ messages in thread
From: Askar Safin @ 2025-09-22  0:25 UTC (permalink / raw)
  To: cyphar
  Cc: alx, brauner, dhowells, g.branden.robinson, jack, linux-api,
	linux-fsdevel, linux-kernel, linux-man, mtk.manpages, safinaskar,
	viro

> With the notable caveat that in this example, mount(2) will clear all other filesystem parameters (such as MS_NOSUID or MS_NOEXEC); fsconfig(2) will only modify the ro parameter.

MS_NOSUID and MS_NOEXEC are not filesystem parameters. They can be set per-mount, but not
per-filesystem. Here is list of all filesystem-agnostic per-superblock parameters:

https://elixir.bootlin.com/linux/v6.17-rc6/source/fs/namespace.c#L4103

Note that these SB_* constants are equal to corresponding MS_* constants.

As you can see, there is no NOSUID and NOEXEC in that list.

Also, SB_NOSUID does exist:
https://elixir.bootlin.com/linux/v6.17-rc6/source/include/linux/fs.h#L1240
.

So, it seems that "NOSUID superblock" does exist as a concept. But, thanks to
code in path_mount (provided above) user cannot (in filesystem-agnostic way)
make given superblock NOSUID.

So, from user point of view, NOSUID and NOEXEC are not filesystem parameters.

If you need some example of filesystem parameter, I suggest MS_SYNCHRONOUS,
I used it here:
https://lore.kernel.org/all/198d1f2e189.11dbac16b2998.3847935512688537521@zohomail.com/

-- 
Askar Safin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction
  2025-09-19  1:59 ` [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction Aleksa Sarai
  2025-09-19 12:23   ` Aleksa Sarai
  2025-09-21  9:59   ` Alejandro Colomar
@ 2025-09-22  1:06   ` Askar Safin
  2 siblings, 0 replies; 43+ messages in thread
From: Askar Safin @ 2025-09-22  1:06 UTC (permalink / raw)
  To: cyphar
  Cc: alx, brauner, dhowells, g.branden.robinson, jack, linux-api,
	linux-fsdevel, linux-kernel, linux-man, mtk.manpages, safinaskar,
	viro

> Some mount attributes (traditionally associated with mount(8)-style options) have a sibling mount attribute with superficially similar user-facing behaviour

"Some mount attributes... have a sibling mount attribute"

Something is wrong here.

-- 
Askar Safin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 05/10] man/man2/fsmount.2: document "new" mount API
  2025-09-21 23:08   ` Askar Safin
@ 2025-09-22  1:10     ` Askar Safin
  0 siblings, 0 replies; 43+ messages in thread
From: Askar Safin @ 2025-09-22  1:10 UTC (permalink / raw)
  To: safinaskar
  Cc: alx, brauner, cyphar, dhowells, g.branden.robinson, jack,
	linux-api, linux-fsdevel, linux-kernel, linux-man, mtk.manpages,
	safinaskar, viro

> MNT_DETACH, not MOUNT_DETACH

Same for open_tree and open_tree_attr

-- 
Askar Safin

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 07/10] man/man2/open_tree.2: document "new" mount API
  2025-09-21  9:47   ` Alejandro Colomar
@ 2025-09-22 10:09     ` Aleksa Sarai
  2025-09-22 13:22       ` Alejandro Colomar
  2025-09-30 12:32     ` G. Branden Robinson
  1 sibling, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-22 10:09 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 17051 bytes --]

On 2025-09-21, Alejandro Colomar <alx@kernel.org> wrote:
> Hi Aleksa,
> 
> On Fri, Sep 19, 2025 at 11:59:48AM +1000, Aleksa Sarai wrote:
> > This is loosely based on the original documentation written by David
> > Howells and later maintained by Christian Brauner, but has been
> > rewritten to be more from a user perspective (as well as fixing a few
> > critical mistakes).
> > 
> > Co-authored-by: David Howells <dhowells@redhat.com>
> > Signed-off-by: David Howells <dhowells@redhat.com>
> > Co-authored-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Christian Brauner <brauner@kernel.org>
> > Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
> > ---
> >  man/man2/open_tree.2 | 498 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 498 insertions(+)
> > 
> > diff --git a/man/man2/open_tree.2 b/man/man2/open_tree.2
> > new file mode 100644
> > index 0000000000000000000000000000000000000000..7f85df08b43c7b48a9d021dbbeb2c60092a2b2d4
> > --- /dev/null
> > +++ b/man/man2/open_tree.2
> > @@ -0,0 +1,498 @@
> > +.\" Copyright, the authors of the Linux man-pages project
> > +.\"
> > +.\" SPDX-License-Identifier: Linux-man-pages-copyleft
> > +.\"
> > +.TH open_tree 2 (date) "Linux man-pages (unreleased)"
> > +.SH NAME
> > +open_tree \- open path or create detached mount object and attach to fd
> > +.SH LIBRARY
> > +Standard C library
> > +.RI ( libc ,\~ \-lc )
> > +.SH SYNOPSIS
> > +.nf
> > +.BR "#define _GNU_SOURCE         " "/* See feature_test_macros(7) */"
> > +.BR "#include <fcntl.h>" "          /* Definition of " AT_* " constants */"
> > +.B #include <sys/mount.h>
> > +.P
> > +.BI "int open_tree(int " dirfd ", const char *" path ", unsigned int " flags );
> > +.fi
> > +.SH DESCRIPTION
> > +The
> > +.BR open_tree ()
> > +system call is part of
> > +the suite of file descriptor based mount facilities in Linux.
> > +.IP \[bu] 3
> > +If
> > +.I flags
> > +contains
> > +.BR \%OPEN_TREE_CLONE ,
> > +.BR open_tree ()
> > +creates a detached mount object
> > +which consists of a bind-mount of
> > +the path specified by the
> > +.IR path .
> > +A new file descriptor
> > +associated with the detached mount object
> > +is then returned.
> > +The mount object is equivalent to a bind-mount
> > +that would be created by
> > +.BR mount (2)
> > +called with
> > +.BR MS_BIND ,
> > +except that it is tied to a file descriptor
> > +and is not mounted onto the filesystem.
> > +.IP
> > +As with file descriptors returned from
> > +.BR fsmount (2),
> > +the resultant file descriptor can then be used with
> > +.BR move_mount (2),
> > +.BR mount_setattr (2),
> > +or other such system calls to do further mount operations.
> > +This mount object will be unmounted and destroyed
> > +when the file descriptor is closed
> > +if it was not otherwise attached to a mount point
> > +by calling
> > +.BR move_mount (2).
> > +(Note that the unmount operation on
> 
> Maybe I would make this note a paragraph of its own; this would give it
> more visibility, I think.  And I'd remove 'Note that', and start
> directly with the noted contents (everything in a manual page must be
> noteworthy, in general).
> 
> > +.BR close (2)
> 
> I'm a bit confused by the reference to close(2).  The previous text
> mentions closing, but not close(2), so I'm not sure if this refers to
> that or if it is comparing it to close(2).  Would you mind having a look
> at the wording of this entire paragraph?

Well, it's more that these kinds of file descriptors are marked with
FMODE_NEEDS_UMOUNT which will cause dissolve_on_fput() to be called when
they have no more references.

So this could be through close(2) or any other condition that causes a
file descriptor to be closed (dup2(2), process death, execve with
O_CLOEXEC, etc). Maybe it's better to not mention close(2) explicitly...

> > +is lazy\[em]akin to calling
> 
> I prefer em dashes in both sides of the parenthetical; it more clearly
> denotes where it ends.
> 
> 	is lazy
> 	\[em]akin to calling
> 	.BR umount2 (2)
> 	with
> 	.BR MOUNT_DETACH \[em];

An \[em] next to a ";"? Let me see if I can rewrite it to avoid this...

> (I assume that's where it ends.)
> 
> > +.BR umount2 (2)
> > +with
> > +.BR MOUNT_DETACH ;
> > +any existing open references to files
> > +from the mount object
> > +will continue to work,
> > +and the mount object will only be completely destroyed
> > +once it ceases to be busy.)
> > +.IP \[bu]
> > +If
> > +.I flags
> > +does not contain
> > +.BR \%OPEN_TREE_CLONE ,
> > +.BR open_tree ()
> > +returns a file descriptor
> > +that is exactly equivalent to
> > +one produced by
> > +.BR openat (2)
> > +when called with the same
> > +.I dirfd
> > +and
> > +.IR path .
> > +.P
> > +In either case, the resultant file descriptor
> > +acts the same as one produced by
> > +.BR open (2)
> > +with
> > +.BR O_PATH ,
> > +meaning it can also be used as a
> > +.I dirfd
> > +argument to
> > +"*at()" system calls.
> > +.P
> > +As with "*at()" system calls,
> > +.BR open_tree ()
> > +uses the
> > +.I dirfd
> > +argument in conjunction with the
> > +.I path
> > +argument to determine the path to operate on, as follows:
> > +.IP \[bu] 3
> > +If the pathname given in
> > +.I path
> > +is absolute, then
> > +.I dirfd
> > +is ignored.
> > +.IP \[bu]
> > +If the pathname given in
> > +.I path
> > +is relative and
> > +.I dirfd
> > +is the special value
> > +.BR \%AT_FDCWD ,
> > +then
> > +.I path
> > +is interpreted relative to
> > +the current working directory
> > +of the calling process (like
> > +.BR open (2)).
> > +.IP \[bu]
> > +If the pathname given in
> > +.I path
> > +is relative,
> > +then it is interpreted relative to
> > +the directory referred to by the file descriptor
> > +.I dirfd
> > +(rather than relative to
> > +the current working directory
> > +of the calling process,
> > +as is done by
> > +.BR open (2)
> > +for a relative pathname).
> > +In this case,
> > +.I dirfd
> > +must be a directory
> > +that was opened for reading
> > +.RB ( O_RDONLY )
> > +or using the
> > +.B O_PATH
> > +flag.
> > +.IP \[bu]
> > +If
> > +.I path
> > +is an empty string,
> > +and
> > +.I flags
> > +contains
> > +.BR \%AT_EMPTY_PATH ,
> > +then the file descriptor
> > +.I dirfd
> > +is operated on directly.
> > +In this case,
> > +.I dirfd
> > +may refer to any type of file,
> > +not just a directory.
> > +.P
> > +See
> > +.BR openat (2)
> > +for an explanation of why the
> > +.I dirfd
> > +argument is useful.
> > +.P
> > +.I flags
> > +can be used to control aspects of the path lookup
> > +and properties of the returned file descriptor.
> > +A value for
> > +.I flags
> > +is constructed by bitwise ORing
> > +zero or more of the following constants:
> > +.RS
> > +.TP
> > +.B \%AT_EMPTY_PATH
> > +If
> > +.I path
> > +is an empty string, operate on the file referred to by
> > +.I dirfd
> > +(which may have been obtained from
> > +.BR open (2),
> > +.BR fsmount(2),
> > +or from another
> > +.BR open_tree ()
> > +call).
> > +In this case,
> > +.I dirfd
> > +may refer to any type of file, not just a directory.
> > +If
> > +.I dirfd
> > +is
> > +.BR \%AT_FDCWD ,
> > +.BR open_tree ()
> > +will operate on the current working directory
> > +of the calling process.
> > +This flag is Linux-specific; define
> > +.B \%_GNU_SOURCE
> > +to obtain its definition.
> > +.TP
> > +.B \%AT_NO_AUTOMOUNT
> > +Do not automount the terminal ("basename") component of
> > +.I path
> > +if it is a directory that is an automount point.
> > +This allows you to create a handle to the automount point itself,
> > +rather than the location it would mount.
> > +This flag has no effect if the mount point has already been mounted over.
> > +This flag is Linux-specific; define
> > +.B \%_GNU_SOURCE
> > +to obtain its definition.
> > +.TP
> > +.B \%AT_SYMLINK_NOFOLLOW
> > +If
> > +.I path
> > +is a symbolic link, do not dereference it; instead,
> > +create either a handle to the link itself
> > +or a bind-mount of it.
> > +The resultant file descriptor is indistinguishable from one produced by
> > +.BR openat (2)
> > +with
> > +.BR \%O_PATH | O_NOFOLLLOW .
> > +.TP
> > +.B \%OPEN_TREE_CLOEXEC
> > +Set the close-on-exec
> > +.RB ( FD_CLOEXEC )
> > +flag on the new file descriptor.
> > +See the description of the
> > +.B O_CLOEXEC
> > +flag in
> > +.BR open (2)
> > +for reasons why this may be useful.
> > +.TP
> > +.B \%OPEN_TREE_CLONE
> > +Rather than creating an
> > +.BR openat (2)-style
> > +.B O_PATH
> > +file descriptor,
> > +create a bind-mount of
> > +.I path
> > +(akin to
> > +.IR "mount --bind" )
> 
> You need to escape dashes in manual pages.  Otherwise, they're formatted
> as hyphens, which can't be pasted into the terminal (and another
> consequence is not being able to search for them in the man(1) reader
> with literal dashes).
> 
> Depending on your system, you might be able to search for them or paste
> them to the terminal, because some distros patch this in
> /etc/local/an.tmac, at the expense of generating lower quality pages,
> but in general don't rely on that.
> 
> I've noticed now, but this probably also happens in previous pages in
> this patch set.
> 
> While at it, you should also use a non-breaking space, to keep the
> entire command in the same line.
> 
> 	.IR \%mount\~\-\-bind )

My bad, I think my terminal font doesn't distinguish between them well
enough for it to be obvious. I'll go through and fix up all of these
cases.

Thanks.

> Cheers,
> Alex
> 
> > +as a detached mount object.
> > +In order to do this operation,
> > +the calling process must have the
> > +.BR \%CAP_SYS_ADMIN
> > +capability.
> > +.TP
> > +.B \%AT_RECURSIVE
> > +Create a recursive bind-mount of the path
> > +(akin to
> > +.IR "mount --rbind" )
> > +as a detached mount object.
> > +This flag is only permitted in conjunction with
> > +.BR \%OPEN_TREE_CLONE .
> > +.SH RETURN VALUE
> > +On success, a new file descriptor is returned.
> > +On error, \-1 is returned, and
> > +.I errno
> > +is set to indicate the error.
> > +.SH ERRORS
> > +.TP
> > +.B EACCES
> > +Search permission is denied for one of the directories
> > +in the path prefix of
> > +.IR path .
> > +(See also
> > +.BR path_resolution (7).)
> > +.TP
> > +.B EBADF
> > +.I path
> > +is relative but
> > +.I dirfd
> > +is neither
> > +.B \%AT_FDCWD
> > +nor a valid file descriptor.
> > +.TP
> > +.B EFAULT
> > +.I path
> > +is NULL
> > +or a pointer to a location
> > +outside the calling process's accessible address space.
> > +.TP
> > +.B EINVAL
> > +Invalid flag specified in
> > +.IR flags .
> > +.TP
> > +.B ELOOP
> > +Too many symbolic links encountered when resolving
> > +.IR path .
> > +.TP
> > +.B EMFILE
> > +The calling process has too many open files to create more.
> > +.TP
> > +.B ENAMETOOLONG
> > +.I path
> > +is longer than
> > +.BR PATH_MAX .
> > +.TP
> > +.B ENFILE
> > +The system has too many open files to create more.
> > +.TP
> > +.B ENOENT
> > +A component of
> > +.I path
> > +does not exist, or is a dangling symbolic link.
> > +.TP
> > +.B ENOENT
> > +.I path
> > +is an empty string, but
> > +.B AT_EMPTY_PATH
> > +is not specified in
> > +.IR flags .
> > +.TP
> > +.B ENOTDIR
> > +A component of the path prefix of
> > +.I path
> > +is not a directory, or
> > +.I path
> > +is relative and
> > +.I dirfd
> > +is a file descriptor referring to a file other than a directory.
> > +.TP
> > +.B ENOSPC
> > +The "anonymous" mount namespace
> > +necessary to contain the
> > +.B \%OPEN_TREE_CLONE
> > +detached bind-mount mount object
> > +could not be allocated,
> > +as doing so would exceed
> > +the configured per-user limit on
> > +the number of mount namespaces in the current user namespace.
> > +(See also
> > +.BR namespaces (7).)
> > +.TP
> > +.B ENOMEM
> > +The kernel could not allocate sufficient memory to complete the operation.
> > +.TP
> > +.B EPERM
> > +.I flags
> > +contains
> > +.B \%OPEN_TREE_CLONE
> > +but the calling process does not have the required
> > +.B CAP_SYS_ADMIN
> > +capability.
> > +.SH STANDARDS
> > +Linux.
> > +.SH HISTORY
> > +Linux 5.2.
> > +.\" commit a07b20004793d8926f78d63eb5980559f7813404
> > +.\" commit 400913252d09f9cfb8cce33daee43167921fc343
> > +glibc 2.36.
> > +.SH NOTES
> > +.SS Mount propagation
> > +The bind-mount mount objects created by
> > +.BR open_tree ()
> > +with
> > +.B \%OPEN_TREE_CLONE
> > +are not associated with
> > +the mount namespace of the calling process.
> > +Instead, each mount object is placed
> > +in a newly allocated "anonymous" mount namespace
> > +associated with the calling process.
> > +.P
> > +One of the side-effects of this is that
> > +(unlike bind-mounts created with
> > +.BR mount (2)),
> > +mount propagation
> > +(as described in
> > +.BR mount_namespaces (7))
> > +will not be applied to bind-mounts created by
> > +.BR open_tree ()
> > +until the bind-mount is attached with
> > +.BR move_mount (2),
> > +at which point the mount object
> > +will be associated with the mount namespace
> > +where it was attached
> > +and mount propagation will resume.
> > +Note that any mount propagation events that occurred
> > +before the mount object was attached
> > +will
> > +.I not
> > +be propagated to the mount object,
> > +even after it is attached.
> > +.SH EXAMPLES
> > +The following examples show how
> > +.BR open_tree ()
> > +can be used in place of more traditional
> > +.BR mount (2)
> > +calls with
> > +.BR MS_BIND .
> > +.P
> > +.in +4n
> > +.EX
> > +int srcfd = open_tree(AT_FDCWD, "/var", OPEN_TREE_CLONE);
> > +move_mount(srcfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> > +.EE
> > +.in
> > +.P
> > +First,
> > +a detached bind-mount mount object of
> > +.I /var
> > +is created
> > +and associated with the file descriptor
> > +.IR srcfd .
> > +Then, the mount object is attached to
> > +.I /mnt
> > +using
> > +.BR move_mount (2)
> > +with
> > +.B \%MOVE_MOUNT_F_EMPTY_PATH
> > +to request that the detached mount object
> > +associated with the file descriptor
> > +.I srcfd
> > +be moved (and thus attached) to
> > +.IR /mnt .
> > +.P
> > +The above procedure is functionally equivalent to
> > +the following mount operation using
> > +.BR mount (2):
> > +.P
> > +.in +4n
> > +.EX
> > +mount("/var", "/mnt", NULL, MS_BIND, NULL);
> > +.EE
> > +.in
> > +.P
> > +.B \%OPEN_TREE_CLONE
> > +can be combined with
> > +.B \%AT_RECURSIVE
> > +to create recursive detached bind-mount mount objects,
> > +which in turn can be attached to mount points
> > +to create recursive bind-mounts.
> > +.P
> > +.in +4n
> > +.EX
> > +int srcfd = open_tree(AT_FDCWD, "/var", OPEN_TREE_CLONE | AT_RECURSIVE);
> > +move_mount(srcfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
> > +.EE
> > +.in
> > +.P
> > +The above procedure is functionally equivalent to
> > +the following mount operation using
> > +.BR mount (2):
> > +.P
> > +.in +4n
> > +.EX
> > +mount("/var", "/mnt", NULL, MS_BIND | MS_REC, NULL);
> > +.EE
> > +.in
> > +.P
> > +One of the primary benefits of using
> > +.BR open_tree ()
> > +and
> > +.BR move_mount (2)
> > +over the traditional
> > +.BR mount (2)
> > +is that operating with
> > +.IR dirfd -style
> > +file descriptors is far easier and more intuitive.
> > +.P
> > +.in +4n
> > +.EX
> > +int srcfd = open_tree(100, "", AT_EMPTY_PATH | OPEN_TREE_CLONE);
> > +move_mount(srcfd, "", 200, "foo", MOVE_MOUNT_F_EMPTY_PATH);
> > +.EE
> > +.in
> > +.P
> > +The above procedure is roughly equivalent to
> > +the following mount operation using
> > +.BR mount (2):
> > +.P
> > +.in +4n
> > +.EX
> > +mount("/proc/self/fd/100", "/proc/self/fd/200/foo", NULL, MS_BIND, NULL);
> > +.EE
> > +.in
> > +.P
> > +In addition, you can use the file descriptor returned by
> > +.BR open_tree ()
> > +as the
> > +.I dirfd
> > +argument to any "*at()" system calls:
> > +.P
> > +.in +4n
> > +.EX
> > +int dirfd, fd;
> > +\&
> > +dirfd = open_tree(AT_FDCWD, "/etc", OPEN_TREE_CLONE);
> > +fd = openat(dirfd, "passwd", O_RDONLY);
> > +fchmodat(dirfd, "shadow", 0000, 0);
> > +close(dirfd);
> > +close(fd);
> > +/* The bind-mount is now destroyed. */
> > +.EE
> > +.in
> > +.SH SEE ALSO
> > +.BR fsconfig (2),
> > +.BR fsmount (2),
> > +.BR fsopen (2),
> > +.BR fspick (2),
> > +.BR mount (2),
> > +.BR mount_setattr (2),
> > +.BR move_mount (2),
> > +.BR mount_namespaces (7)
> > 
> > -- 
> > 2.51.0
> > 
> > 
> 
> -- 
> <https://www.alejandro-colomar.es>
> Use port 80 (that is, <...:80/>).



-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 07/10] man/man2/open_tree.2: document "new" mount API
  2025-09-22 10:09     ` Aleksa Sarai
@ 2025-09-22 13:22       ` Alejandro Colomar
  2025-09-24  1:34         ` Aleksa Sarai
  0 siblings, 1 reply; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-22 13:22 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 1748 bytes --]

Hi Aleksa,

On Mon, Sep 22, 2025 at 08:09:47PM +1000, Aleksa Sarai wrote:
> > > +is lazy\[em]akin to calling
> > 
> > I prefer em dashes in both sides of the parenthetical; it more clearly
> > denotes where it ends.
> > 
> > 	is lazy
> > 	\[em]akin to calling
> > 	.BR umount2 (2)
> > 	with
> > 	.BR MOUNT_DETACH \[em];
> 
> An \[em] next to a ";"? Let me see if I can rewrite it to avoid this...

You could use parentheses, maybe.

> > > +.IR "mount --bind" )
> > 
> > You need to escape dashes in manual pages.  Otherwise, they're formatted
> > as hyphens, which can't be pasted into the terminal (and another
> > consequence is not being able to search for them in the man(1) reader
> > with literal dashes).
> > 
> > Depending on your system, you might be able to search for them or paste
> > them to the terminal, because some distros patch this in
> > /etc/local/an.tmac, at the expense of generating lower quality pages,
> > but in general don't rely on that.
> > 
> > I've noticed now, but this probably also happens in previous pages in
> > this patch set.
> > 
> > While at it, you should also use a non-breaking space, to keep the
> > entire command in the same line.
> > 
> > 	.IR \%mount\~\-\-bind )
> 
> My bad, I think my terminal font doesn't distinguish between them well
> enough for it to be obvious. I'll go through and fix up all of these
> cases.

I should probably add an automated diagnostic.  At least the case of two
'--' together, which I've never seen useful unescaped, should be
diagnosed.  I'll add a make(1) 'lint-man-dash' target that catches this
with a regex.


Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 07/10] man/man2/open_tree.2: document "new" mount API
  2025-09-22 13:22       ` Alejandro Colomar
@ 2025-09-24  1:34         ` Aleksa Sarai
  0 siblings, 0 replies; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-24  1:34 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 1344 bytes --]

On 2025-09-22, Alejandro Colomar <alx@kernel.org> wrote:
> Hi Aleksa,
> 
> On Mon, Sep 22, 2025 at 08:09:47PM +1000, Aleksa Sarai wrote:
> > > > +is lazy\[em]akin to calling
> > > 
> > > I prefer em dashes in both sides of the parenthetical; it more clearly
> > > denotes where it ends.
> > > 
> > > 	is lazy
> > > 	\[em]akin to calling
> > > 	.BR umount2 (2)
> > > 	with
> > > 	.BR MOUNT_DETACH \[em];
> > 
> > An \[em] next to a ";"? Let me see if I can rewrite it to avoid this...
> 
> You could use parentheses, maybe.

I tried it a few different ways and I think it reads best with a single
em dash as a parenthetical -- since ";" indicates the end of a clause I
don't think you need to "close" the parenthetical with a corresponding
em dash.

Here is the parentheses version, but I plan to just keep the em dash
version in the patchset. If you really prefer the parenthesis version
feel free to replace it.

  This implicit unmount operation is lazy
  (akin to calling
  .BR umount2 (2)
  with
  .BR MNT_DETACH );
  thus,
  any existing open references to files
  from the mount object
  will continue to work,
  and the mount object will only be completely destroyed
  once it ceases to be busy.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 09/10] man/man2/open_tree{,_attr}.2: document new open_tree_attr() API
  2025-09-21  9:55   ` Alejandro Colomar
@ 2025-09-24  6:31     ` Aleksa Sarai
  2025-09-24  8:51       ` Alejandro Colomar
  0 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-24  6:31 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 1389 bytes --]

On 2025-09-21, Alejandro Colomar <alx@kernel.org> wrote:
> On Fri, Sep 19, 2025 at 11:59:50AM +1000, Aleksa Sarai wrote:
> > diff --git a/man/man2/open_tree.2 b/man/man2/open_tree.2
> > index 7f85df08b43c7b48a9d021dbbeb2c60092a2b2d4..60de4313a9d5be4ef3ff1217051f252506a2ade9 100644
> > --- a/man/man2/open_tree.2
> > +++ b/man/man2/open_tree.2
> > @@ -15,7 +15,19 @@ .SH SYNOPSIS
> >  .B #include <sys/mount.h>
> >  .P
> >  .BI "int open_tree(int " dirfd ", const char *" path ", unsigned int " flags );
> > +.P
> > +.BR "#include <sys/syscall.h>" "    /* Definition of " SYS_* " constants */"
> > +.P
> > +.BI "int syscall(SYS_open_tree_attr, int " dirfd ", const char *" path ,
> > +.BI "            unsigned int " flags ", struct mount_attr *_Nullable " attr ", \
> > +size_t " size );
> 
> Do we maybe want to move this to its own separate page?
> 
> The separate page could perfectly contain the same exact text you're
> adding here; you don't need to repeat open_tree() descriptions.
> 
> In general, I feel that while this improves discoverability of related
> functions, it produces more complex pages.

I tried it and I don't think it is a better experience as a reader when
split into two pages because of the huge overlap between the two
syscalls.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 04/10] man/man2/fsconfig.2: document "new" mount API
  2025-09-21  9:26   ` Alejandro Colomar
@ 2025-09-24  6:41     ` Aleksa Sarai
  2025-09-24  8:52       ` Alejandro Colomar
  0 siblings, 1 reply; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-24  6:41 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 1400 bytes --]

On 2025-09-21, Alejandro Colomar <alx@kernel.org> wrote:
> On Fri, Sep 19, 2025 at 11:59:45AM +1000, Aleksa Sarai wrote:
> > +The list of valid
> > +.I cmd
> > +values are:
> 
> I think I would have this page split into one page per command.
> 
> I would keep an overview in this page, of the main system call, and the
> descriptions of each subcommand would go into each separate page.
> 
> You could have a look at fcntl(2), which has been the most recent page
> split, and let me know what you think.

To be honest, I think this makes the page less useful to most readers.

I get that you want to try to improve the "wall of text" problem but as
a very regular reader of man-pages, I find indirections annoying every
time I have to do deal with them. Maybe there is an argument for
fcntl(2) to undergo this treatment (as it has a menagerie of disparate
commands) but this applies even less to fsconfig(2) in my view.

If you feel strongly that fsconfig(2) needs this treatment, it would
probably be better for you to do it instead. In particular, I would've
expected to only have two extra pages if we went that route (one for
FSCONFIG_SET_* commands and one for FSCONFIG_CMD_* commands) so I'm not
quite sure what you'd like the copy to look like for 10 man-pages...

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 09/10] man/man2/open_tree{,_attr}.2: document new open_tree_attr() API
  2025-09-24  6:31     ` Aleksa Sarai
@ 2025-09-24  8:51       ` Alejandro Colomar
  0 siblings, 0 replies; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-24  8:51 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 1546 bytes --]

Hi Aleksa,

On Wed, Sep 24, 2025 at 04:31:15PM +1000, Aleksa Sarai wrote:
> On 2025-09-21, Alejandro Colomar <alx@kernel.org> wrote:
> > On Fri, Sep 19, 2025 at 11:59:50AM +1000, Aleksa Sarai wrote:
> > > diff --git a/man/man2/open_tree.2 b/man/man2/open_tree.2
> > > index 7f85df08b43c7b48a9d021dbbeb2c60092a2b2d4..60de4313a9d5be4ef3ff1217051f252506a2ade9 100644
> > > --- a/man/man2/open_tree.2
> > > +++ b/man/man2/open_tree.2
> > > @@ -15,7 +15,19 @@ .SH SYNOPSIS
> > >  .B #include <sys/mount.h>
> > >  .P
> > >  .BI "int open_tree(int " dirfd ", const char *" path ", unsigned int " flags );
> > > +.P
> > > +.BR "#include <sys/syscall.h>" "    /* Definition of " SYS_* " constants */"
> > > +.P
> > > +.BI "int syscall(SYS_open_tree_attr, int " dirfd ", const char *" path ,
> > > +.BI "            unsigned int " flags ", struct mount_attr *_Nullable " attr ", \
> > > +size_t " size );
> > 
> > Do we maybe want to move this to its own separate page?
> > 
> > The separate page could perfectly contain the same exact text you're
> > adding here; you don't need to repeat open_tree() descriptions.
> > 
> > In general, I feel that while this improves discoverability of related
> > functions, it produces more complex pages.
> 
> I tried it and I don't think it is a better experience as a reader when
> split into two pages because of the huge overlap between the two
> syscalls.

Okay.  Thanks!


Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 04/10] man/man2/fsconfig.2: document "new" mount API
  2025-09-24  6:41     ` Aleksa Sarai
@ 2025-09-24  8:52       ` Alejandro Colomar
  0 siblings, 0 replies; 43+ messages in thread
From: Alejandro Colomar @ 2025-09-24  8:52 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Michael T. Kerrisk, Alexander Viro, Jan Kara, Askar Safin,
	G. Branden Robinson, linux-man, linux-api, linux-fsdevel,
	linux-kernel, David Howells, Christian Brauner

[-- Attachment #1: Type: text/plain, Size: 1575 bytes --]

Hi Aleksa,

On Wed, Sep 24, 2025 at 04:41:16PM +1000, Aleksa Sarai wrote:
> On 2025-09-21, Alejandro Colomar <alx@kernel.org> wrote:
> > On Fri, Sep 19, 2025 at 11:59:45AM +1000, Aleksa Sarai wrote:
> > > +The list of valid
> > > +.I cmd
> > > +values are:
> > 
> > I think I would have this page split into one page per command.
> > 
> > I would keep an overview in this page, of the main system call, and the
> > descriptions of each subcommand would go into each separate page.
> > 
> > You could have a look at fcntl(2), which has been the most recent page
> > split, and let me know what you think.
> 
> To be honest, I think this makes the page less useful to most readers.
> 
> I get that you want to try to improve the "wall of text" problem but as
> a very regular reader of man-pages, I find indirections annoying every
> time I have to do deal with them. Maybe there is an argument for
> fcntl(2) to undergo this treatment (as it has a menagerie of disparate
> commands) but this applies even less to fsconfig(2) in my view.
> 
> If you feel strongly that fsconfig(2) needs this treatment, it would
> probably be better for you to do it instead. In particular, I would've
> expected to only have two extra pages if we went that route (one for
> FSCONFIG_SET_* commands and one for FSCONFIG_CMD_* commands) so I'm not
> quite sure what you'd like the copy to look like for 10 man-pages...

Okay, let's keep it as a single page for now.


Cheers,
Alex

-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 00/10] man2: document "new" mount API
  2025-09-21  9:58   ` Aleksa Sarai
@ 2025-09-24 11:11     ` Aleksa Sarai
  0 siblings, 0 replies; 43+ messages in thread
From: Aleksa Sarai @ 2025-09-24 11:11 UTC (permalink / raw)
  To: Askar Safin
  Cc: alx, brauner, dhowells, g.branden.robinson, jack, linux-api,
	linux-fsdevel, linux-kernel, linux-man, mtk.manpages, safinaskar,
	viro

[-- Attachment #1: Type: text/plain, Size: 1928 bytes --]

On 2025-09-21, Aleksa Sarai <cyphar@cyphar.com> wrote:
> On 2025-09-21, Askar Safin <safinaskar@gmail.com> wrote:
> > * open_tree(2) still says:
> > > If flags does not contain OPEN_TREE_CLONE, open_tree() returns a file descriptor
> > > that is exactly equivalent to one produced by openat(2) when called with the same dirfd and path.
> > 
> > This is not true if automounts are involved. I suggest adding "modulo automounts". But you may
> > keep everything, of course.
> 
> Hmmm. As we discussed last time, this sentence is more intended to
> indicate that the file descriptor is just a regular open file (with no
> dissolve_on_fput() + FMODE_NEED_UNMOUNT magic) rather than the exact
> behaviour you get with regards to path lookup.
> 
> I would honestly prefer to remove "when called with the same dirfd and
> path" rather than add caveats, but I think it makes the sentence less
> readable... I'll think about it and try to fix this wording up somehow
> for v5.

I've gone with the following:

   In either case, the resultant file descriptor
   acts the same as one produced by
   .BR open (2)
   with
   .BR O_PATH ,
   meaning it can also be used as a
   .I dirfd
   argument to
   "*at()" system calls.
  +However,
  +unlike
  +.BR open (2)
  +called with
  +.BR O_PATH ,
  +automounts will
  +by default
  +be triggered by
  +.BR open_tree ()
  +unless
  +.B \%AT_NO_AUTOMOUNT
  +is included in
  +.IR flags .

After looking at it a few times, I decided adding it to the proceeding
paragraph (as you suggested) didn't really make sense since the O_PATH
equivalence is only mentioned in this following paragraph.

Also, the automount behaviour also applies to OPEN_TREE_CLONE, so it's
best to not mislead a reader into thinking it only applies to one of the
cases.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 265 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 07/10] man/man2/open_tree.2: document "new" mount API
  2025-09-21  9:47   ` Alejandro Colomar
  2025-09-22 10:09     ` Aleksa Sarai
@ 2025-09-30 12:32     ` G. Branden Robinson
  2025-10-02 22:20       ` Alejandro Colomar
  1 sibling, 1 reply; 43+ messages in thread
From: G. Branden Robinson @ 2025-09-30 12:32 UTC (permalink / raw)
  To: Alejandro Colomar; +Cc: Aleksa Sarai, linux-man, groff

[-- Attachment #1: Type: text/plain, Size: 5811 bytes --]

[CC list trimmed since I'm covering only English style and typesetting
issues here; adding groff@gnu accordingly]

At 2025-09-21T11:47:21+0200, Alejandro Colomar wrote:
> > +(Note that the unmount operation on
> 
> Maybe I would make this note a paragraph of its own; this would give
> it more visibility, I think.  And I'd remove 'Note that', and start
> directly with the noted contents (everything in a manual page must be
> noteworthy, in general).

Good advice.  I call such removal a "Kemper notectomy", in honor of the
groff colleague who brought the superfluity of such constructions to my
attention.  Also see
<https://en.wikipedia.org/wiki/Wikipedia:It_should_be_noted>.

> > +is lazy\[em]akin to calling
> 
> I prefer em dashes in both sides of the parenthetical; it more clearly
> denotes where it ends.
> 
> 	is lazy
> 	\[em]akin to calling
> 	.BR umount2 (2)
> 	with
> 	.BR MOUNT_DETACH \[em];
> 
> (I assume that's where it ends.)

I'm uneasy about the acceptability of setting an em dash and semicolon
adjacently like this.  I checked my copy of the Chicago Manual of Style
(18th ed., 2024) and it has something close, but not squarely, on point.

---snip---
§6.95.  Em dashes and other punctuation.  In modern usage, a question
mark or an exclamation point—but never a comma, a colon, or a semicolon—
may precede an em dash.  ... If the context calls for an em dash where a
comma would ordinarily separate a dependent clause from an independent
clause, the comma is omitted.
---end snip---

Here's the full sentence at issue.

>>> +(Note that the unmount operation on
>>> +.BR close (2)
>>> +is lazy\[em]akin to calling
>>> +.BR umount2 (2)
>>> +with
>>> +.BR MOUNT_DETACH ;
>>> +any existing open references to files
>>> +from the mount object
>>> +will continue to work,
>>> +and the mount object will only be completely destroyed
>>> +once it ceases to be busy.)

This construction uses a semicolon rather than a comma, and that
semicolon separates two independent clauses.  (Whether "Note that" is
retained or deleted makes no difference.)

CMoS therefore doesn't flag this as "wrong", but I think it'll look
weird to a native English reader and maybe to non-native ones, too.

The best solution might be to recast.  It's seldom wrong to break a
sentence using a semicolon into two sentences when the purpose of the
prose is to explain rather than specify.

> You need to escape dashes in manual pages.  Otherwise, they're
> formatted as hyphens,

s/they're/they can be/

Some distributions, like Arch[1], Debian[2], and Fedora[3], won't
exhibit this problem.  Why doesn't groff do this upstream?  Because some
groff users _do_ care about correct typesetting, and it's more
straightforward for distributors to patch their packages as the
aforementioned have done than for groff to adopt that as a default and
leave people to their own devices to revert it.[4]

(Maybe I should copy some of that rationale into groff's "PROBLEMS"
file.  Opinions?)

> which can't be pasted into the terminal (and another consequence is
> not being able to search for them in the man(1) reader with literal
> dashes).

...but if you're looking for an incorrectly hyphenated term like
"foo-bar", you _can_ search for "foo.bar".  Both the more(1) and less(1)
in the (now over 2 years old) Debian Bookworm can manage this on
UTF-8-encoded input.

> Depending on your system, you might be able to search for them or paste
> them to the terminal, because some distros patch this in
> /etc/local/an.tmac,

That's not the file name any of the foregoing distributors use, but
you've got the right idea.

> at the expense of generating lower quality pages, but in general don't
> rely on that.
> 
> I've noticed now, but this probably also happens in previous pages in
> this patch set.
> 
> While at it, you should also use a non-breaking space, to keep the
> entire command in the same line.
> 
> 	.IR \%mount\~\-\-bind )

Here's where I disagree a little.  Lengthy unhyphenable character
sequences like this can lead to ugly formatting, even when adjustment is
disabled.  (A ragged right margin can get _really_ ragged.)

While I personally would not use italics for inline examples (and don't
in groff man pages), regardless of the font style you use, you can
surround multi-word inline examples with quotation marks so that they
are properly understood even if broken.  (Further, no font style
selection survives copy-and-paste into plain text email.)

The Linux man-pages don't require portability to AT&T troff, so you can
achieve this with groff's `lq` and `rq` special characters.

I therefore propose:

.RI \[lq] \%mount
.IR \-\-bind \[rq]

If you _did_ require portability to AT&T troff, you could use the `lq`
and `rq` _strings_ instead.

.RI \*(lq \%mount
.IR \-\-bind \*(rq

These strings are _almost_ universally portable.  The only line of *roff
descent that didn't incorporate them into man(7) was that of Research
Unix, of which the only surviving specimen is Plan 9 troff.   In other
words, BSD troff (1980), Unix System V Release 4 and its descendants
(1988)--including Solaris 10, and groff (1989) all support them.  I'm
working up a patch to add it to Plan 9 from User Space a.k.a. plan9port.

Quotation is a useful and important linguistic facility.  It's a shame
man page authors have neglected it so long.

Regards,
Branden

[1] https://gitlab.archlinux.org/archlinux/packaging/packages/groff/-/commit/e474b541a32fc905b4f748de0313acfb8b98c081
[2] https://salsa.debian.org/debian/groff/-/commit/d5394c68d70e6c5199b01d2522e094c8fd52e64e
[3] https://bodhi.fedoraproject.org/updates/FEDORA-2023-f5d1e63191
[4] https://lwn.net/Articles/948616/

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v4 07/10] man/man2/open_tree.2: document "new" mount API
  2025-09-30 12:32     ` G. Branden Robinson
@ 2025-10-02 22:20       ` Alejandro Colomar
  0 siblings, 0 replies; 43+ messages in thread
From: Alejandro Colomar @ 2025-10-02 22:20 UTC (permalink / raw)
  To: G. Branden Robinson; +Cc: Aleksa Sarai, linux-man, groff

[-- Attachment #1: Type: text/plain, Size: 13274 bytes --]

Hi Branden,

[Disclaimer: this message contains a large amount of Spanish quoted text]

On Tue, Sep 30, 2025 at 07:32:46AM -0500, G. Branden Robinson wrote:
> > > +is lazy\[em]akin to calling
> > 
> > I prefer em dashes in both sides of the parenthetical; it more clearly
> > denotes where it ends.
> > 
> > 	is lazy
> > 	\[em]akin to calling
> > 	.BR umount2 (2)
> > 	with
> > 	.BR MOUNT_DETACH \[em];
> > 
> > (I assume that's where it ends.)
> 
> I'm uneasy about the acceptability of setting an em dash and semicolon
> adjacently like this.  I checked my copy of the Chicago Manual of Style
> (18th ed., 2024) and it has something close, but not squarely, on point.

I think Spanish rules are much *much* better in this regard.

<https://www.rae.es/ortograf%C3%ADa/concurrencia-con-otros-signos-4>
| 3.4.7.3. Concurrencia con otros signos
|
| Cuando las rayas que encierran incisos concurren con otros signos de
| puntuación, siguen las mismas reglas expuestas para los paréntesis
| (v. § 3.4.5.3).
|
| Para las peculiaridades propias de los textos narrativos,
| v. § 3.4.7.1.3. 

It says it follows the same rules as parentheses.
If we follow § 3.4.5.3 --rules for parentheses--, it has some gems.

<https://www.rae.es/ortograf%C3%ADa/concurrencia-con-otros-signos-2#3.4.5.3>
| 3.4.5.3. Concurrencia con otros signos
| 
| Cuando, en su uso delimitador, los paréntesis concurren con otros
| signos de puntuación, deben seguirse las siguientes pautas:
| 
|	a) Los delimitadores principales (punto, coma, punto y coma, y
|	   dos puntos) se escriben siempre después del paréntesis de
|	   cierre:
| 
|		Llevaban casados mucho tiempo (el año pasado cumplieron
|		sus bodas de oro), pero nunca lograron entenderse.
| 
|		Se llama Elvira Muñoz (si no recuerdo mal). 
|
[... here go some paragraphs that I'll skip, as they're unrelated to
     what we're discussing, and also I find them a bit inconsistent with
     the rest of the rules, which are quite consistent.
...]
|
|	b) Si coinciden tras una palabra dos o más signos dobles, se
|	   cerrará primero el que se haya abierto en último lugar:
|
|		«Creo que el motor de esos fenómenos es la vanidad (la
|		tentación del “¿por qué no yo?”), que lleva a preferir
|		el cero al número dos» (Abc [Esp.] 16.1.1987). 
|
|	c) Como ya se ha señalado, el texto contenido dentro de los
|	   paréntesis tiene su propia puntuación, independiente de la
|	   del enunciado principal:
|
|		La manía de Ernesto por el coleccionismo (lo colecciona
|		todo: sellos, monedas, relojes, plumas, llaveros…) ha
|		convertido su casa en un almacén. 
|
|	   Por ello, si la secuencia escrita entre paréntesis es
|	   interrogativa o exclamativa, los signos de interrogación o de
|	   exclamación deben colocarse dentro de los paréntesis:
|
|		Su facilidad para los idiomas (¡habla con fluidez siete
|		lenguas!) le ha abierto muchas puertas. 
|
|	   En cambio, cuando la secuencia interrogativa o exclamativa
|	   constituye una unidad mayor que la encerrada entre
|	   paréntesis, los signos de interrogación y exclamación se
|	   escriben fuera de los paréntesis:
|
|		¿Cuántos países integran la ONU (Organización de las
|		Naciones Unidas)?
|
|	   Lo mismo cabe decir de los puntos suspensivos, que preceden
|	   al paréntesis de cierre si corresponden solo al segundo
|	   discurso, mientras que se escriben tras el paréntesis si
|	   afectan al discurso principal:
|
|		Conté lentamente (uno, dos, tres…) y me di la vuelta.
|
|		Visitamos el museo de la ciudad, las calles del centro
|		histórico, las montañas cercanas (¡impresionantes!)…
|		Acabamos agotados. 

And then if we read the description of "la raya", it has some nice
examples of use:

<https://www.rae.es/ortograf%C3%ADa/la-raya-como-signo-delimitador#3.4.7.1>
| 3.4.7.1. La raya como signo delimitador
| 
| Cuando es un signo doble, la raya forma parte del grupo de signos
| delimitadores que introducen o encierran un segundo discurso.  Su
| función principal es indicar que las unidades lingüísticas que aísla
| no son una parte central del mensaje, sino que constituyen un discurso
| secundario que se inserta en el discurso principal para introducir
| información complementaria:
|
|	«El segundo festival de música mexicana es —¡cómo ponerlo en
|	duda!— el acontecimiento musical más importante del año» (Melo
|	Notas [Méx. 1990]).
|
|	«No daba importancia a los viajes de su mujer.  “Que se
|	distraiga —decía—, que se distraiga”» (Palou Carne [Esp. 1975]). 
|
| Como discurso autónomo con respecto al principal, las secuencias
| encerradas entre rayas pueden tener puntuación propia:
|
|	«Aguzando el oído —en el fondo, sin necesidad de concentrarse mucho:
|	bastaba con dejar de prestar atención a los latidos del propio
|	corazón, con sosegarse—, era fácil percibir un rumor de voces»
|	(Azancot Amores [Esp. 1980]). 
|
| En su función delimitadora de un segundo discurso, las rayas tienen
| los usos que se comentan a continuación.
|
| 3.4.7.1.1 En incisos
| 
| Las rayas se emplean, como signo doble, para encerrar incisos:
| 
|	Para él la fidelidad —cualidad que valoraba por encima de
|	cualquier otra— era algo sagrado.
| 
| En este uso pueden utilizarse también las comas o los paréntesis.  Los
| incisos entre rayas suponen un aislamiento mayor con respecto al texto
| en el que se insertan que los que se escriben entre comas, pero menor
| que los que se escriben entre paréntesis (sobre otras diferencias
| entre el uso del paréntesis o la raya, frente a la coma, para
| delimitar incisos, v. § 3.4.5.1.1).  La raya de cierre en los incisos
| no se suprime aunque detrás de ella deba aparecer un punto o cualquier
| otro signo de puntuación:
| 
|	Esperaba a Emilio —un gran amigo—.  Lamentablemente, no vino.
| 
|	Esperaba a Emilio —un gran amigo—, que, lamentablemente, no vino. 
|
| Tampoco debe suprimirse la raya de cierre cuando el inciso ocupa la
| posición final del enunciado:
| 
|	La editorial ha publicado este año varias obras del autor —todas
|	ellas de su primera época—.
| 
| Deben evitarse puntuaciones redundantes, como la del siguiente
| ejemplo, donde debieron escribirse o comas o rayas, pero no ambos
| signos: ⊗ «La muerte vivida por los demás, —sea la muerte por
| accidente o de otros modos—, es una cosa que sufrimos mucho» (Burgos
| Rigoberta [Guat. 1983]). 
|
| 3.4.7.1.2 En incisos incluidos en otros incisos
| 
| Deben usarse las rayas para introducir una nueva aclaración o inciso
| en un texto ya encerrado entre paréntesis:
| 
|	Si desea más información sobre este tema (la bibliografía
|	existente —incluso en español— es bastante extensa), deberá
|	acudir a otras fuentes.
| 
| Para intercalar algún dato o precisión en un inciso escrito entre
| rayas, han de usarse los paréntesis (v. § 3.4.5.1.2): Venezuela
| —primer lugar de tierra firme avistado por Colón en su tercer viaje a
| América (1498)— tenía, por aquel entonces, unos 300 000 habitantes. 
|
| 3.4.7.1.3 Para enmarcar comentarios de un narrador o transcriptor
|
| Las rayas se usan para enmarcar, en medio de una cita textual
| entrecomillada, las aclaraciones del transcriptor con respecto a su
| autoría:
|
|	«Es imprescindible —señaló el ministro— que se refuercen los
|	sistemas de control sanitario en las fronteras».
| 
| Se usa, en cambio, la coma cuando la aclaración aparece en posición
| final, fuera del texto entrecomillado:
| 
|	«Es imprescindible que se refuercen los sistemas de control
|	sanitario en las fronteras», señaló el ministro. 
|
[...]

I plan to eventually go through all the pages and make sure em dashes
follow these rules.

> 
> ---snip---
> §6.95.  Em dashes and other punctuation.  In modern usage, a question
> mark or an exclamation point—but never a comma, a colon, or a semicolon—
> may precede an em dash.  ... If the context calls for an em dash where a
> comma would ordinarily separate a dependent clause from an independent
> clause, the comma is omitted.
> ---end snip---

I think the Spanish rules for the em dash result in more logic text,
which is easier to parse.  I'm going to ignore this style, even if it's
the more common one in English texts.

> 
> Here's the full sentence at issue.
> 
> >>> +(Note that the unmount operation on
> >>> +.BR close (2)
> >>> +is lazy\[em]akin to calling
> >>> +.BR umount2 (2)
> >>> +with
> >>> +.BR MOUNT_DETACH ;
> >>> +any existing open references to files
> >>> +from the mount object
> >>> +will continue to work,
> >>> +and the mount object will only be completely destroyed
> >>> +once it ceases to be busy.)
> 
> This construction uses a semicolon rather than a comma, and that
> semicolon separates two independent clauses.  (Whether "Note that" is
> retained or deleted makes no difference.)
> 
> CMoS therefore doesn't flag this as "wrong", but I think it'll look
> weird to a native English reader and maybe to non-native ones, too.
> 
> The best solution might be to recast.  It's seldom wrong to break a
> sentence using a semicolon into two sentences when the purpose of the
> prose is to explain rather than specify.
> 
> > You need to escape dashes in manual pages.  Otherwise, they're
> > formatted as hyphens,
> 
> s/they're/they can be/
> 
> Some distributions, like Arch[1], Debian[2], and Fedora[3], won't
> exhibit this problem.  Why doesn't groff do this upstream?  Because some
> groff users _do_ care about correct typesetting, and it's more
> straightforward for distributors to patch their packages as the
> aforementioned have done than for groff to adopt that as a default and
> leave people to their own devices to revert it.[4]
> 
> (Maybe I should copy some of that rationale into groff's "PROBLEMS"
> file.  Opinions?)
> 
> > which can't be pasted into the terminal (and another consequence is
> > not being able to search for them in the man(1) reader with literal
> > dashes).
> 
> ...but if you're looking for an incorrectly hyphenated term like
> "foo-bar", you _can_ search for "foo.bar".  Both the more(1) and less(1)
> in the (now over 2 years old) Debian Bookworm can manage this on
> UTF-8-encoded input.

Except that if "foo bar" appears frequently in the text, using foo.bar
will be painful (has happened to me).  But at least it's better than
not being able to search it at all.  :)

> 
> > Depending on your system, you might be able to search for them or paste
> > them to the terminal, because some distros patch this in
> > /etc/local/an.tmac,
> 
> That's not the file name any of the foregoing distributors use, but
> you've got the right idea.

Oops, I meant to say */groff/*, not */local/*.  'local' was hot in
cache in my brain for some reason, and overwrote some register.

> > at the expense of generating lower quality pages, but in general don't
> > rely on that.
> > 
> > I've noticed now, but this probably also happens in previous pages in
> > this patch set.
> > 
> > While at it, you should also use a non-breaking space, to keep the
> > entire command in the same line.
> > 
> > 	.IR \%mount\~\-\-bind )
> 
> Here's where I disagree a little.  Lengthy unhyphenable character
> sequences like this can lead to ugly formatting, even when adjustment is
> disabled.  (A ragged right margin can get _really_ ragged.)

I see what you mean, but I wouldn't count this as lengthy.

> While I personally would not use italics for inline examples (and don't
> in groff man pages), regardless of the font style you use, you can
> surround multi-word inline examples with quotation marks so that they
> are properly understood even if broken.  (Further, no font style
> selection survives copy-and-paste into plain text email.)
> 
> The Linux man-pages don't require portability to AT&T troff, so you can
> achieve this with groff's `lq` and `rq` special characters.
> 
> I therefore propose:
> 
> .RI \[lq] \%mount
> .IR \-\-bind \[rq]

I would still make it unbreakable.  It's short enough that it's more
readable in the same line.  And if it were long enough, I'd make it an
EX/EE example.

The portability of \[lq] and \[rq] is fine; I'm already relying on
groffisms.  :)

[...]

> Quotation is a useful and important linguistic facility.  It's a shame
> man page authors have neglected it so long.

I might switch to it, eventually.  I'd consider using both I and quotes,
maybe.  We'll see.  As a colleague I had at nginx says, the term is
"eventually".  ;-)

> Regards,
> Branden

Have a lovely night!
Alex

> 
> [1] https://gitlab.archlinux.org/archlinux/packaging/packages/groff/-/commit/e474b541a32fc905b4f748de0313acfb8b98c081
> [2] https://salsa.debian.org/debian/groff/-/commit/d5394c68d70e6c5199b01d2522e094c8fd52e64e
> [3] https://bodhi.fedoraproject.org/updates/FEDORA-2023-f5d1e63191
> [4] https://lwn.net/Articles/948616/



-- 
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2025-10-02 22:20 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-19  1:59 [PATCH v4 00/10] man2: document "new" mount API Aleksa Sarai
2025-09-19  1:59 ` [PATCH v4 01/10] man/man2/mount_setattr.2: move mount_attr struct to mount_attr(2type) Aleksa Sarai
2025-09-19 16:04   ` Alejandro Colomar
2025-09-19  1:59 ` [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API Aleksa Sarai
2025-09-19 16:24   ` Alejandro Colomar
2025-09-19 21:19     ` Aleksa Sarai
2025-09-21  1:33       ` Aleksa Sarai
2025-09-21  8:39         ` Alejandro Colomar
2025-09-19  1:59 ` [PATCH v4 03/10] man/man2/fspick.2: " Aleksa Sarai
2025-09-21  8:53   ` Alejandro Colomar
2025-09-21 14:55     ` Aleksa Sarai
2025-09-21 15:17       ` Alejandro Colomar
2025-09-22  0:25   ` Askar Safin
2025-09-19  1:59 ` [PATCH v4 04/10] man/man2/fsconfig.2: " Aleksa Sarai
2025-09-21  9:26   ` Alejandro Colomar
2025-09-24  6:41     ` Aleksa Sarai
2025-09-24  8:52       ` Alejandro Colomar
2025-09-19  1:59 ` [PATCH v4 05/10] man/man2/fsmount.2: " Aleksa Sarai
2025-09-21 23:08   ` Askar Safin
2025-09-22  1:10     ` Askar Safin
2025-09-19  1:59 ` [PATCH v4 06/10] man/man2/move_mount.2: " Aleksa Sarai
2025-09-19  1:59 ` [PATCH v4 07/10] man/man2/open_tree.2: " Aleksa Sarai
2025-09-21  9:47   ` Alejandro Colomar
2025-09-22 10:09     ` Aleksa Sarai
2025-09-22 13:22       ` Alejandro Colomar
2025-09-24  1:34         ` Aleksa Sarai
2025-09-30 12:32     ` G. Branden Robinson
2025-10-02 22:20       ` Alejandro Colomar
2025-09-19  1:59 ` [PATCH v4 08/10] man/man2/mount_setattr.2: mirror opening sentence from fsopen(2) Aleksa Sarai
2025-09-21  9:50   ` Alejandro Colomar
2025-09-21 10:03     ` Aleksa Sarai
2025-09-21 10:19       ` Alejandro Colomar
2025-09-19  1:59 ` [PATCH v4 09/10] man/man2/open_tree{,_attr}.2: document new open_tree_attr() API Aleksa Sarai
2025-09-21  9:55   ` Alejandro Colomar
2025-09-24  6:31     ` Aleksa Sarai
2025-09-24  8:51       ` Alejandro Colomar
2025-09-19  1:59 ` [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction Aleksa Sarai
2025-09-19 12:23   ` Aleksa Sarai
2025-09-21  9:59   ` Alejandro Colomar
2025-09-22  1:06   ` Askar Safin
2025-09-21  2:43 ` [PATCH v4 00/10] man2: document "new" mount API Askar Safin
2025-09-21  9:58   ` Aleksa Sarai
2025-09-24 11:11     ` Aleksa Sarai

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox