From: Al Viro <viro@zeniv.linux.org.uk>
To: linux-fsdevel@vger.kernel.org
Cc: "Eric W. Biederman" <ebiederm@xmission.com>,
Christian Brauner <brauner@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [RFC] separate the internal mount flags from the rest
Date: Tue, 3 Jun 2025 21:47:04 +0100 [thread overview]
Message-ID: <20250603204704.GB299672@ZenIV> (raw)
Currently we use ->mnt_flags for all kinds of things, including
the ones only fs/{namespace,pnode}.c care about.
That wouldn't be a problem if not for the locking. ->mnt_flags is
protected by mount_lock. All writers MUST grab that. Having lockless
readers is unsurprising - after all, for something like noexec we want
the current state of mount, whatever it happens to be. If userland
remounts something noexec and that races with execve(2), there's nothing
to be done - it is a race, but not the kernel one.
However, for a bunch of flags we rely upon the fact that all
changes in one of those are always done under namespace_sem (exclusive).
It's not a lockless read - we do depend upon namespace_sem (shared) being
sufficient to stabilize those particular bits. MNT_SHARED is one such.
Note that this flag is a part of propagation graph representation and
namespace_sem is what protects the entire thing, so setting or clearing
it under mount_lock alone would be a very bad idea.
Since MNT_SHARED sits in ->mnt_flags, we have to take mount_lock to set or
clear it, leading to places like this:
if (IS_MNT_SHARED(from)) {
to->mnt_group_id = from->mnt_group_id;
list_add(&to->mnt_share, &from->mnt_share);
lock_mount_hash();
set_mnt_shared(to);
unlock_mount_hash();
}
Non-empty ->mnt_share on a mount without MNT_SHARED would be a hard bug;
these changes belong together. Incidentally, lock_mount_hash() is an
overkill here - read_seqlock_excl(&mount_lock) would be better...
The rules would be easier to follow if we took MNT_SHARED, MNT_UNBINDABLE,
MNT_MARKED and possibly MNT_LOCKED and MNT_LOCK_* to separate field, protected
by namespace_sem alone (MNT_LOCKED - depending upon how the path_parent() mess
settles).
Yes, it's 4 bytes added into struct mount. However, this
int mnt_id; /* mount identifier, reused */
u64 mnt_id_unique; /* mount ID unique until reboot */
int mnt_group_id; /* peer group identifier */
int mnt_expiry_mark; /* true if marked for expiry */
is preceded and followed by a pointer, so we already have a gap there,
_and_ there are other pending changes that kill ->mnt_umounting. So the
size of struct mount won't grow even on 32bit and would actually go down
on 64bit.
Objections? The thing I really want is clear locking rules for that
stuff. Reduced contention on mount_lock and fewer increments of its
seqcount component wouldn't hurt either, but that's secondary...
PS: despite being strictly namespace.c-internal, two flags must stay in
->mnt_flags - MNT_DOOMED and MNT_SYNC_UMOUNT - __legitimize_mnt() needs
them there. Could be folded together with a bit of massage, though, but
that's a separate story...
next reply other threads:[~2025-06-03 20:47 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-03 20:47 Al Viro [this message]
2025-06-03 23:17 ` [RFC] separate the internal mount flags from the rest Linus Torvalds
2025-06-04 7:36 ` Christian Brauner
2025-06-07 11:48 ` Eric W. Biederman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250603204704.GB299672@ZenIV \
--to=viro@zeniv.linux.org.uk \
--cc=brauner@kernel.org \
--cc=ebiederm@xmission.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.