public inbox for linux-btrfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Christian Brauner <brauner@kernel.org>
To: Josef Bacik <josef@toxicpanda.com>
Cc: Qu Wenruo <quwenruo.btrfs@gmx.com>,
	Christoph Hellwig <hch@infradead.org>,
	Amir Goldstein <amir73il@gmail.com>, Jan Kara <jack@suse.cz>,
	Chris Mason <clm@fb.com>, David Sterba <dsterba@suse.com>,
	linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 0/3] fanotify support for btrfs sub-volumes
Date: Thu, 2 Nov 2023 10:48:35 +0100	[thread overview]
Message-ID: <20231102-ankurbeln-eingearbeitet-cbeb018bfedc@brauner> (raw)
In-Reply-To: <20231102051349.GA3292886@perftesting>

> We'll be converted to the new mount API tho, so I suppose that's something.
> Thanks,

Just in case you forgot about it. I did send a patch to convert btrfs to
the new mount api in June:

https://lore.kernel.org/all/20230626-fs-btrfs-mount-api-v1-0-045e9735a00b@kernel.org

Can I ask you to please please copy just two things from that series:

(1) Please get rid of the second filesystems type.
(2) Please fix the silent remount behavior when mounting a subvolume.

You might need my first patch for that from that series for (2).

+static int btrfs_get_tree_common(struct fs_context *fc)
+{
+	struct vfsmount *root_mnt = NULL;
+	struct fs_context *root_fc;
+	struct dentry *root_dentry;
+	struct btrfs_fs_context *ctx = fc->fs_private;
+	int ret;
+
+	if (WARN_ON(ctx->phase != BTRFS_FS_CONTEXT_PREPARE))
+		return -EINVAL;
+
+	root_fc = vfs_dup_fs_context(fc);
+	if (IS_ERR(root_fc))
+		return PTR_ERR(root_fc);
+
+	/*
+	 * We've duplicated the security mount options above and we only
+	 * need them to be set when we really create a new superblock.
+	 * They're irrelevant when we mount the subvolume as the
+	 * superblock does already exist at that point. So free the
+	 * security blob here.
+	 */
+	security_free_mnt_opts(&fc->security);
+	fc->security = NULL;
+
+	/* Create the superblock so we can mount a subtree later. */
+	ctx->phase = BTRFS_FS_CONTEXT_SUPER;
+
+	root_mnt = fc_mount(root_fc);
+	if (PTR_ERR_OR_ZERO(root_mnt) == -EBUSY) {
+		bool ro2rw = !(root_fc->sb_flags & SB_RDONLY);
+
+		if (ro2rw)
+			root_fc->sb_flags |= SB_RDONLY;
+		else
+			root_fc->sb_flags &= ~SB_RDONLY;
+
+		root_mnt = fc_mount(root_fc);
+		if (IS_ERR(root_mnt)) {
+			put_fs_context(root_fc);
+			return PTR_ERR(root_mnt);
+		}
+		ctx->root_mnt = root_mnt;
+
+		/*
+		 * Ever since commit 0723a0473fb4 ("btrfs: allow
+		 * mounting btrfs subvolumes with different ro/rw
+		 * options") the following works:
+		 *
+		 *        (i) mount /dev/sda3 -o subvol=foo,ro /mnt/foo
+		 *       (ii) mount /dev/sda3 -o subvol=bar,rw /mnt/bar
+		 *
+		 * which looks nice and innocent but is actually pretty
+		 * intricate and deserves a long comment.
+		 *
+		 * On another filesystem a subvolume mount is close to
+		 * something like:
+		 *
+		 *	(iii) # create rw superblock + initial mount
+		 *	      mount -t xfs /dev/sdb /opt/
+		 *
+		 *	      # create ro bind mount
+		 *	      mount --bind -o ro /opt/foo /mnt/foo
+		 *
+		 *	      # unmount initial mount
+		 *	      umount /opt
+		 *
+		 * Of course, there's some special subvolume sauce and
+		 * there's the fact that the sb->s_root dentry is really
+		 * swapped after mount_subtree(). But conceptually it's
+		 * very close and will help us understand the issue.
+		 *
+		 * The old mount api didn't cleanly distinguish between
+		 * a mount being made ro and a superblock being made ro.
+		 * The only way to change the ro state of either object
+		 * was by passing MS_RDONLY. If a new mount was created
+		 * via mount(2) such as:
+		 *
+		 *      mount("/dev/sdb", "/mnt", "xfs", MS_RDONLY, NULL);
+		 *
+		 * the MS_RDONLY flag being specified had two effects:
+		 *
+		 * (1) MNT_READONLY was raised -> the resulting mount
+		 *     got @mnt->mnt_flags |= MNT_READONLY raised.
+		 *
+		 * (2) MS_RDONLY was passed to the filesystem's mount
+		 *     method and the filesystems made the superblock
+		 *     ro. Note, how SB_RDONLY has the same value as
+		 *     MS_RDONLY and is raised whenever MS_RDONLY is
+		 *     passed through mount(2).
+		 *
+		 * Creating a subtree mount via (iii) ends up leaving a
+		 * rw superblock with a subtree mounted ro.
+		 *
+		 * But consider the effect on the old mount api on btrfs
+		 * subvolume mounting which combines the distinct step
+		 * in (iii) into a a single step.
+		 *
+		 * By issuing (i) both the mount and the superblock are
+		 * turned ro. Now when (ii) is issued the superblock is
+		 * ro and thus even if the mount created for (ii) is rw
+		 * it wouldn't help. Hence, btrfs needed to transition
+		 * the superblock from ro to rw for (ii) which it did
+		 * using an internal remount call (a bold choice...).
+		 *
+		 * IOW, subvolume mounting was inherently messy due to
+		 * the ambiguity of MS_RDONLY in mount(2). Note, this
+		 * ambiguity has mount(8) always translate "ro" to
+		 * MS_RDONLY. IOW, in both (i) and (ii) "ro" becomes
+		 * MS_RDONLY when passed by mount(8) to mount(2).
+		 *
+		 * Enter the new mount api. The new mount api
+		 * disambiguates making a mount ro and making a
+		 * superblock ro.
+		 *
+		 * (3) To turn a mount ro the MOUNT_ATTR_RDONLY flag can
+		 *     be used with either fsmount() or mount_setattr().
+		 *     This is a pure VFS level change for a specific
+		 *     mount or mount tree that is never seen by the
+		 *     filesystem itself.
+		 *
+		 * (4) To turn a superblock ro the "ro" flag must be
+		 *     used with fsconfig(FSCONFIG_SET_FLAG, "ro"). This
+		 *     option is seen by the filesytem in fc->sb_flags.
+		 *
+		 * This disambiguation has rather positive consequences.
+		 * Mounting a subvolume ro will not also turn the
+		 * superblock ro. Only the mount for the subvolume will
+		 * become ro.
+		 *
+		 * So, if the superblock creation request comes from the
+		 * new mount api the caller must've explicitly done:
+		 *
+		 *      fsconfig(FSCONFIG_SET_FLAG, "ro")
+		 *      fsmount/mount_setattr(MOUNT_ATTR_RDONLY)
+		 *
+		 * IOW, at some point the caller must have explicitly
+		 * turned the whole superblock ro and we shouldn't just
+		 * undo it like we did for the old mount api. In any
+		 * case, it lets us avoid this nasty hack in the new
+		 * mount api.
+		 *
+		 * Consequently, the remounting hack must only be used
+		 * for requests originating from the old mount api and
+		 * should be marked for full deprecation so it can be
+		 * turned off in a couple of years.
+		 *
+		 * The new mount api has no reason to support this hack.
+		 */
+		if (root_fc->oldapi && ro2rw) {
+			/*
+			 * This magic internal remount is a pretty bold
+			 * move as the VFS reserves the right to protect
+			 * ro->rw transitions on the VFS layer similar
+			 * to how it protects rw->ro transitions.
+			 */
+			ret = btrfs_legacy_reconfigure(root_fc);
+			if (ret)
+				root_mnt = ERR_PTR(ret);
+		}
+	}
+	put_fs_context(root_fc);
+	if (IS_ERR(root_mnt))
+		return PTR_ERR(root_mnt);
+	ctx->root_mnt = root_mnt;
+
+	root_dentry = mount_subvol(fc);
+	if (IS_ERR(root_dentry))
+		return PTR_ERR(root_dentry);
+
+	fc->root = root_dentry;
+	return 0;
+}

  parent reply	other threads:[~2023-11-02  9:48 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-25 13:50 [PATCH 0/3] fanotify support for btrfs sub-volumes Amir Goldstein
2023-10-25 13:50 ` [PATCH 1/3] fs: define a new super operation to get fsid Amir Goldstein
2023-10-25 13:50 ` [PATCH 2/3] btrfs: implement " Amir Goldstein
2023-10-25 13:50 ` [PATCH 3/3] fanotify: support reporting events with fid on btrfs sub-volumes Amir Goldstein
2023-10-25 15:34 ` [PATCH 0/3] fanotify support for " Christoph Hellwig
2023-10-25 17:04   ` Jan Kara
2023-10-27  5:44     ` Christoph Hellwig
2023-10-27 10:58       ` Jan Kara
2023-10-25 21:06   ` Josef Bacik
2023-10-25 23:02     ` Qu Wenruo
2023-10-26  5:49       ` Amir Goldstein
2023-10-27  5:46     ` Christoph Hellwig
     [not found]       ` <20231027131726.GA2915471@perftesting>
2023-10-28  5:57         ` Amir Goldstein
2023-10-30 13:25         ` Christoph Hellwig
2023-10-31 12:14           ` Christian Brauner
2023-10-31 12:22             ` Christoph Hellwig
2023-10-31 12:50               ` Christian Brauner
2023-10-31 17:06                 ` Christoph Hellwig
2023-11-01  0:03                   ` Qu Wenruo
2023-11-03 14:21                     ` Christoph Hellwig
2023-11-01  8:16                   ` Christian Brauner
2023-11-01  8:41                     ` Qu Wenruo
2023-11-01  9:52                       ` Christian Brauner
2023-11-02  5:13                         ` Josef Bacik
2023-11-02  8:53                           ` Amir Goldstein
2023-11-02  9:48                           ` Christian Brauner [this message]
2023-11-02 12:34                             ` Josef Bacik
2023-11-02 17:07                               ` David Sterba
2023-11-02 20:32                                 ` Josef Bacik
2023-11-03  6:56                                 ` Christian Brauner
2023-11-03 13:52                                   ` Josef Bacik
2023-11-02 11:07                           ` Christian Brauner
2023-11-03 14:28                             ` Christoph Hellwig
2023-11-03 15:47                               ` Christian Brauner
2023-11-06  7:53                                 ` Christoph Hellwig
2023-11-06  8:18                                   ` Qu Wenruo
2023-11-06  9:56                                     ` Christian Brauner
2023-11-06 12:25                                     ` Christoph Hellwig
2023-11-06 10:03                                   ` Christian Brauner
2023-11-06 10:41                                     ` Qu Wenruo
2023-11-06 10:59                                       ` Christian Brauner
2023-11-06 12:30                                         ` Christoph Hellwig
2023-11-06 13:05                                           ` Christian Brauner
2023-11-06 17:10                                             ` Christoph Hellwig
2023-11-07  8:58                                               ` Christian Brauner
2023-11-08  7:56                                                 ` Christoph Hellwig
2023-11-08  8:09                                                   ` Christian Brauner
2023-11-08  8:12                                                     ` Christoph Hellwig
2023-11-08  8:22                                                       ` Christian Brauner
2023-11-08 14:07                                                         ` Christoph Hellwig
2023-11-08 15:57                                                           ` Christian Brauner
2023-11-06 12:29                                     ` Christoph Hellwig
2023-11-06 13:47                                       ` Christian Brauner
2023-11-06 17:13                                         ` Christoph Hellwig
2023-11-06 22:42                                           ` Josef Bacik
2023-11-07  9:06                                             ` Christian Brauner
2023-11-08  7:52                                               ` Christoph Hellwig
2023-11-08  8:27                                                 ` Christian Brauner
2023-11-08 14:08                                                   ` Christoph Hellwig
2023-11-08 16:16                                                     ` Christian Brauner
2023-11-08 16:20                                                       ` Christian Brauner
2023-11-09  6:55                                                         ` Christoph Hellwig
2023-11-09  9:07                                                           ` Christian Brauner
2023-11-09 14:41                                                             ` Christoph Hellwig
2023-11-10  9:33                                                               ` Christian Brauner
2023-11-10 10:31                                                                 ` Amir Goldstein
2023-11-09  6:53                                                       ` Christoph Hellwig
2023-11-08  7:51                                             ` Christoph Hellwig
2023-11-08 11:08                                               ` Jan Kara
2023-11-08 14:11                                                 ` Christoph Hellwig
2023-11-06  9:03                                 ` Jan Kara
2023-11-06  9:52                                   ` Christian Brauner
2023-11-06 12:22                                     ` Jan Kara
2023-11-03 14:23                       ` Christoph Hellwig
2023-11-03 14:22                     ` Christoph Hellwig
2023-10-25 17:17 ` Amir Goldstein
2023-10-25 18:02   ` Amir Goldstein
2023-10-26 12:17     ` Jan Kara
2023-10-26 12:36       ` Amir Goldstein

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231102-ankurbeln-eingearbeitet-cbeb018bfedc@brauner \
    --to=brauner@kernel.org \
    --cc=amir73il@gmail.com \
    --cc=clm@fb.com \
    --cc=dsterba@suse.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox