From: Miklos Szeredi <mszeredi@redhat.com>
To: David Howells <dhowells@redhat.com>
Cc: viro <viro@zeniv.linux.org.uk>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
linux-nfs@vger.kernel.org, lkml <linux-kernel@vger.kernel.org>,
Jeff Layton <jlayton@redhat.com>
Subject: Re: [PATCH 03/14] VFS: Implement a filesystem superblock creation/configuration context [ver #6]
Date: Fri, 27 Oct 2017 11:24:21 +0200 [thread overview]
Message-ID: <CAOssrKdm-znRnR80rxvzH8WVWf=sEkL1EGMzV4c3dyp_FK2-rA@mail.gmail.com> (raw)
In-Reply-To: <28791.1509035055@warthog.procyon.org.uk>
On Thu, Oct 26, 2017 at 6:24 PM, David Howells <dhowells@redhat.com> wrote:
> Miklos Szeredi <mszeredi@redhat.com> wrote:
>
>> > +/**
>> > + * vfs_parse_mount_option - Add a single mount option to a superblock config
>>
>> Mount options are those that refer to the mount
>> (nosuid,nodev,noatime,etc..); this function is not parsing those,
>> AFAICT.
>
> I'd quibble that those are "mountpoint" options, not "mount" options. Mount
> options are the options you can pass to mount and are a mixed bag. But fair
> enough, it's probably worth avoiding such terminology where we can.
>
>> How about vfs_parse_fs_option()?
>
> Sure. Can we please not rename it again?
I promise ;)
Also, how about moving calls to vfs_parse_fs_option() into filesystem
code? Even those options are not generic, some filesystem wants
this, some that. It's just a historical accident that those are set
with MS_FOO and not "foo". Filesystems that don't have any option
parsing could have a generic version that deals with "ro/rw", the
others can handle these options along with the rest.
>
>> We probably also need a "reset" type of option that clears all bits
>> and is also passed onto the filesystem's parsing routine so it can
>> reset all options as well.
>
> Reset what? To what? To a blank slate? To the state on the medium? What if
> it's a netfs?
>
> This operation isn't well defined and I'm not sure it's useful because:
>
> (1) Unless we can preload options from some source, the starting context is
> blank, so why do you need a reset on a new mount?
Reset only makes sense in the context of reconfig (fka. remount).
> (2) We need to find out what state the options are currently in. Reset today
> doesn't necessarily mean the same as reset tomorrow.
>
> (3) Not all options are simple on/off switches. Some of them are multistate,
> some are strings/numbers that have non-zero defaults and some have
> dependencies on other options.
>
> (4) Not all options can be simply reset to "0", particularly if the
> filesystem is live. Take an option that points to a network server or a
> separate journalling device for example.
I'd think reset restores the state to a default. Default state is
what a new fs instance without any specified options starts out with.
Yes it's different today and different tomorrow. Yes, some options
are not binary. Yes, some options are not mutable on a live
filesystem. Regardless of those, I think it makes sense to allow a
reconfig that results in a configuration that would have been reached
by setting the same options on a new mount. I.e. have a "replace
configuration" as well as a "change bits of current configuration"
mode.
But lets leave to later if it's not something trivial.
>> 1/a) New sb:
>> 1/b) New sb for legacy mount(2)
>
> Looking at this in terms of ext4, I would make the parser create an "option
> change" script prior to loading the superblock. The reason for that with ext4
> is that ext4 stores an additional option string that must be parsed and
> applied first - except that we potentially need some of the mount-supplied
> options to be able to mount the fs.
>
> So in the new-mount-of-new-sb case, I would actually create two scripts, one
> for the options written to the context fd, then one for the on-disk script,
> then validate the context and then apply them both atomically.
Agree.
>
>> 2/a) Shared sb:
>> 2/b) Shared sb for legacy mount(2)
>
> In the new-mount-of-live-sb case, I would validate the context script and
> ignore any options that try to change things that can't be changed because the
> fs is live.
Your sentence seems to imply that we do change those that can be
changed. That's not what legacy does, it ignores *all* options
(except rw/ro for which it errors out on mismatch). I don't think
that's a nice behavior, but we definitely need to keep it for legacy.
For non-legacy, do we want to extend the "error out on mismatch"
behavior to all options, rather than ignoring them?
> It might be nice to report them also, but that requires a mechanism to do so.
>
>> 3/a) Reconfig
>> 3/b) Reconfig for legacy mount(2) (i.e. MS_REMOUNT)
>
> In the reconfigure case, I only need to create one script, validate it and
> then apply it atomically (well, as atomically as possible, given the fs is
> actually live at this point).
Yep.
>
> There's the question of how far you allow a happens-to-share mount to effect a
> reconfigure. Seems a reasonable distinction to say that in your case 2 you
> just ignore conflicts but possibly warn or reject in case 3.
Not sure I understand why we'd want to ignore conflicts in case 2 and
not in 3. Can we not have consistency (error out on all conflicts)?
>> > +int generic_parse_monolithic(struct fs_context *ctx, void *data)
>> > +{
>> > + char *options = data, *p;
>> > + int ret;
>> > +
>> > + if (!options)
>> > + return 0;
>> > +
>> > + while ((p = strsep(&options, ",")) != NULL) {
>> > + if (*p) {
>> > + ret = vfs_parse_mount_option(ctx, p);
>>
>> Monolithic option block is the legacy thing.
>
> Yes, I know.
>
>> It shouldn't be parsing the common flags. It should instead be treating
>> them as forbidden (although it probably doesn't really matter, since no
>> filesystem will accept these anyway).
>
> Except that ext4, f2fs, 9p, ... do take at least some of them. I'm not sure
> whether they ever see them, but without auditing userspace, there's no way to
> know.
So moving possibly dead code to the level of VFS fixes things how?
Let filesystems deal with that crap and make sure they deal with it
only for legacy mount and not for the new, supposedly clean one.
Making it generic also possibly breaks uABI by allowing an option that
was rejected previously for some other fs.
>
>> So probably best to expand vfs_parse_mount_option() here and skip the
>> sb flag parsing part.
>
> You need to prove they are never seen here :-/
>
>> > + * @sb_flags: Superblock flags and op flags (such as MS_REMOUNT)
>>
>> I'm confused: MS_REMOUNT in sb_flags and FS_CONTEXT_FOR_REMOUNT in purpose?
>>
>> I hope that's just a stale comment, sb_flags should really be just the
>> superblock flags and not any op flags.
>
> Yeah - that's stale.
>
>> Also, can FS_CONTEXT_FOR_REMOUNT be renamed to ..._RECONFIG?
>
> If you really want ;-)
Yes. I think clean naming results in clean concepts in one's head,
which results in clean interfaces. Which is *the* purpose of this
exercise.
Thanks,
Miklos
next prev parent reply other threads:[~2017-10-27 9:24 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-10-06 15:49 [PATCH 00/14] VFS: Introduce filesystem context [ver #6] David Howells
2017-10-06 15:49 ` [PATCH 01/14] VFS: Introduce the structs and doc for a " David Howells
2017-10-06 15:49 ` [PATCH 02/14] VFS: Add LSM hooks for " David Howells
2017-10-06 20:37 ` Randy Dunlap
2017-10-06 15:49 ` [PATCH 03/14] VFS: Implement a filesystem superblock creation/configuration " David Howells
2017-10-06 20:34 ` Randy Dunlap
2017-10-06 23:13 ` David Howells
2017-10-07 0:08 ` Randy Dunlap
2017-10-10 7:49 ` Miklos Szeredi
2017-10-10 15:24 ` David Howells
2017-10-26 16:24 ` David Howells
2017-10-27 9:24 ` Miklos Szeredi [this message]
2017-10-27 14:35 ` David Howells
2017-10-27 15:33 ` Miklos Szeredi
2017-10-27 16:03 ` David Howells
2017-10-30 8:44 ` Miklos Szeredi
2017-10-06 15:49 ` [PATCH 04/14] VFS: Remove unused code after filesystem context changes " David Howells
2017-10-06 15:49 ` [PATCH 05/14] VFS: Implement fsopen() to prepare for a mount " David Howells
2017-10-26 17:11 ` Jeff Layton
2017-10-26 19:01 ` Jeff Layton
2017-10-06 15:49 ` [PATCH 06/14] VFS: Implement fsmount() to effect a pre-configured " David Howells
2017-10-10 8:00 ` Miklos Szeredi
2017-10-10 9:51 ` Karel Zak
2017-10-10 13:38 ` Miklos Szeredi
2017-10-11 8:54 ` Karel Zak
2017-10-06 15:50 ` [PATCH 07/14] VFS: Add a sample program for fsopen/fsmount " David Howells
2017-10-26 17:21 ` Jeff Layton
2017-10-26 22:40 ` David Howells
2017-10-06 15:50 ` [PATCH 08/14] procfs: Move proc_fill_super() to fs/proc/root.c " David Howells
2017-10-06 15:50 ` [PATCH 09/14] proc: Add fs_context support to procfs " David Howells
2017-10-06 15:50 ` [PATCH 10/14] ipc: Convert mqueue fs to fs_context " David Howells
2017-10-06 15:50 ` [PATCH 11/14] cpuset: Use " David Howells
2017-10-06 15:50 ` [PATCH 12/14] kernfs, sysfs, cgroup, intel_rdt: Support " David Howells
2017-10-06 15:50 ` [PATCH 13/14] hugetlbfs: Convert to " David Howells
2017-10-06 15:50 ` [PATCH 14/14] VFS: Remove kern_mount_data() " David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAOssrKdm-znRnR80rxvzH8WVWf=sEkL1EGMzV4c3dyp_FK2-rA@mail.gmail.com' \
--to=mszeredi@redhat.com \
--cc=dhowells@redhat.com \
--cc=jlayton@redhat.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-nfs@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).