From: Luis Chamberlain <mcgrof@kernel.org>
To: Menglong Dong <menglong8.dong@gmail.com>
Cc: Jan Kara <jack@suse.cz>, Jens Axboe <axboe@kernel.dk>,
hare@suse.de, gregkh@linuxfoundation.org, tj@kernel.org,
Menglong Dong <dong.menglong@zte.com.cn>,
song@kernel.org, NeilBrown <neilb@suse.de>,
Andrew Morton <akpm@linux-foundation.org>,
wangkefeng.wang@huawei.com, f.fainelli@gmail.com, arnd@arndb.de,
Barret Rhoden <brho@google.com>,
Rasmus Villemoes <linux@rasmusvillemoes.dk>,
mhiramat@kernel.org, Steven Rostedt <rostedt@goodmis.org>,
Kees Cook <keescook@chromium.org>,
vbabka@suse.cz, Alexander Potapenko <glider@google.com>,
pmladek@suse.com, Chris Down <chris@chrisdown.name>,
ebiederm@xmission.com, jojing64@gmail.com,
LKML <linux-kernel@vger.kernel.org>,
palmerdabbelt@google.com, linux-fsdevel@vger.kernel.org,
Alexander Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH RESEND] init/initramfs.c: make initramfs support pivot_root
Date: Mon, 24 May 2021 22:58:27 +0000 [thread overview]
Message-ID: <20210524225827.GA4332@42.do-not-panic.com> (raw)
In-Reply-To: <CADxym3Z7bdEJECEejPqg-15ycghgX3ZEmOGWYwxZ1_HPWLU1NA@mail.gmail.com>
On Sat, May 22, 2021 at 12:09:30PM +0800, Menglong Dong wrote:
> On Fri, May 21, 2021 at 11:50 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
> >
> > > That's a solution, but I don't think it is feasible. Users may create many
> > > containers, and you can't make docker create all the containers first
> > > and create network namespace later, as you don't know if there are any
> > > containers to create later.
> >
> > It doesn't seem impossible, but worth noting why inside the commit log
> > this was not a preferred option.
> >
>
> In fact, the network namespace is just a case for the problem that the
> 'mount leak' caused. And this kind modification is not friendly to
> current docker users, it makes great changes to the usage of docker.
You mean an upgrade of docker? If so... that does not seem like a
definitive reason to do something new in the kernel *always*.
However, if you introduce it as a kconfig option so that users
who want to use this new feature can enable it, and then use it,
the its sold as a new feature.
Should this always be enabled, or done this way? Should we never have
the option to revert back to the old behaviour? If not, why not?
> > We still have:
> >
> > start_kernel() --> vfs_caches_init() --> mnt_init() -->
> >
> > mnt_init()
> > {
> > ...
> > shmem_init();
> > init_rootfs();
> > init_mount_tree();
> > }
> >
> > You've now modified init_rootfs() to essentially just set the new user_root,
> > and that's it. But we stil call init_mount_tree() even if we did set the
> > rootfs to use tmpfs.
>
> The variate of 'is_tmpfs' is only used in 'rootfs_init_fs_context'. I used
> ramfs_init_fs_context directly for rootfs,
I don't see you using any context directly, where are you specifying the
context directly?
> so it is not needed any more
> and I just removed it in init_rootfs().
>
> The initialization of 'user_root' in init_rootfs() is used in:
> do_populate_rootfs -> mount_user_root, which set the file system(
> ramfs or tmpfs) of the second mount.
>
> Seems it's not suitable to place it in init_rootfs()......
OK I think I just need to understand how you added the context of the
first mount explicitly now and where, as I don't see it.
> > > In do_populate_ro
> > > tmpfs, and that's the real rootfs for initramfs. And I call this root
> > > as 'user_root',
> > > because it is created for user space.
> > >
> > > int __init mount_user_root(void)
> > > {
> > > return do_mount_root(user_root->dev_name,
> > > user_root->fs_name,
> > > root_mountflags,
> > > root_mount_data);
> > > }
> > >
> > > In other words, I moved the realization of 'rootfs_fs_type' here to
> > > do_populate_rootfs(), and fixed this 'rootfs_fs_type' with
> > > ramfs_init_fs_context, as it is a fake root now.
> >
> > do_populate_rootfs() is called from populate_rootfs() and that in turn
> > is a:
> >
> > rootfs_initcall(populate_rootfs);
> >
> > In fact the latest changes have made this to schedule asynchronously as
> > well. And so indeed, init_mount_tree() always kicks off first. So its
> > still unclear to me why the first mount now always has a fs context of
> > ramfs_init_fs_context, even if we did not care for a ramdisk.
> >
> > Are you suggesting it can be arbitrary now?
>
> With the existence of the new user_root, the first mount is not directly used
> any more, so the filesystem type of it doesn't matter.
What do you mean? init_mount_tree() is always called, and it has
statically:
static void __init init_mount_tree(void)
{
struct vfsmount *mnt;
...
mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", NULL);
...
}
And as I noted, this is *always* called earlier than
do_populate_rootfs(). Your changes did not remove the init_mount_tree()
or modify it, and so why would the context of the above call always
be OK to be used now with a ramfs context now?
> So it makes no sense to make the file system of the first mount selectable.
Why? I don't see why, nor is it explained, we're always caling
vfs_kern_mount(&rootfs_fs_type, ...) and you have not changed that
either.
> To simplify the code here, I make it ramfs_init_fs_context directly. In fact,
> it's fine to make it shmen_init_fs_context here too.
So indeed you're suggesting its arbitrary now.... I don't see why.
Luis
next prev parent reply other threads:[~2021-05-24 22:58 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-05-20 15:42 [PATCH RESEND] init/initramfs.c: make initramfs support pivot_root menglong8.dong
2021-05-20 21:41 ` Luis Chamberlain
2021-05-21 0:41 ` Menglong Dong
2021-05-21 15:50 ` Luis Chamberlain
2021-05-22 4:09 ` Menglong Dong
2021-05-24 22:58 ` Luis Chamberlain [this message]
2021-05-25 0:55 ` Menglong Dong
2021-05-25 1:43 ` Luis Chamberlain
2021-05-25 6:09 ` Menglong Dong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210524225827.GA4332@42.do-not-panic.com \
--to=mcgrof@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=arnd@arndb.de \
--cc=axboe@kernel.dk \
--cc=brho@google.com \
--cc=chris@chrisdown.name \
--cc=dong.menglong@zte.com.cn \
--cc=ebiederm@xmission.com \
--cc=f.fainelli@gmail.com \
--cc=glider@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=hare@suse.de \
--cc=jack@suse.cz \
--cc=jojing64@gmail.com \
--cc=keescook@chromium.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux@rasmusvillemoes.dk \
--cc=menglong8.dong@gmail.com \
--cc=mhiramat@kernel.org \
--cc=neilb@suse.de \
--cc=palmerdabbelt@google.com \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=song@kernel.org \
--cc=tj@kernel.org \
--cc=vbabka@suse.cz \
--cc=viro@zeniv.linux.org.uk \
--cc=wangkefeng.wang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.