From: ebiederm@xmission.com (Eric W. Biederman)
To: Menglong Dong <menglong8.dong@gmail.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Kees Cook <keescook@chromium.org>,
	Sami Tolvanen <samitolvanen@google.com>,
	ojeda@kernel.org, johan@kernel.org,
	Bjorn Helgaas <bhelgaas@google.com>,
	masahiroy@kernel.org, Menglong Dong <dong.menglong@zte.com.cn>,
	joe@perches.com, Jens Axboe <axboe@kernel.dk>,
	hare@suse.de, Jan Kara <jack@suse.cz>,
	tj@kernel.org, gregkh@linuxfoundation.org, song@kernel.org,
	NeilBrown <neilb@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	f.fainelli@gmail.com, arnd@arndb.de,
	Rasmus Villemoes <linux@rasmusvillemoes.dk>,
	wangkefeng.wang@huawei.com, Barret Rhoden <brho@google.com>,
	mhiramat@kernel.org, Steven Rostedt <rostedt@goodmis.org>,
	vbabka@suse.cz, Alexander Potapenko <glider@google.com>,
	pmladek@suse.com, Chris Down <chris@chrisdown.name>,
	jojing64@gmail.com, terrelln@fb.com, geert@linux-m68k.org,
	mingo@kernel.org, linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	jeyu@kernel.org
Subject: Re: [PATCH v2 2/3] init/do_cmounts.c: introduce 'user_root' for initramfs
Date: Tue, 25 May 2021 22:23:09 -0500
Message-ID: <m11r9umb4y.fsf@fess.ebiederm.org>
In-Reply-To: <CADxym3a5nsuw2hiDF=ZS51Wpjs-i_VW+OGd-sgGDVrKYw2AiHQ@mail.gmail.com> (Menglong Dong's message of "Wed, 26 May 2021 09:51:22 +0800")

Menglong Dong <menglong8.dong@gmail.com> writes:

> On Wed, May 26, 2021 at 2:50 AM Eric W. Biederman <ebiederm@xmission.com> wrote:
>>
> ......
>>
>> What is the flow where docker uses an initramfs?
>>
>> Just thinking about this I am not able to connect the dots.
>>
>> The way I imagine the world is that an initramfs will be used either
>> when a linux system boots for the first time, or an initramfs would
>> come from the distribution you are running inside a container.  In
>> neither case do I see docker being in a position to add functionality
>> to the initramfs as docker is not responsible for it.
>>
>> Is docker doing something creative like running a container in a VM,
>> and running some directly out of the initramfs, and wanting that code
>> to exactly match the non-VM case?
>>
>> If that is the case I think the easy solution would be to actually use
>> an actual ramdisk where pivot_root works.
>
> In fact, nowadays, initramfs is widely used by embedded devices in the
> production environment, which makes the whole system run in ram.
>
> That makes sense. First, running in ram will speed up the system. The size
> of the system won't be too large for embedded devices, which makes this
> idea work. Second, this will reduce the I/O of disk devices, which can
> extend the life of the disk. Third, RAM is getting cheaper.
>
> So in this scene, Docker runs directly in initramfs.

That is the piece of the puzzle I was missing.  A small system
with its root in an initramfs.

>> I really don't see why it makes sense for docker to be a special
>> snowflake and require kernel features that no other distribution does.
>>
>> It might make sense to create a completely empty filesystem underneath
>> an initramfs, and use that new rootfs as the unchanging root of the
>> mount tree, if it can be done with a trivial amount of code, and
>> generally make everything cleaner.
>>
>> As this change sits it looks like a lot of code to handle a problem
>> in the implementation of docker.   Which quite frankly will be a pain
>> to have to maintain if this is not a clean general feature that
>> other people can also use.
>>
>
> I don't think that it's all for docker, pivot_root may be used by other
> users in the above scene. It may work to create an empty filesystem, as you
> mentioned above. But I don't think it's a good idea to make all users,
> who want to use pivot_root, do that. After all, it's not friendly to
> users.
>
> As for the code, it may look like a lot, but it's not complex. Maybe a clean
> up for the code I add can make it better?

If we are going to do this, it needs to be something so small and
clean that it can be done unconditionally, always.

I will see if I can dig in and look a little more.  I think there is
a reason Al Viro and H. Peter Anvin implemented initramfs this way.
Perhaps it was just a desire to make pivot_root unnecessary.

Container filesystem setup does throw a bit of a wrench in the works:
unlike an initramfs, where you can just delete everything, there is not
a clean way to get rid of a root filesystem you don't need without
pivot_root.


The net request as I understand it: Make the filesystem the initramfs
lives in be an ordinary filesystem so it can just be used as the system's
primary filesystem.

There might be technical reasons why that is a bad idea and userspace
would be requested to move everything into another ramfs manually (which
would have the same effect).  But it is worth taking a good look to see
if it can be accomplished cleanly.
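
For illustration only, the "move everything into another ramfs manually"
option could look roughly like the sketch below.  This is not code from
the patch series.  The helper names mount_new_root() and
switch_to_new_root() are made up for this example, the mount point is
assumed to exist already, and copying the files from rootfs into the
new tmpfs between the two steps is left out.  It is the same
MS_MOVE-plus-chroot sequence that switch_root style tools use.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mount.h>
#include <unistd.h>

/*
 * Hypothetical helper, step 1: mount a fresh tmpfs at new_root.
 * The directory must already exist.  The caller is expected to copy
 * the system's files into it before calling switch_to_new_root().
 */
int mount_new_root(const char *new_root)
{
	if (mount("tmpfs", new_root, "tmpfs", 0, "mode=0755") < 0) {
		perror("mount tmpfs");
		return -1;
	}
	return 0;
}

/*
 * Hypothetical helper, step 2: stack new_root on top of "/" and make
 * it the root of the calling process.  rootfs itself stays underneath
 * as the bottom of the mount tree; it is hidden, not unmounted.
 */
int switch_to_new_root(const char *new_root)
{
	if (chdir(new_root) < 0) {
		perror("chdir new_root");
		return -1;
	}
	/* Move the mount we are standing in over the current root. */
	if (mount(".", "/", NULL, MS_MOVE, NULL) < 0) {
		perror("mount MS_MOVE");
		return -1;
	}
	/* Point this process's root at the moved mount. */
	if (chroot(".") < 0) {
		perror("chroot");
		return -1;
	}
	if (chdir("/") < 0) {
		perror("chdir /");
		return -1;
	}
	return 0;
}
```

Both helpers need CAP_SYS_ADMIN (and CAP_SYS_CHROOT for the second) and
return -1 on failure.  After the second step the process root is an
ordinary tmpfs mount with rootfs as its parent, so a later pivot_root(2)
from that root should no longer hit the rootfs restriction.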

Eric
