From: Tejun Heo <tj@kernel.org>
To: Pavel Emelyanov <xemul@parallels.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Andrey Vagin <avagin@parallels.com>,
James Bottomley <jbottomley@parallels.com>,
Glauber Costa <glommer@parallels.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@elte.hu>,
Dave Hansen <dave@linux.vnet.ibm.com>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Daniel Lezcano <dlezcano@fr.ibm.com>,
Alexey Dobriyan <adobriyan@gmail.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Oleg Nesterov <oleg@redhat.com>
Subject: Re: [patch 5/5] elf: Add support for loading ET_CKPT files
Date: Wed, 19 Oct 2011 11:22:28 -0700 [thread overview]
Message-ID: <20111019182228.GJ25124@google.com> (raw)
In-Reply-To: <4E9E9255.7090601@parallels.com>
Hello, Pavel.
On Wed, Oct 19, 2011 at 01:03:17PM +0400, Pavel Emelyanov wrote:
> > I don't think this is a good idea. We already have most of interface
> > necessary for restoring task state and there's no need to put it into
> > the kernel as one piece. If there's something which can't be done
> > from userland using existing interfaces, let's please discuss what
> > they are and whether they can be resolved differently first.
>
> The rest API we will require will look too special-purpose (like Cyrill
> mentioned the ability to set the mm's fields such as start_code, etc).
I find that quite difficult to agree with. We're talking about some
minor additions to prctl against updating exec path to do something it
was never designed to do + new binary format.
> Besides, putting a parasite code to restore the memory state will drive
> us into certain limitations. E.g. - what address range in the target task
> address space should the parasite occupy? What if we fail in finding one,
> how should we act next?
How is that different from snapshotting time? Unless the address is
completely packed, I can't see how that can be a problem. In the
worst case, you can use the parasite to map what's not overlapping and
then let the ptracer restore the overlapping areas after parasite is
done.
> > The exec path is very intricate as it is and it would be very easy to
> > introduce security or other bugs by altering its basic behavior.
>
> Can you elaborate a bit more on this? "Other" bugs can be introduced by
> any piece of code injected into the kernel, but what about security?
exec is a pretty important security boundary. A lot of resources (fd,
VMAs, credentails) have security-sensitive policies forced across exec
boundary and there are enough places which depends on the all
resetting property of exec (e.g. no other user of thread-group wide
resources).
> > exec presumes destruction of (most of) the current process and all its
> > other threads and replacing them w/ fresh states from an executable.
>
> This is *exactly* what is expected from the restoration process.
Umm.... so why does the patch skip zapping? And if you are gonna be
zapping, how are you gonna do multiple threads?
> > I see that you removed zapping to allow restoring multi-threaded
> > process, which seems quite scary to me. It might be okay, I don't
> > know, but IMHO it just isn't a very good idea to introduce such
> > variation to basic exec behavior for this rather narrow use case.
>
> Why? Can you describe your thought on this more, maybe I'm really missing
> smth important.
Because it breaks one of the very basic assumptions - there are no
other tasks in the thread group and thus resources usually shared
among threadgroup can be assumed to be exclusive.
> > In addition, leaving it alone doesn't really solve multi-threaded
> > restoring, does it?
>
> So your main concern is about multy-threaded tasks, right? I think we can
> prepare the exec handler capable to show how this can look like.
I'm strongly against that direction regardless of implementation
details.
> > The current code doesn't seem to be doing anything but if you're gonna add
> > it later, how are you gonna synchronize other threads?
>
> Synchronize what? If we're providing to a userspace a way to put things right
> it's up to userspace to use it correctly. Like we saw this wit pid namespaces -
> there's are plenty of ways how to kill the app with CLONE_NEWPID clone flag, but
> the intention of one is not the be 100% fool-proof, but to provide an ability
> to do what userspace need.
To shared kernel data structures.
> One of the abilities to restore multy-threaded task can be - clone threads from
> the userspace, then wait for them to prepare themselves the way they need it,
> then the threads go to sleep and the leader one calls execve. Thus the userspace
> state consistency is OK.
If you're gonna let userspace do it, why bother with in-kernel thing
at all?
> > * It's still calling exec_mmap(), so the exec'ing thread will be on
> > the new mm while other threads would still be on the old one.
>
> Why do you think it's a problem?
Ummm.... they aren't on the same address space? How can that be okay?
It's not only wrong from CR POV. The kernel disallows
CLONE_THREAD/SIGHAND w/o CLONE_VM and depend on that assumption,
you're breaking that.
> > * There are operations which assume that the calling task is
> > de-threaded and thus has exclusive access to data structures which
> > can be shared among the thread group (e.g. signal struct). How are
> > accesses to them synchronized?
>
> If we're talking about keeping the userspace state solid, then stopping the
> other threads solves this.
Again, there are kernel exclusivity assumptions.
> That's a pity. As I stated above, we'll try to demonstrate the exec handler
> capable to restore the multy-threaded APP and send the 2nd RFC. Can you answer
> my questions above so we keep your concerns in mind while doing this, please.
IMHO, this is a fundamentally broken approach which isn't even
necessary. So, FWIW, NACK.
Thank you.
--
tejun
next prev parent reply other threads:[~2011-10-19 18:22 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-10-14 11:04 [patch 0/5] [RFC] Checkpoint/restore and Elf extension Cyrill Gorcunov
2011-10-14 11:04 ` [patch 1/5] proc: Introduce the Children: line in /proc/<pid>/status Cyrill Gorcunov
2011-10-14 16:36 ` Tejun Heo
2011-10-14 11:04 ` [patch 2/5] fs: Add do_close helper Cyrill Gorcunov
2011-10-14 11:04 ` [patch 3/5] fs, proc: Add /proc/$pid/tls entry Cyrill Gorcunov
2011-10-14 16:40 ` Tejun Heo
2011-10-14 16:43 ` Cyrill Gorcunov
2011-10-14 11:04 ` [patch 4/5] fs, proc: Add start_data, end_data, start_brk members to /proc/$pid/stat Cyrill Gorcunov
2011-10-14 11:04 ` [patch 5/5] elf: Add support for loading ET_CKPT files Cyrill Gorcunov
2011-10-14 17:10 ` Tejun Heo
2011-10-14 17:33 ` Tejun Heo
2011-10-19 9:03 ` Pavel Emelyanov
2011-10-19 18:22 ` Tejun Heo [this message]
2011-10-19 18:49 ` Cyrill Gorcunov
2011-10-19 18:52 ` Cyrill Gorcunov
2011-10-19 18:53 ` Tejun Heo
2011-10-19 19:56 ` Cyrill Gorcunov
2011-10-21 18:26 ` Tejun Heo
2011-10-21 18:36 ` Cyrill Gorcunov
2011-10-21 18:42 ` Cyrill Gorcunov
2011-10-21 18:48 ` Tejun Heo
2011-10-21 18:53 ` Cyrill Gorcunov
2011-10-22 6:34 ` Pavel Emelyanov
2011-10-20 8:33 ` Pavel Emelyanov
2011-10-20 15:56 ` Tejun Heo
2011-10-20 16:04 ` Cyrill Gorcunov
2011-10-20 17:30 ` Pavel Emelyanov
2011-10-15 18:59 ` Cyrill Gorcunov
2011-10-21 11:06 ` Glauber Costa
2011-10-21 11:20 ` Cyrill Gorcunov
2011-10-21 11:21 ` Glauber Costa
2011-10-21 11:35 ` Cyrill Gorcunov
2011-10-22 16:49 ` Dan Merillat
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111019182228.GJ25124@google.com \
--to=tj@kernel.org \
--cc=adobriyan@gmail.com \
--cc=avagin@parallels.com \
--cc=dave@linux.vnet.ibm.com \
--cc=dlezcano@fr.ibm.com \
--cc=ebiederm@xmission.com \
--cc=glommer@parallels.com \
--cc=gorcunov@openvz.org \
--cc=hpa@zytor.com \
--cc=jbottomley@parallels.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@elte.hu \
--cc=oleg@redhat.com \
--cc=torvalds@linux-foundation.org \
--cc=xemul@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.