From: Arnd Bergmann <arnd@arndb.de>
To: Oren Laadan <orenl@cs.columbia.edu>
Cc: Dave Hansen <dave@linux.vnet.ibm.com>,
containers@lists.linux-foundation.org,
Theodore Tso <tytso@mit.edu>,
linux-kernel@vger.kernel.org
Subject: Re: checkpoint/restart ABI
Date: Thu, 21 Aug 2008 10:43:40 +0200
Message-ID: <200808211043.41387.arnd@arndb.de>
In-Reply-To: <48AD0379.9030705@cs.columbia.edu>
On Thursday 21 August 2008, Oren Laadan wrote:
>
> Arnd Bergmann wrote:
> Extending this view in the context of security - we can require sysadmin
> privilege to restart, and then sysadmin is responsible for the contents
> of the file. The kernel will ensure the data isn't corrupted. Much
> like with loading a kernel module - the admin may load any sort of crap.
> Then, sysadmin may, for instance, add a signature on a checkpointed file
> to verify its integrity.
>
> (Well, one problem with this scheme in the context of self-checkpoint
> would be - who can be trusted to generate the signature in that case).
Sorry, I don't buy that argument. I'm convinced that an implementation
is possible where any user can load checkpoints of tasks that he could
create by starting the processes directly. If you argue that loading
a corrupted checkpoint can cause any problems, then I would assume
the restart code needs better permission and sanity checks.
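
A minimal sketch of what such permission and sanity checks could look
like, written as plain user-space C. Everything here is an assumption
made for illustration: the thread does not define an image format, so
struct ckpt_hdr, its fields, the CKPT_* constants and ckpt_hdr_check()
are all hypothetical, and the owner check is only a stand-in for "the
user could have created these tasks himself".

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical image header; not taken from any posted patch. */
#define CKPT_MAGIC     0x54504b43u	/* arbitrary illustrative value */
#define CKPT_VERSION   1u
#define CKPT_MAX_TASKS 4096u		/* arbitrary illustrative limit */

struct ckpt_hdr {
	uint32_t magic;
	uint32_t version;
	uint32_t nr_tasks;
	uint32_t owner_uid;
	uint64_t payload_len;	/* bytes following the header */
};

/*
 * Validate an untrusted image before acting on any of its contents.
 * Returns 0 on success, -EINVAL for a malformed image, -EPERM when
 * the caller is not the recorded owner.
 */
int ckpt_hdr_check(const void *buf, size_t len, uint32_t caller_uid)
{
	struct ckpt_hdr hdr;

	if (len < sizeof(hdr))
		return -EINVAL;
	/* Copy out first so later checks cannot race with the buffer. */
	memcpy(&hdr, buf, sizeof(hdr));

	if (hdr.magic != CKPT_MAGIC || hdr.version != CKPT_VERSION)
		return -EINVAL;
	if (hdr.nr_tasks == 0 || hdr.nr_tasks > CKPT_MAX_TASKS)
		return -EINVAL;
	if (hdr.payload_len != len - sizeof(hdr))
		return -EINVAL;
	if (hdr.owner_uid != caller_uid)
		return -EPERM;
	return 0;
}
```

The point of the sketch is only that every field is checked against
what the caller would be allowed to do anyway, so a corrupted image is
rejected instead of trusted.
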
> Using a single handle (crid or a special file descriptor) to identify
> the whole checkpoint is very useful - to be able to stream it (eg. over
> the network, or through filters). It is also very important for future
> features and optimizations. For example, to reduce downtime of the
> application during checkpoint, one can use COW for dirty pages, and
> only write-back the entire data after the application resumes execution.
> Or imagine a use-case where one would like to keep the entire checkpoint
> in memory. These are pretty hard to do if you split the handling between
> multiple files or handles.
Right.
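
To illustrate the single-handle point: if the whole checkpoint goes
through one file descriptor as a sequence of records, the same code
works for a regular file, a pipe into a filter, or a socket. The sketch
below assumes a made-up record framing (a type and a length, then the
payload); ckpt_put_record() and write_all() are hypothetical helpers,
not part of the posted patches.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const void *buf, size_t len)
{
	const char *p = buf;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Emit one record on the stream: an 8-byte header (type, length in
 * host byte order) followed by the payload. Only sequential writes
 * are used, so fd may be a pipe or a socket.
 */
int ckpt_put_record(int fd, uint32_t type, const void *data, uint32_t len)
{
	uint32_t hdr[2] = { type, len };

	if (write_all(fd, hdr, sizeof(hdr)) != 0)
		return -1;
	return write_all(fd, data, len);
}
```

Because nothing seeks, the producer never needs to know where the data
ends up, which is what makes streaming and filtering possible.
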
> > On the restart side, I think the most consistent interface would
> > be a new binfmt_chkpt implementation that you can use to execve
> > a checkpoint, just like you execute an ELF file today. The binfmt
> > can be a module (unlike a syscall), so an administrator that is
> > afraid of the security implications can just disable it by not
> > loading the module. In an execve model, the parent process can
> > set up anything related to credentials as well as it's allowed
> > to and then let the kernel do the rest.
>
> This is an interesting idea but not without its problems. In particular,
> a successful execve() by one thread destroys all the others.
Right, execve currently assumes that the new process starts up with
a single thread, but a potential binfmt_chkpt would need to be able to
start multithreaded. I guess this either requires execve to reuse
the existing threads (assuming they have been set up correctly in
advance) or to create new ones according to the context of the
checkpoint data. It may not be as easy as I thought initially, but
both seem possible.
Restarting a whole set of processes from a checkpoint would be
a relatively simple extension of that.
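
From user space, the execve model could look roughly like the sketch
below: the parent forks, the child sets up the credentials it is
allowed to set up, and then simply execs the checkpoint image. This
assumes a binfmt_chkpt handler exists in the kernel, which it does not
today; on a kernel without one, execve() of an image file fails (with
ENOEXEC for an unrecognized format) and the child exits with 127.
restart_from_image() is a hypothetical name, and the multithreaded
case discussed above is not addressed here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork, switch to the target credentials in the child, and execve()
 * the checkpoint image. Returns the child's wait status, or -1 if
 * fork() or waitpid() failed.
 */
int restart_from_image(const char *image, uid_t uid, gid_t gid)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;

	if (pid == 0) {
		char *const argv[] = { (char *)image, NULL };
		char *const envp[] = { NULL };

		/* Group first: after setuid() we may no longer be allowed to. */
		if (setgid(gid) != 0 || setuid(uid) != 0)
			_exit(126);

		/* A binfmt handler would rebuild the task from here on. */
		execve(image, argv, envp);
		_exit(127);	/* only reached if execve() failed */
	}

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return status;
}
```

The kernel applies its usual execve permission checks to the image, so
an unprivileged caller gets no more rights than by starting the
program directly.
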
> Also, it isn't clear how this can work with pre-copying and live-migration;
> And finally, I'm not sure how to handle shared objects in this manner.
What do you mean by pre-copying?
How is live-migration different from restarting a previously saved
task from the same machine?
> As for kernel module - it is easy to implement most of the checkpoint
> restart functionality in a kernel module, leaving only the syscall stubs
> in the kernel.
Yeah, I've done the same in spufs, but I still think it's ugly ;-)
Arnd <><