From: Daniel Lezcano <dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
To: Linux Containers <containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>
Subject: Checkpoint/Restart mini-summit
Date: Tue, 15 Jul 2008 12:49:45 +0200
Message-ID: <487C80C9.2040105@fr.ibm.com>
Hi all,
Here is a proposal for a more detailed agenda for the checkpoint/restart
mini-summit. If everybody is ok with it, I will update the wiki.
Comments are welcome :)
Thanks
-- Daniel
========================================================================
Checkpoint/restart is a very big topic and the time at the
mini-summit is short, so I propose a list of documents to read
before the mini-summit, so that we can address the checkpoint/restart
topic directly and save precious time :)
* Documentation
  * Zap : www.ncl.cs.columbia.edu/publications/usenix2007_fordist.pdf
  * Metacluster : lxc.sourceforge.net/doc/ols2006/lxc-ols2006.pdf
  * OpenVZ : http://wiki.openvz.org/Checkpointing_and_live_migration
  * Checkpoint/Restart technology :
    http://en.wikipedia.org/wiki/Application_checkpointing
  * Virtual Servers and Checkpoint/Restart in Mainstream Linux : SIGOPS
    document
-------------------------------------------------------------------------------
This section is about how to prepare the kernel for implementing
checkpoint/restart.
* Preparing the kernel internals
  * Identifying the kernel subsystems
  * Identifying the process resources
  * Identifying the frameworks for the CR
  * Identifying the pieces to target first
One of the big open questions is how we transmit the internal
state to and from the kernel. There are some small patches doing
checkpoint/restart that take into account a small part of the kernel
resources. Some were made through netlink, others via /proc, others
directly with a syscall. There were also proposals on the
containers mailing list to use a core-dump-like file, or a CR
filesystem. This section is for discussing that.
* Passing the kernel internal state to/from userspace
  * coredump-like file
  * swap per container
  * netlink
  * CR filesystem
  * army of different calls for the CR (proc, existing syscalls, ...)
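
To make the "coredump-like file" option a bit more concrete, here is a
minimal userspace sketch of what such an image could look like: a
signature followed by typed, length-prefixed records. Everything in it
is an assumption made for illustration (the signature, the record type
numbers, the header layout, the function names); it does not describe
any existing patch or proposed kernel format.

```python
import struct

# Hypothetical record types; not taken from any existing patch set.
REC_TASK = 1
REC_MEMORY = 2
REC_FILE = 3

MAGIC = b"CRIMG001"            # hypothetical image signature
HEADER = struct.Struct("<IQ")  # record type (u32), payload length (u64)


def dump_image(records):
    """Serialize (type, payload) pairs into one byte string."""
    out = [MAGIC]
    for rtype, payload in records:
        out.append(HEADER.pack(rtype, len(payload)))
        out.append(payload)
    return b"".join(out)


def load_image(image):
    """Parse an image back into (type, payload) pairs, checking bounds."""
    if image[:len(MAGIC)] != MAGIC:
        raise ValueError("bad magic")
    records = []
    offset = len(MAGIC)
    while offset < len(image):
        if offset + HEADER.size > len(image):
            raise ValueError("truncated record header")
        rtype, length = HEADER.unpack_from(image, offset)
        offset += HEADER.size
        if offset + length > len(image):
            raise ValueError("truncated record payload")
        records.append((rtype, image[offset:offset + length]))
        offset += length
    return records
```

A self-describing record stream like this could be written by a
syscall, read back through a file descriptor, or sent over netlink, so
the choice of format and the choice of transport can probably be
discussed separately.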
The following sections address the checkpoint/restart itself, which
can be split into three parts: the quiescent point, the checkpoint and
the restart.
* Checkpointing / Restarting
  * Reaching a quiescent point
    * for the network
    * for the processes
    * for the asynchronous IO
  * Checkpoint
    * Preinstalled checkpoint signal handler ?
    * syscall ?
    * tar of a CR filesystem ?
    * monolithic ?
    * Dumping the process hierarchy
    * Identifying the kernel resource dependencies
    * Dumping system wide resources (per namespace ?)
    * Dumping process wide resources (from process context ?)
      (Memory is in between system and process resource)
  * Restarting
    * New binary format handler ?
    * Identifying the kernel resource dependencies
    * Restoring the process hierarchy
    * Restoring system wide resources
    * Restoring process wide resources
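
As a small illustration of why the process hierarchy and the resource
dependencies matter at restart, here is a sketch that computes an order
in which tasks could be recreated so that every parent exists before
its children. The function name, the pid-to-parent map and the idea of
a single root task are assumptions for the example, not part of any
proposed design.

```python
from collections import defaultdict, deque


def restore_order(parent_of, root):
    """Return pids so that every parent precedes its children.

    parent_of maps each dumped pid to its parent pid; the root's own
    entry is only used to count it as part of the hierarchy.
    """
    children = defaultdict(list)
    for pid, ppid in parent_of.items():
        if pid != root:
            children[ppid].append(pid)

    order = []
    queue = deque([root])
    while queue:
        pid = queue.popleft()
        order.append(pid)
        # Sorted only to make the result deterministic.
        queue.extend(sorted(children[pid]))

    if len(order) != len(parent_of):
        raise ValueError("hierarchy has tasks unreachable from the root")
    return order
```

For example, restore_order({1: 0, 2: 1, 3: 1, 4: 2}, 1) gives
[1, 2, 3, 4]. The same kind of ordering problem shows up for other
kernel resources that reference each other, which is why "identifying
the kernel resource dependencies" appears in both the checkpoint and
the restart lists.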
There is a POSIX draft, 1003.1m, which specifies CR semantics. It could
be interesting to take it into account and provide a user API based
on this specification, so we should keep it in mind when we implement
CR in the kernel. I was not able to find the POSIX draft itself,
but the man pages of the IRIX CR implementation stick to this
specification.
* Determining the userspace API
  * POSIX 1003.1m (implementation in IRIX) ?
    http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi/0650/bks/SGI_Admin/CPR_OG/sgi_html/ch03.html