From: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
To: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
Alexey Dobriyan
<adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Dave Hansen
<dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Subject: Re: [RFC v14][PATCH 00/54] Kernel based checkpoint/restart
Date: Wed, 29 Apr 2009 18:47:24 -0400
Message-ID: <49F8D8FC.8010400@cs.columbia.edu>
In-Reply-To: <20090429081815.GA1813-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
Hi Louis,
Louis Rilling wrote:
> Hi,
>
> On 28/04/09 19:23 -0400, Oren Laadan wrote:
>> Here is the latest and greatest of checkpoint/restart (c/r) patchset.
>> The logic and image format reworked and simplified, code refactored,
>> support for PPC, s390, sysvipc, shared memory of all sorts, namespaces
>> (uts and ipc).
>
> I should have asked before, but what are the reasons to checkpoint SYSV IPCs
> in the same file/stream as tasks? Would it be better to checkpoint them
> independently, like the file system state?
>
> In Kerrighed we chose to checkpoint SYSV IPCs independently, a bit like the file
> system state, because SYSV IPCs objects' lifetime do not depend on tasks
> lifetime, and we can gain more flexibility this way. In particular we envision
> cases in which two applications share a state in a SYSV SHM (something like a
> producer-consumer scheme), but do not need to be checkpointed together. In such
> a case the SYSV SHM itself could even need more high-availability (using
> active replication) than a checkpoint/restart facility.
>
Thanks for the feedback, this is actually an interesting idea.
Indeed in the past I also considered SYSV IPC to be a "global" resource
that was checkpointed before iterating through the tasks.
However, in the presence of namespaces, the lifetime of an IPC namespace
does depend on task lifetime: when the last task referring to a
given namespace exits, that namespace is destroyed. Of course, the
root namespace is truly global, because init(1) never exits.
What would 'checkpoint them independently' mean in this case ?
In your use-case, can you restart either application without first
restoring the relevant SYSVIPC ?
Can you think of other use-cases for such a division ? Am I right to
guess that your use case is specific to the distributed (and SSI-)
nature of your system ? (Active-replication of SYSV_SHM sounds
awfully related to DSM :)
While not focusing on such use cases, I want to keep the design flexible
enough to not exclude them a priori, and be able to address them later
on. Indeed, the code is split such that the function to save a given
IPC namespace does not depend on the task that uses it. Future code
could easily use the same functionality.
One way to stay flexible enough to support your use case is to have
some mechanism in place to select whether a given resource (virtually
any resource) is to be checkpointed/restored.
For example, you could imagine checkpoint(..., CHECKPOINT_SYSVIPC)
to checkpoint (also) IPC, and not checkpoint IPC in its absence.
So normally you'd have checkpoint(..., CHECKPOINT_ALL). When you don't
want IPC, you'd use CHECKPOINT_ALL & ~CHECKPOINT_SYSVIPC. When you
want only IPC, you'd use CHECKPOINT_SYSVIPC only.
Same thing for restart, only that it will get trickier in the "only IPC"
case, since you will need to tell which IPC namespace is affected.
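To make the flag idea concrete, here is a minimal sketch in C of how such
a selection mask could be tested. Only the names CHECKPOINT_SYSVIPC and
CHECKPOINT_ALL come from this message; the bit values and the helpers
ckpt_wants() and ckpt_only() are assumptions made for illustration and
are not part of the patchset.

```c
#include <stdbool.h>

/* Flag names follow the email; the bit values are assumptions. */
#define CHECKPOINT_SYSVIPC (1UL << 0)
#define CHECKPOINT_ALL     (~0UL)

/* Hypothetical helper: true if 'resource' is selected in 'flags'. */
static bool ckpt_wants(unsigned long flags, unsigned long resource)
{
    return (flags & resource) != 0;
}

/* Hypothetical helper: true if 'resource' is the only thing selected. */
static bool ckpt_only(unsigned long flags, unsigned long resource)
{
    return flags == resource;
}
```

With these definitions, ckpt_wants(CHECKPOINT_ALL & ~CHECKPOINT_SYSVIPC,
CHECKPOINT_SYSVIPC) is false, and ckpt_only(CHECKPOINT_SYSVIPC,
CHECKPOINT_SYSVIPC) is true, matching the three cases described above.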
Also, I envision a task saying cradvise(CHECKPOINT_SYSVIPC, false),
telling the kernel not to c/r its IPC namespace (or any other
resource). Again, there would need to be a way to add a restored
namespace.
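As a rough illustration of the cradvise() idea, the sketch below models a
per-task preference mask that is intersected with the flags requested at
checkpoint time. Everything here other than the name CHECKPOINT_SYSVIPC is
hypothetical: struct task_cr_prefs, cradvise_sketch() and
effective_flags() do not exist in the patchset, and the bit values are
assumed.

```c
#include <stdbool.h>

/* Flag names follow the email; the bit values are assumptions. */
#define CHECKPOINT_SYSVIPC (1UL << 0)
#define CHECKPOINT_ALL     (~0UL)

/* Hypothetical per-task state: which resources the task allows to be c/r'ed. */
struct task_cr_prefs {
    unsigned long mask;
};

/* Hypothetical stand-in for cradvise(): opt a resource in or out. */
static void cradvise_sketch(struct task_cr_prefs *p,
                            unsigned long resource, bool enable)
{
    if (enable)
        p->mask |= resource;
    else
        p->mask &= ~resource;
}

/* Resources actually saved: requested by the caller AND allowed by the task. */
static unsigned long effective_flags(const struct task_cr_prefs *p,
                                     unsigned long requested)
{
    return requested & p->mask;
}
```

A task that starts with mask CHECKPOINT_ALL and calls
cradvise_sketch(&prefs, CHECKPOINT_SYSVIPC, false) would then have its IPC
namespace skipped even when the caller asks for CHECKPOINT_ALL.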
Does this address your concerns ?
Oren.