Linux Container Development
From: "Matthieu Fertré" <matthieu.fertre-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org>
To: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Cc: containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Alexey Dobriyan
	<adobriyan-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Dave Hansen
	<dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Subject: Re: [RFC v14][PATCH 00/54] Kernel based checkpoint/restart
Date: Mon, 04 May 2009 11:17:00 +0200
Message-ID: <49FEB28C.90301@kerlabs.com>
In-Reply-To: <49FEB01B.208-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>



Oren Laadan wrote:
> 
> Matthieu Fertré wrote:
>> Hi,
>>
>> Louis Rilling wrote:
>>> On 29/04/09 18:47 -0400, Oren Laadan wrote:
>>>> Hi Louis,
>>>>
>>>> Louis Rilling wrote:
>>>>> Hi,
>>>>>
>>>>> On 28/04/09 19:23 -0400, Oren Laadan wrote:
>>>>>> Here is the latest and greatest of the checkpoint/restart (c/r) patchset.
>>>>>> The logic and image format were reworked and simplified, the code was
>>>>>> refactored, and support was added for PPC, s390, sysvipc, shared memory
>>>>>> of all sorts, and namespaces (uts and ipc).
>>>>> I should have asked before, but what are the reasons to checkpoint SYSV IPCs
>>>>> in the same file/stream as tasks? Would it be better to checkpoint them
>>>>> independently, like the file system state?
>>>>>
>>>>> In Kerrighed we chose to checkpoint SYSV IPCs independently, a bit like the file
>>>>> system state, because the lifetime of SYSV IPC objects does not depend on task
>>>>> lifetime, and we gain more flexibility this way. In particular we envision
>>>>> cases in which two applications share state in a SYSV SHM (something like a
>>>>> producer-consumer scheme), but do not need to be checkpointed together. In such
>>>>> a case the SYSV SHM itself could even need higher availability (using
>>>>> active replication) than a checkpoint/restart facility provides.
>>>>>
>>>> Thanks for the feedback, this is actually an interesting idea.
>>>>
>>>> Indeed in the past I also considered SYSV IPC to be a "global" resource
>>>> that was checkpointed before iterating through the tasks.
>>>>
>>>> However, in the presence of namespaces, the lifetime of an IPC namespace
>>>> does depend on task lifetime: when the last task referring to a
>>>> given namespace exits, that namespace is destroyed. Of course, the
>>>> root namespace is truly global, because init(1) never exits.
>>>>
>>>> What would 'checkpoint them independently' mean in this case ?
>>> I mean that the producer and the consumer could have separate checkpointing
>>> policies (if any), and the IPC SHM as well.
>>>
>>>> In your use-case, can you restart either application without first
>>>> restoring the relevant SYSVIPC ?
>>> Probably not.
>>>
>> Well, it depends. It makes no sense to restart the application without
>> restoring the relevant SHM, but it may make sense for a message queue
>> (this is application specific of course). A message queue is not tied to
>> the process; it can disappear during the life of the application.
> 
> Agreed - the concern regards mainly the SHM case.
> 
>>>> Can you think of other use-cases for such a division ?  Am I right to
>>>> guess that your use case is specific to the distributed (and SSI-)
>>>> nature of your system ?  (Active-replication of SYSV_SHM sounds
>>>> awfully related to DSM :)
>>> The case of active-replication may be specific to DSM-based systems, but the
>>> case of independent policies is already interesting in standalone boxes.
>>>
>>>> While not focusing on such use cases, I want to keep the design flexible
>>>> enough to not exclude them a-priori, and be able to address them later
>>>> on. Indeed, the code is split such that the function to save a given
>>>> IPC namespace does not depend on the task that uses it. Future code
>>>> could easily use the same functionality.
>>>>
>>>> One way to be flexible enough to support your use case is to have some
>>>> mechanism in place to select whether a resource (virtually any) is
>>>> to be checkpointed/restored.
>>>>
>>>> For example, you could imagine checkpoint(..., CHECKPOINT_SYSVIPC)
>>>> to checkpoint (also) IPC, and not checkpoint IPC in its absence.
>>>>
>>>> So normally you'd have checkpoint(..., CHECKPOINT_ALL). When you don't
>>>> want IPC, you'd use CHECKPOINT_ALL & ~CHECKPOINT_SYSVIPC. When you
>>>> want only IPC, you'd use CHECKPOINT_SYSVIPC only.
>>>>
>>>> Same thing for restart, only that it will get trickier in the "only IPC"
>>>> case, since you will need to tell which IPC namespace is affected.
>>>>
>>>> Also, I envision a task saying cradvise(CHECKPOINT_SYSVIPC, false),
>>>> telling the kernel to not c/r its IPC namespace. (Or any other
>>>> resource). Again there would need to be a way to add a restored
>>>> namespace.
>>>>
>>>> Does this address your concerns ?
>>> Yes this sounds flexible enough. Thanks for taking this into account.
>> I see one drawback with this approach if you allow checkpointing an
>> application that is not isolated in a container. In that case, you may
>> want to select which IPC objects to dump, so as not to dump all the IPC
>> objects living in the system. Indeed, this is why we chose in Kerrighed to
>> checkpoint IPC objects independently of tasks, since we currently have no
>> container/namespaces support.
> 
> I assume that in this case it will be the application itself that
> will somehow tell the system which specific sysvipc objects (ids) it
> cares about.

Sure, the system cannot know that on its own.

> 
> (I'm not sure how the system would otherwise know what to dump and
> what to leave out).
> 
> I originally proposed the construct of cradvise() syscall to handle
> exactly those cases where the application would like to advise the
> kernel about certain resources. So, extending the previous example,
> a task may call something like:
> 
>    cradvise(CHECKPOINT_SYSVIPC_SHM, false);  /* generally skip shm */
>    cradvise(CHECKPOINT_SYSVIPC_SHMID, id, true);  /* but include this */
> 
> or:
>    cradvise(CHECKPOINT_SYSVIPC_SHM, true);  /* generally include shm */
>    cradvise(CHECKPOINT_SYSVIPC_SHMID, id, false);  /* but skip this */
> 
> Anyway, these are just examples of the concept and what sort of generic
> interface can be used to implement it; don't pick on the details...


Ok, seems good :)

Thanks,

Matthieu
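
For readers who want to see the idea concretely, here is a minimal
user-space sketch of the selection policy discussed above: a class-wide
default (as in CHECKPOINT_ALL & ~CHECKPOINT_SYSVIPC) combined with per-id
overrides (as in the two cradvise() examples). Everything in it is an
assumption made for illustration: cradvise() was only proposed in this
thread, the flag values are made up, and struct cr_policy and the cr_*()
helpers are hypothetical names, not part of the patchset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags: names follow the thread, values are invented. */
#define CHECKPOINT_SYSVIPC_SHM 0x1u
#define CHECKPOINT_SYSVIPC_MSG 0x2u
#define CHECKPOINT_SYSVIPC_SEM 0x4u
#define CHECKPOINT_SYSVIPC \
	(CHECKPOINT_SYSVIPC_SHM | CHECKPOINT_SYSVIPC_MSG | CHECKPOINT_SYSVIPC_SEM)
#define CHECKPOINT_ALL (~0u)

#define CR_MAX_OVERRIDES 16

/* One per-object exception to the class-wide default. */
struct cr_override {
	int shmid;
	bool include;
};

/* Hypothetical per-task advice state. */
struct cr_policy {
	unsigned int classes;	/* resource classes dumped by default */
	struct cr_override shm[CR_MAX_OVERRIDES];
	size_t nr_shm;
};

/* Models cradvise(CLASS, bool): include or skip a whole resource class. */
static void cr_advise_class(struct cr_policy *p, unsigned int class_mask,
			    bool include)
{
	assert(p != NULL);
	if (include)
		p->classes |= class_mask;
	else
		p->classes &= ~class_mask;
}

/*
 * Models cradvise(CHECKPOINT_SYSVIPC_SHMID, id, bool): record an exception
 * for one shm id. Returns 0 on success, -1 if the override table is full.
 */
static int cr_advise_shmid(struct cr_policy *p, int shmid, bool include)
{
	assert(p != NULL);
	for (size_t i = 0; i < p->nr_shm; i++) {
		if (p->shm[i].shmid == shmid) {
			p->shm[i].include = include;
			return 0;
		}
	}
	if (p->nr_shm == CR_MAX_OVERRIDES)
		return -1;
	p->shm[p->nr_shm].shmid = shmid;
	p->shm[p->nr_shm].include = include;
	p->nr_shm++;
	return 0;
}

/* A per-id override wins; otherwise fall back to the class default. */
static bool cr_should_dump_shm(const struct cr_policy *p, int shmid)
{
	assert(p != NULL);
	for (size_t i = 0; i < p->nr_shm; i++) {
		if (p->shm[i].shmid == shmid)
			return p->shm[i].include;
	}
	return (p->classes & CHECKPOINT_SYSVIPC_SHM) != 0;
}
```

With a policy initialised to CHECKPOINT_ALL, calling
cr_advise_class(&p, CHECKPOINT_SYSVIPC_SHM, false) and then
cr_advise_shmid(&p, id, true) reproduces the "generally skip shm, but
include this" case; swapping the two booleans gives the opposite one. How
a real implementation would store and look up this advice in the kernel is
left open here, as it is in the thread.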



_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers
