From: Daniel Lezcano <dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
To: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Cc: Louis.Rilling-aw0BnHfMbSpBDgjK7y7TUQ@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	Dave Hansen
	<dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Andrey Mirkin <major-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
Subject: Re: [PATCH 0/9] OpenVZ kernel based checkpointing/restart
Date: Mon, 20 Oct 2008 18:37:32 +0200
Message-ID: <48FCB3CC.9030804@fr.ibm.com>
In-Reply-To: <48FCA97C.1040108-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>

Oren Laadan wrote:
> 
> Daniel Lezcano wrote:
>> Louis Rilling wrote:
>>> On Fri, Oct 17, 2008 at 04:33:03PM -0700, Dave Hansen wrote:
>>>> On Wed, 2008-09-03 at 14:57 +0400, Andrey Mirkin wrote:
>>>>> This patchset introduces kernel based checkpointing/restart as it is
>>>>> implemented in OpenVZ project. This patchset has limited functionality and
>>>>> is able to checkpoint/restart only a single process. Recently Oren Laadan
>>>>> sent another kernel based implementation of checkpoint/restart. The main
>>>>> differences between this patchset and Oren's patchset are:
>>>> Hi Andrey,
>>>>
>>>> I'm curious what you want to happen with this patch set.  Is there
>>>> something specific in Oren's set that is deficient which you need
>>>> implemented?  Are there some technical reasons you prefer this code?
>>> To be fair, and since (IIRC) the initial intent was to start with OpenVZ's
>>> approach, shouldn't Oren answer the same questions with respect to Andrey's
>>> patchset?
>>>
>>> I'm afraid that we are forgetting to take the best from both approaches...
>> I agree with Louis.
>>
>> I played with Oren's patchset and tried to port it to x86_64. I was able 
>> to sys_checkpoint/sys_restart but if you remove the restoring of the 
>> general registers, the restart still works. I am not an expert on asm, 
>> but my hypothesis is when we call sys_checkpoint the registers are saved 
>> on the stack by the syscall and when we restore the memory of the 
>> process, we restore the stack and the stacked registers are restored 
>> when exiting the sys_restart. That makes me feel there is an important 
>> gap between external checkpoint and internal checkpoint.
> 
> This is a misconception: my patches are not "internal checkpoint". My
> patches are basically "external checkpoint" by design, which *also*
> accommodates self-checkpointing (aka internal). The same holds for the
> restart. The implementation is demonstrated with "self-checkpoint" to
> avoid complicating things at this early stage of proof-of-concept.

Yep, I read your patchset :)

I just want to clarify what we want to demonstrate with this patchset
for the proof-of-concept. A self CR does not show what the complicated
parts of CR are; it only shows that we can dump the memory from the
kernel and do setcontext/getcontext.

At the container mini-summit we stated the following approach:

    1. Pre-dump
    2. Freeze the container
    3. Dump
    4. Thaw/Kill the container
    5. Post-dump

We already have the freezer, and we can forget about pre-dump and
post-dump for now.

IMHO, for the proof-of-concept we should do a minimal CR (like you did),
but one conforming with these 5 points, which means we have to do an
external checkpoint.

If the POC conforms with that, the patchset will be a little different,
and it will show what the difficult parts of restarting a process are,
especially restarting it in the frozen state :) That will give an idea
of the big picture from 10,000 feet.
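
To make the ordering of the five steps concrete, here is a rough
user-space sketch in Python. It is only an illustration under
assumptions, not code from either patchset: it assumes the cgroup-v1
freezer interface (writing FROZEN or THAWED to a freezer.state file),
the names external_checkpoint, set_freezer_state and wait_for_state are
hypothetical, and the dump itself is just a callback standing in for
whatever sys_checkpoint would do. Step 4 is shown as a thaw only; the
"kill" variant is left out.

```python
import os
import tempfile
import time
from typing import Callable, List, Optional


def set_freezer_state(cgroup_dir: str, state: str) -> None:
    """Write FROZEN or THAWED to freezer.state (assumed cgroup-v1 freezer)."""
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as f:
        f.write(state)


def wait_for_state(cgroup_dir: str, wanted: str, timeout: float = 5.0) -> None:
    """Poll freezer.state; a real freezer may report FREEZING before FROZEN."""
    deadline = time.monotonic() + timeout
    path = os.path.join(cgroup_dir, "freezer.state")
    while True:
        with open(path) as f:
            if f.read().strip() == wanted:
                return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"container did not reach {wanted}")
        time.sleep(0.01)


def external_checkpoint(
    cgroup_dir: str,
    dump: Callable[[], None],
    pre_dump: Optional[Callable[[], None]] = None,
    post_dump: Optional[Callable[[], None]] = None,
) -> List[str]:
    """Run the steps in order and return the names of the steps performed."""
    done: List[str] = []
    if pre_dump is not None:              # 1. pre-dump (optional for a POC)
        pre_dump()
        done.append("pre-dump")
    set_freezer_state(cgroup_dir, "FROZEN")   # 2. freeze the container
    try:
        wait_for_state(cgroup_dir, "FROZEN")
        done.append("freeze")
        dump()                            # 3. dump, from outside the frozen tasks
        done.append("dump")
    finally:
        set_freezer_state(cgroup_dir, "THAWED")  # 4. thaw, even if the dump fails
    done.append("thaw")
    if post_dump is not None:             # 5. post-dump (optional for a POC)
        post_dump()
        done.append("post-dump")
    return done


if __name__ == "__main__":
    # Stand-in directory so the sketch runs without a real freezer cgroup.
    demo_dir = tempfile.mkdtemp()
    print(external_checkpoint(demo_dir, dump=lambda: None))
```

The point of the sketch is that the caller of the dump is not one of the
frozen tasks, which is what makes it an external checkpoint; a
self-checkpoint has no equivalent of steps 2 and 4.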

> For multiple processes all that is needed is a container and a loop
> on the checkpoint side, and a method to recreate processes on the
> restart side. Andrew suggests to do it in kernel space, I still have
> doubts.

A question for Andrey: in OpenVZ, do you restart "externally", or is it
the first process of the pid namespace that calls sys_restart and then
populates the pid namespace?

> While I held out the multi-process part of the patch so far because I
> was explicitly asked to do it, it seems like this would be a good time
> to push it out and get feedback.

IMHO it is too soon...


