From: Oren Laadan <orenl@cs.columbia.edu>
To: Alexey Dobriyan <adobriyan@gmail.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>,
linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
containers@lists.linux-foundation.org, xemul@parallels.com
Subject: Re: [PATCH 18/38] C/R: core stuff
Date: Wed, 27 May 2009 16:56:27 -0400 [thread overview]
Message-ID: <4A1DA8FB.3000306@cs.columbia.edu> (raw)
In-Reply-To: <20090526193503.GB11909@x200.localdomain>
Alexey Dobriyan wrote:
> On Tue, May 26, 2009 at 08:16:44AM -0500, Serge E. Hallyn wrote:
>> Quoting Alexey Dobriyan (adobriyan@gmail.com):
>>> Introduction
>>> ------------
>>> Checkpoint/restart (C/R from now) allows to dump group of processes to disk
>>> for various reasons like saving process state in case of box failure or
>>> restoration of group of processes on another or same machine later.
>>>
>>> Unlike, let's say, hypervisor C/R style which only needs to freeze guest kernel
>>> and dump more or less raw pages, proposed C/R doesn't require hypervisor.
>>> For that C/R code needs to know about all little and big intimate kernel details.
>>>
>>> The good thing is that not all details needs to be serialized and saved
>>> like, say, readahead state. The bad things is still quite a few things
>>> need to be.
>> Hi Alexey,
>>
>> the last time you posted this, I went through and tried to discern the
>> meaningful differences between yours and Oren's patchsets. Then I sent some
>> patches to Oren to make his set configurable to act more like yours. And Oren
>> took them! But now you resend this patchset with no real changelog, no
>> acknowledgment that Oren's set even exists
>
> Is this a requirement? Everybody following topic already knows about
> Oren's patchset.
Some people do ack other people's work. See for example patches #1
and #24 in my recent post. You're welcome.
>
>> - or is much farther along and pretty widely reviewed and tested (which is
>> only because he started earlier and, when we asked for your counterpatches
>> at an earlier stage, you would never reply) - or, most importantly, what
>> it is that you think your patchset does that his does not and cannot.
>
> There are differences. And they're not small like you're trying to describe
> but pretty big compared the scale of the problem.
I've asked before, and I repeat now: can you enumerate these "big"
scary differences that make it such a "big" problem ?
So far, we identified two main "design" issues -
1) Whether or not allow c/r of sub-container (partial hierarchy)
2) Creation of restarting process hierarchy in kernel or in userspace
As for #1, you are the _only_ one who advocates restricting c/r to
a full container only. I guess you have your reasons, but I'm unsure
what they may be.
On the other hand, there has been a handful of use-cases and opinions
in favor of allowing both capabilities to co-exist. Not the mention
that nearly no additional code is necessary, on the contrary.
As for #2, you didn't even bother to reply to the discussion that I
had started about it. This decision is important to allow future
flexibility of the mechanism, and to address the needs of several
potential users, as seen in that discussion and others. Here, too,
you are the _only_ one that advocates that direction.
And the funniest thing -- *both* decisions can be *easily* overturned
in my patchset. In fact, regarding #2 - either way can be easily done
in it.
So I wonder, what are the "big" issues that bother you so much ?
"if there is a will, there is a way".
>
>> *Why* are you spending your time on this instead of helping with Oren's set?
>
> Because we disagree with some core directions Oren chose.
> ANK literally said: "I don't know how to dump live netns".
Eh... and you have it all sorted out ? (yeah, I do, but not in
this patchset).
>
> So, partly patchset was created so that absolutely nobody will tell us
> to shut up and show the code.
Oh well ... "the" code meaning "your" code I suppose.
>
> The other part, is that I looked at Oren patchset, found quite a lot of
> suspicious, broken and unclean places and decided that it'd be faster
> to start from scratch because sending patches will overhaul like 85% of
> the code.
So you actually took the time to read and review. And then you spent
even more time in .... calculating this number ! Feedback appreciated.
If you looked closely you would have seen that we do address your
concerns over time.
>
> One example, is why CKPT_HDR_CPU and CKPT_RESTART_BLOCK exist at all?
> Should objects in image be only what sharable objects are in kernel
> (expect VMAs, pages and possibly file descriptors)? pt_regs don't exist
> by themselves after all.
A good reason to break it into small pieces is for ease of maintenance
and debugging, as well as in the future easier transition between
incompatible kernel versions. I think it's better than a few-pages-long
single struct. And it encourages more naming of things.
But ... I'm confused ... is this your "big" concern ? Oh well, if
that's what stands in your way, we could even rework that (~1.3% of
the code ? I reckon...).
>
> And since you guys showed that just idea of in-kernel checkpointing is not
> rejected outright, it doesn't mean that you can drag every single idea too.
> Because history shows, that once something (especially user-visible,
> like restart syscall semantics) is in kernel it's nearly impossible
> to cut it out, so it's very-very important to get it right from the very
> beginning.
>
Yes. Let's indeed talk about how to get it "right". Please do
participate in the public discussions and efforts. Working together
would be better for everyone.
> Now here goes second version, with prefixes fixed (kstate_") like Ingo
> suggested and so Linus could look at the code and with C/R code moved
> close to usual code and with more checks added (which you should have
> already!) to not restore null selector in %cs for example.
It is far from perfect. In fact, it's even clearly commented as such,
and exactly there. It would have been helpful if you pointed that
out in a review, or even - god forbid - sent a patch to improve it.
But it works, and it lets people play with a more-than-a-toy
implementation and provide us with important feedback. Oh, and by
the way, it doesn't require that people use containers to try it out.
Pretty handy, don't you think ?
>
>> The code really isn't all that different...
>
>> Maybe you just think that two independently written patchsets will expose
>> more gotchas that we'll need to catch, so you're continuing on this effort
>> under the expectation that eventualy we'll merge the two sets?
>
> Well, it already exposes. Just print both, and watch for differences.
>
>> Honestly, I have great respect for your coding abilities. And if 'voices
>> from on high' tell us to base upon your code, I'd be fine with that, I
>> have no real problems with what I see on yet another cursory look. But
>> given the amount of collective time that's been spent developing, reviewing,
>> and testing Oren's set, it wouldn't make any sense to just jump. So
>> I'd still just like to know how you see this proceeding.
>
> Yes, please, someone decide on "checkpoint semi-live container" issue.
Have you not yet read enough opinion of users that would like to see
this capability ? What would be enough to convince you ? And really,
why not ?
Cheers,
Oren.
next prev parent reply other threads:[~2009-05-27 20:57 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-05-22 4:54 [PATCH 01/38] cred: #include init.h in cred.h Alexey Dobriyan
2009-05-22 4:54 ` [PATCH 02/38] utsns: extract create_uts_ns() Alexey Dobriyan
2009-05-24 22:37 ` Serge E. Hallyn
2009-05-22 4:54 ` [PATCH 03/38] ipcns 1/4: remove useless get/put while CLONE_NEWIPC Alexey Dobriyan
2009-05-22 9:00 ` Amerigo Wang
2009-05-22 4:54 ` [PATCH 04/38] ipcns 2/4: extract create_ipc_ns() Alexey Dobriyan
2009-05-22 8:59 ` Amerigo Wang
2009-05-22 4:54 ` [PATCH 05/38] ipcns 3/4: make free_ipc_ns() static Alexey Dobriyan
2009-05-24 22:40 ` Serge E. Hallyn
2009-05-22 4:55 ` [PATCH 06/38] ipcns 4/2: move free_ipcs() proto Alexey Dobriyan
2009-05-24 22:49 ` Serge E. Hallyn
2009-05-22 4:55 ` [PATCH 07/38] pidns 1/2: make create_pid_namespace() accept parent pidns Alexey Dobriyan
2009-05-22 9:20 ` Amerigo Wang
2009-05-24 22:44 ` Serge E. Hallyn
2009-06-04 0:20 ` Sukadev Bhattiprolu
2009-05-22 4:55 ` [PATCH 08/38] pidns 2/2: rewrite copy_pid_ns() Alexey Dobriyan
2009-05-22 9:14 ` Amerigo Wang
2009-05-24 22:45 ` Serge E. Hallyn
2009-06-04 0:17 ` Sukadev Bhattiprolu
2009-05-22 4:55 ` [PATCH 09/38] netns 1/2: don't get/put old netns on CLONE_NEWNET Alexey Dobriyan
2009-05-22 6:30 ` David Miller
2009-05-22 4:55 ` [PATCH 10/38] netns 2/2: extract net_create() Alexey Dobriyan
2009-05-22 6:30 ` David Miller
2009-05-22 4:55 ` [PATCH 11/38] nsproxy: extract create_nsproxy() Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 12/38] i386: ifdef out struct thread_struct::fs Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 13/38] x86_64: ifdef out struct thread_struct::ip Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 14/38] Remove struct mm_struct::exe_file et al Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 15/38] dcache: extract and use d_unlinked() Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 16/38] x86: ptrace debugreg checks rewrite Alexey Dobriyan
2009-05-26 23:25 ` Andrew Morton
2009-05-22 4:55 ` [PATCH 17/38] groups: move code to kernel/groups.c Alexey Dobriyan
2009-05-25 0:53 ` Serge E. Hallyn
2009-05-26 14:48 ` Serge E. Hallyn
2009-05-26 18:34 ` Alexey Dobriyan
2009-05-26 23:25 ` Serge E. Hallyn
2009-05-22 4:55 ` [PATCH 18/38] C/R: core stuff Alexey Dobriyan
2009-05-26 13:16 ` Serge E. Hallyn
2009-05-26 19:35 ` Alexey Dobriyan
2009-05-26 23:14 ` Serge E. Hallyn
2009-05-26 23:44 ` Serge E. Hallyn
2009-05-28 15:38 ` Alexey Dobriyan
2009-05-28 18:17 ` Serge E. Hallyn
2009-05-28 22:42 ` Oren Laadan
2009-05-27 18:52 ` Dave Hansen
2009-05-27 20:56 ` Oren Laadan [this message]
2009-05-27 22:17 ` Alexey Dobriyan
2009-05-27 22:40 ` Andrew Morton
2009-05-27 22:45 ` Oren Laadan
2009-05-28 15:33 ` Alexey Dobriyan
2009-05-28 22:20 ` Oren Laadan
2009-05-28 22:33 ` Matt Helsley
2009-05-29 6:01 ` Alexey Dobriyan
2009-05-29 17:26 ` Dave Hansen
2009-05-27 22:25 ` Alexey Dobriyan
2009-05-27 16:28 ` Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 19/38] C/R: multiple tasks Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 20/38] C/R: i386 support Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 21/38] C/R: i386 debug registers Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 22/38] C/R: i386 xstate Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 23/38] C/R: x86_64 support Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 24/38] C/R: x86_64 debug registers Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 25/38] C/R: x86_64 xstate Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 26/38] C/R: nsproxy Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 27/38] C/R: checkpoint/restore struct uts_namespace Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 28/38] C/R: formally checkpoint/restore struct ipc_namespace Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 29/38] C/R: formally checkpoint/restore struct mnt_namespace Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 30/38] C/R: checkpoint/restore struct pid_namespace Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 31/38] C/R: formally checkpoint/restore struct net_namespace Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 32/38] C/R: checkpoint/restore struct cred Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 33/38] C/R: checkpoint/restore aux groups (structy group_info) Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 34/38] C/R: checkpoint/restore struct user Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 35/38] C/R: checkpoint/restore struct user_namespace Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 36/38] C/R: checkpoint/restore struct pid Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 37/38] C/R: checkpoint/restore opened files Alexey Dobriyan
2009-05-22 4:55 ` [PATCH 38/38] C/R: checkpoint/restart struct sighand_struct Alexey Dobriyan
2009-05-22 5:02 ` [PATCH 01/38] cred: #include init.h in cred.h Alexey Dobriyan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4A1DA8FB.3000306@cs.columbia.edu \
--to=orenl@cs.columbia.edu \
--cc=adobriyan@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=containers@lists.linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=serge@hallyn.com \
--cc=xemul@parallels.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).