From: Nathan Lynch <ntl@pobox.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Tejun Heo <tj@kernel.org>, Oren Laadan <orenl@cs.columbia.edu>,
ksummit-2010-discuss@lists.linux-foundation.org,
linux-kernel@vger.kernel.org
Subject: Re: [Ksummit-2010-discuss] checkpoint-restart: naked patch
Date: Wed, 03 Nov 2010 20:47:38 -0500
Message-ID: <1288835258.6132.56.camel@tp-t61>
In-Reply-To: <20101102214706.GA28593@lst.de>

On Tue, 2010-11-02 at 22:47 +0100, Christoph Hellwig wrote:
> Thanks Tejun,
>
> your writeup brought up a lot of the same issues that I see with
> the in-kernel C/R. Various C/R implementations that are entirely
> in userspace or with limited kernel assistance have been in production
> in HPC environments for years.
FWIW there are a couple of kernel-based C/R implementations (BLCR,
OpenVZ) in use in various contexts (not just HPC).
> I think especially for these workloads
> C/R is an extremely useful feature, and a standard implementation would
> do Linux well.
>
> But I think the "transparent" in-kernel one is the wrong approach. It
> tries to give the illusion that C/R will just work, while a lot of
> things are simply not supported.
I think this is somewhat true of the implementation under consideration
here (although generally it should fail checkpoints that it can't
restart), but it needn't be true of all possible kernel-based
implementations.
> In this case whitelisting the allowed
> state by requiring special APIs for all I/O (or even just standard
> APIs as long as they are supported by the C/R lib you're linked against)
> is the more pragmatic, and I think faithful approach.
I don't think users will go for it. They'll continue to use dodgy
out-of-tree kernel modules and/or LD_PRELOAD hacks instead of porting
their applications to a new library. I think a C/R library is an
"ideal" solution, but it's one that nobody would use - especially in
HPC, unless the library somehow provides better performance.

The namespace/isolation features of Linux (CLONE_NEWPID et al) already
provide a pretty workable basis for creating tractably checkpoint-
and-restartable jobs, with a minimum of performance overhead and
application modification.
> In addition to
> the amount of state not supported despite looking transparent the
> other big problem with the patchset is that it saves the kernel internal
> state which changes all the time from one release to another.
Most of the objects that the patchset saves and restores are right at
the "border" of the user/kernel interface, and they're not apt to change
much quickly (e.g. vma start and end, task sigaltstack info). The
patchset certainly isn't serializing deep internal state such as wait
queues, locks, or reference counts.
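
To illustrate how close to the user/kernel border this kind of state
sits, here is a rough sketch (not code from the patchset) that reads the
start and end of each vma for a process from /proc/<pid>/maps. The names
Vma, parse_maps_line and read_vmas are made up for the example.

```python
import re
from typing import List, NamedTuple, Optional


class Vma(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


# One line of /proc/<pid>/maps:
#   address-range perms offset dev inode [pathname]
_MAPS_RE = re.compile(
    r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S{4})\s+\S+\s+\S+\s+\S+\s*(.*)$"
)


def parse_maps_line(line: str) -> Optional[Vma]:
    """Parse one maps line; return None if it does not look like one."""
    m = _MAPS_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    return Vma(int(m.group(1), 16), int(m.group(2), 16), m.group(3), m.group(4))


def read_vmas(pid: str = "self") -> List[Vma]:
    """Return the vmas of a process as exposed by procfs (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return [v for v in map(parse_maps_line, f) if v is not None]


if __name__ == "__main__":
    for vma in read_vmas():
        print(f"{vma.start:#x}-{vma.end:#x} {vma.perms} {vma.path}")
```

The point is only that vma boundaries are already part of what the
kernel shows userspace, which is why I don't expect them to churn.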
> The handwaving is that a userspace tool will solve it. I'm pretty sure
> that's not the case; it might solve a few cases but the general
> version n to version m conversion is impossible to maintain.
With this I agree, though. But if a change in kernel implementation
details forces an incompatible change in the checkpoint image format, is
that really a big deal? Would it be so bad to say that a checkpoint
image may be restarted only on the same kernel version that created it?
With -stable or enterprise kernels I suspect the issue is unlikely to
come up.
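
A minimal sketch of what such a same-kernel-only policy could look like,
assuming the image starts with a small header that records the kernel
release at checkpoint time. The header layout, the magic value and the
function names here are hypothetical and are not the patchset's image
format.

```python
import io
import platform
import struct

MAGIC = b"CKPT"  # hypothetical magic, not the patchset's image format


def write_header(out, release=None):
    """Record the kernel release the checkpoint was taken on."""
    rel = (release or platform.release()).encode("ascii")
    out.write(MAGIC + struct.pack("<H", len(rel)) + rel)


def check_header(inp, running=None):
    """Return the recorded release; raise if it differs from the running kernel."""
    fixed = inp.read(len(MAGIC) + 2)
    if len(fixed) != len(MAGIC) + 2 or fixed[:len(MAGIC)] != MAGIC:
        raise ValueError("not a checkpoint image")
    (length,) = struct.unpack("<H", fixed[len(MAGIC):])
    raw = inp.read(length)
    if len(raw) != length:
        raise ValueError("truncated checkpoint header")
    recorded = raw.decode("ascii")
    current = running or platform.release()
    if recorded != current:
        raise RuntimeError(
            f"image from kernel {recorded}, running {current}: refusing restart"
        )
    return recorded


if __name__ == "__main__":
    buf = io.BytesIO()
    write_header(buf)
    buf.seek(0)
    print("image is restartable on", check_header(buf))
```

Failing the restart up front like this is cheap, and it avoids any need
for a version n to version m conversion tool.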