From: "Serge E. Hallyn" <serue@us.ibm.com>
To: Matt Helsley <matthltc@us.ibm.com>
Cc: Serge Hallyn <serue@linux.vnet.ibm.com>,
	Containers <containers@lists.osdl.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Oren Laadan <orenl@cs.columbia.edu>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Christoph Hellwig <hch@infradead.org>,
	Alexey Dobriyan <adobriyan@gmail.com>
Subject: Re: Ensuring c/r maintainability (WAS Re: [RFC][PATCH 00/11] track files for checkpointability)
Date: Fri, 13 Mar 2009 12:53:01 -0500
Message-ID: <20090313175301.GA13050@us.ibm.com>
In-Reply-To: <20090313063611.GH7561@us.ibm.com>

Quoting Matt Helsley (matthltc@us.ibm.com):
> On Thu, Mar 12, 2009 at 10:30:48AM -0500, Serge E. Hallyn wrote:
> > Quoting Cedric Le Goater (legoater@free.fr):
> > > >> And if Ingo's requirement is fulfilled, would any C/R patchset be acceptable ?
> > > > 
> > > > Yup, no matter how hideous  :)  Ok not really.
> > > > 
> > > > But the point was that it wasn't Dave not understanding Alexey's
> > > > suggestion, but Greg not understanding Ingo's.  If you think Ingo's
> > > > goal isn't worthwhile or achievable, then argue that (as I am), don't
> > > > keep elaborating on something we all agree will be needed (Alexey's
> > > > suggestion or some other way of doing a true may-be-checkpointed test).
> > > 
> > > I rather spend my time on enabling things rather than forbid them. 
> > 
> > That sure sounds productive.  How could I argue with that.
> > 
> > But wait, haven't several teams been doing that for years?  So why is
> > c/r not in the upstream kernel?  Could it be that ignoring the
> > upstream maintainers' concerns about (a) treating the feature as a
> > toy, (b) long-term maintainability, and (c) c/r becoming an impediment
> > to future features, and instead hacking away at our toy feature, is
> > *not* always the best course?
> 
> I've been thinking about how we could make checkpoint/restart (c/r) more
> maintainable in the long-term. I've only come up with two ideas:
> 
> I. Implement sparse-like __cr struct annotations for some compile-time checking.
> 
> First we annotate structures which c/r needs to save. For example we might have:
> 
> 	struct mm_struct {
> 		__cr struct vm_area_struct * mmap;
> 		struct rb_root mm_rb;
> 		struct vm_area_struct *mmap_cache;
> 		...
> 		__cr unsigned long mmap_base;
> 		__cr unsigned long task_size;
> 		..
> 	};
> 
> The __cr annotations indicate fields of the mm_struct which must be
> saved during checkpoint restart. In fact, for non-pointer fields these
> annotations would be sufficient to generate c/r code.
> 
> Next we would need a __cr_root annotation. These mark structures which
> the c/r code visits that determine the scope of c/r. If there is no path from a
> __cr annotation to a __cr_root annotation then we would conclude that c/r of
> this struct is broken. These path constraint checks could be done at compile
> time.

Hi Matt,

Is what you're detecting here really something we're worried about?

Maybe that's something we should be doing - coming up with a list of
the things we are trying to detect or prevent.  I can only think of
a few offhand:

1. checkpoint (and restart) a task which has used a resource which we
cannot (yet, or ever) safely checkpoint/restart.

2. kernel has a new feature for which we have not considered
checkpoint/restart.  Not only is it not safe to c/r a task using it,
but we haven't even implemented a check for tasks using it.

3. Some new kernel feature has an attribute which simply must be
stored away.  An example would be the vdso_base in s390 as of
recent 2.6.29 rc's, which was not present in 2.6.28.  So there are
two things to worry about in this one:

	a. detect that this happened and handle it, so c/r continues
	   to work.
	b. figure out a way to restart an older c/r image on a newer
	   kernel - or simply detect older images and call them
	   incompatible.
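
To make the second half of 3b concrete, here is a minimal userspace
sketch of "detect older images and call them incompatible".  Everything
in it is hypothetical and only for illustration: struct cr_image_hdr,
CR_IMAGE_MAGIC, CR_IMAGE_VERSION and cr_check_hdr() are names made up
here, not taken from any posted patch set.  It assumes the image starts
with a fixed header and that the version is bumped whenever the saved
state changes (as it would have to be for something like vdso_base).

```c
#include <stdint.h>

/* Hypothetical constants; the values are arbitrary for this sketch. */
#define CR_IMAGE_MAGIC   0x43525631u
#define CR_IMAGE_VERSION 2u	/* bump when the saved state changes */

/* Hypothetical fixed header at the start of a checkpoint image. */
struct cr_image_hdr {
	uint32_t magic;
	uint32_t version;
};

enum cr_hdr_status {
	CR_HDR_OK = 0,
	CR_HDR_BAD_MAGIC,	/* not a checkpoint image at all */
	CR_HDR_TOO_OLD,		/* written before a field was added */
	CR_HDR_TOO_NEW,		/* written by a kernel newer than this one */
};

/*
 * Refuse any image whose version differs from ours, rather than
 * guessing at values for fields the image does not contain.
 */
static enum cr_hdr_status cr_check_hdr(const struct cr_image_hdr *hdr)
{
	if (hdr->magic != CR_IMAGE_MAGIC)
		return CR_HDR_BAD_MAGIC;
	if (hdr->version < CR_IMAGE_VERSION)
		return CR_HDR_TOO_OLD;
	if (hdr->version > CR_IMAGE_VERSION)
		return CR_HDR_TOO_NEW;
	return CR_HDR_OK;
}
```

This only covers the "call them incompatible" option.  Actually
restarting an older image on a newer kernel would need per-version
conversion code, which this sketch does not attempt.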

-serge
