public inbox for linux-kernel@vger.kernel.org
From: "Serge E. Hallyn" <serue@us.ibm.com>
To: Matt Helsley <matthltc@us.ibm.com>
Cc: Serge Hallyn <serue@linux.vnet.ibm.com>,
	Containers <containers@lists.osdl.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Oren Laadan <orenl@cs.columbia.edu>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	Ingo Molnar <mingo@elte.hu>,
	Christoph Hellwig <hch@infradead.org>,
	Alexey Dobriyan <adobriyan@gmail.com>
Subject: Re: Ensuring c/r maintainability (WAS Re: [RFC][PATCH 00/11] track files for checkpointability)
Date: Fri, 13 Mar 2009 12:53:01 -0500
Message-ID: <20090313175301.GA13050@us.ibm.com>
In-Reply-To: <20090313063611.GH7561@us.ibm.com>

Quoting Matt Helsley (matthltc@us.ibm.com):
> On Thu, Mar 12, 2009 at 10:30:48AM -0500, Serge E. Hallyn wrote:
> > Quoting Cedric Le Goater (legoater@free.fr):
> > > >> And if Ingo's requirement is fulfilled, would any C/R patchset be acceptable ?
> > > > 
> > > > Yup, no matter how hideous  :)  Ok not really.
> > > > 
> > > > But the point was that it wasn't Dave not understanding Alexey's
> > > > suggestion, but Greg not understanding Ingo's.  If you think Ingo's
> > > > goal isn't worthwhile or achievable, then argue that (as I am), don't
> > > > keep elaborating on something we all agree will be needed (Alexey's
> > > > suggestion or some other way of doing a true may-be-checkpointed test).
> > > 
> > > I rather spend my time on enabling things rather than forbid them. 
> > 
> > That sure sounds productive.  How could I argue with that.
> > 
> > But wait, haven't several teams been doing that for years?  So why is
> > c/r not in the upstream kernel?  Could it be that ignoring the
> > upstream maintainers' concerns about (a) treating the feature as a
> > toy, (b) long-term maintainability, and (c) c/r becoming an impediment
> > to future features, and instead hacking away at our toy feature, is
> > *not* always the best course?
> 
> I've been thinking about how we could make checkpoint/restart (c/r) more
> maintainable in the long-term. I've only come up with two ideas:
> 
> I. Implement sparse-like __cr struct annotations for some compile-time checking.
> 
> First we annotate structures which c/r needs to save. For example we might have:
> 
> 	struct mm_struct {
> 		__cr struct vm_area_struct * mmap;
> 		struct rb_root mm_rb;
> 		struct vm_area_struct *mmap_cache;
> 		...
> 		__cr unsigned long mmap_base;
> 		__cr unsigned long task_size;
> 		..
> 	};
> 
> The __cr annotations indicate fields of the mm_struct which must be
> saved during checkpoint restart. In fact, for non-pointer fields these
> annotations would be sufficient to generate c/r code.
> 
> Next we would need a __cr_root annotation. These mark structures which
> the c/r code visits that determine the scope of c/r. If there is no path from a
> __cr annotation to a __cr_root annotation then we would conclude that c/r of
> this struct is broken. These path constraint checks could be done at compile
> time.

Hi Matt,

Is what you're detecting here really something we're worried about?

Maybe that's something we should be doing - coming up with a list of
the things we are trying to detect or prevent.  I can only think of
a few offhand:

1. checkpoint (and restart) a task which has used a resource which we
cannot (yet, or ever) safely checkpoint/restart.

2. kernel has a new feature for which we have not considered
checkpoint/restart.  Not only is it not safe to c/r a task using it,
but we haven't even implemented a check for tasks using it.

3. Some new kernel feature has an attribute which simply must be
stored away.  An example would be the vdso_base in s390 as of
recent 2.6.29 rc's, which was not present in 2.6.28.  So there are
two things to worry about in this one:

	a. detect that this happened and handle it, so c/r continues
	   to work.
	b. figure out a way to restart an older c/r image on a newer
	   kernel - or simply detect older images and call them
	   incompatible.
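
As a rough sketch of the "detect older images and call them
incompatible" half of 3b: the restart side could compare a version
field in the image header against the version the running kernel
writes, and refuse anything else. Every name below (cr_image_hdr,
CR_IMAGE_MAGIC, CR_IMAGE_VERSION, cr_check_image) is hypothetical and
not taken from any posted patchset; it only assumes the image starts
with a small fixed header.

```c
#include <stdint.h>

/* Hypothetical values, for illustration only. */
#define CR_IMAGE_MAGIC   0x43522121u
/* Assumed to be bumped whenever saved state changes, e.g. when a
 * field such as vdso_base has to be added to the image. */
#define CR_IMAGE_VERSION 2u

/* Hypothetical fixed header at the start of a checkpoint image. */
struct cr_image_hdr {
	uint32_t magic;
	uint32_t version;
};

enum cr_compat {
	CR_COMPAT_OK,
	CR_COMPAT_BAD_MAGIC,
	CR_COMPAT_TOO_OLD,
	CR_COMPAT_TOO_NEW,
};

/* Accept only images written with exactly the current version. */
static enum cr_compat cr_check_image(const struct cr_image_hdr *hdr)
{
	if (hdr->magic != CR_IMAGE_MAGIC)
		return CR_COMPAT_BAD_MAGIC;
	if (hdr->version < CR_IMAGE_VERSION)
		return CR_COMPAT_TOO_OLD;
	if (hdr->version > CR_IMAGE_VERSION)
		return CR_COMPAT_TOO_NEW;
	return CR_COMPAT_OK;
}
```

This is the strict variant: it rejects older images outright. Actually
restarting an older image on a newer kernel would need a conversion
step for each version bump, which this sketch does not attempt.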

-serge

