public inbox for linux-kernel@vger.kernel.org
From: Greg Kurz <gkurz@fr.ibm.com>
To: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Oren Laadan <orenl@cs.columbia.edu>,
	Linux-Kernel <linux-kernel@vger.kernel.org>,
	Dave Hansen <dave@linux.vnet.ibm.com>,
	containers@lists.osdl.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>
Subject: Re: C/R without "leaks" (was: Re: Creating tasks on restart: userspace vs kernel)
Date: Thu, 16 Apr 2009 00:42:17 +0200
Message-ID: <1239835337.6610.6.camel@bahia>
In-Reply-To: <20090415195629.GD26994@x200.localdomain>

On Wed, 2009-04-15 at 23:56 +0400, Alexey Dobriyan wrote:
> > Again, so to checkpoint one task in the topmost pid-ns you need to
> > checkpoint (if at all possible) the entire system ?!
> 
> One more argument to not allow "leaks" and to checkpoint the whole
> container, no ifs, buts and woulditbenices.
> 
> Just to clarify, C/R with a "leak" is, for example, when a process has a
> separate pidns but shares its netns with another process not involved in
> the checkpoint.
> 
> If you allow this, you lose one important property of the checkpoint part,
> namely, that almost everything is frozen. Losing this property means that
> suddenly much more stuff is alive during the dump and you have to account
> for more stuff when checkpointing. You are effectively checkpointing live
> data structures and there is no guarantee you'll get it right.
> 
> Example 1: utsns is shared with the rest of the world.
> 
> utsns content is modifiable only by tasks (current->nsproxy->uts_ns).
> Consequently, someone can modify utsns content while you're dumping it
> if you allow "leaks".
> 
> Did you take precautions? Where?
> 
> 	static int cr_write_utsns(struct cr_ctx *ctx, struct uts_namespace *uts_ns)
> 	{
> 	        struct cr_hdr h;
> 	        struct cr_hdr_utsns *hh;
> 	        int domainname_len;
> 	        int nodename_len;
> 	        int ret;
> 
> 	        h.type = CR_HDR_UTSNS;
> 	        h.len = sizeof(*hh);
> 
> 	        hh = cr_hbuf_get(ctx, sizeof(*hh));
> 	        if (!hh)
> 	                return -ENOMEM;
> 
> 	        nodename_len = strlen(uts_ns->name.nodename) + 1;
> 	        domainname_len = strlen(uts_ns->name.domainname) + 1;
> 
> 	        hh->nodename_len = nodename_len;
> 	        hh->domainname_len = domainname_len;
> 
> 	        ret = cr_write_obj(ctx, &h, hh);
> 	        cr_hbuf_put(ctx, sizeof(*hh));
> 	        if (ret < 0)
> 	                return ret;
> 
> 	        ret = cr_write_string(ctx, uts_ns->name.nodename, nodename_len);
> 	        if (ret < 0)
> 	                return ret;
> 
> 	        ret = cr_write_string(ctx, uts_ns->name.domainname, domainname_len);
> 	        return ret;
> 	}
> 
> You should take uts_sem.
> 
> 
> Example 2: ipcns is shared with the rest of the world
> 
> Consequently, shm segment is visible outside and live. Someone already
> shmatted to it. What will end up in shm segment content? Anything.
> 
> You should check struct file refcount or something and disable attaching
> while dumping or something.
> 
> 
> Moral: Every time you dump something live you get complications.
> Every single time.
> 
> 
> There are sockets and live netns as the most complex example. I'm not
> prepared to describe it exactly, but people wishing to do C/R with
> "leaks" should be very careful with their wishes.

They should close their sockets before the checkpoint and have some way
to reconnect afterwards. This implies some kind of C/R awareness in the
application being checkpointed.
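
As a minimal sketch of what that awareness could look like on the
application side (purely illustrative: app_conn, app_conn_quiesce() and
app_conn_resume() are hypothetical names, a stream socket is assumed, and
how the application learns that a checkpoint is about to happen is left
out entirely):

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical per-connection state kept by a C/R-aware application. */
struct app_conn {
	int fd;				/* -1 while disconnected */
	struct sockaddr_storage peer;	/* remembered across the checkpoint */
	socklen_t peer_len;
};

/*
 * Call before the checkpoint: remember who we were talking to, then
 * close the socket so nothing network-related is live during the dump.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int app_conn_quiesce(struct app_conn *c)
{
	assert(c);
	if (c->fd < 0)
		return 0;		/* already disconnected */

	c->peer_len = sizeof(c->peer);
	if (getpeername(c->fd, (struct sockaddr *)&c->peer, &c->peer_len) < 0)
		return -1;

	close(c->fd);
	c->fd = -1;
	return 0;
}

/*
 * Call after restart: reconnect to the remembered peer.
 * Assumption: the peer is still reachable at the same address.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int app_conn_resume(struct app_conn *c)
{
	int fd;

	assert(c);
	if (c->fd >= 0)
		return 0;		/* already connected */

	fd = socket(c->peer.ss_family, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	if (connect(fd, (struct sockaddr *)&c->peer, c->peer_len) < 0) {
		close(fd);
		return -1;
	}
	c->fd = fd;
	return 0;
}
```

Any protocol-level state (logins, in-flight requests) would still have
to be rebuilt by the application after app_conn_resume() succeeds.
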

-- 
Gregory Kurz                                     gkurz@fr.ibm.com
Software Engineer @ IBM/Meiosys                  http://www.ibm.com
Tel +33 (0)534 638 479                           Fax +33 (0)561 400 420

"Anarchy is about taking complete responsibility for yourself."
        Alan Moore.

