All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Cc: Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
Subject: user-cr: Extra unshare() calls ?
Date: Mon, 8 Mar 2010 12:13:39 -0800	[thread overview]
Message-ID: <20100308201338.GA16446@us.ibm.com> (raw)


Came across this while testing LXC.


1. Does ckpt_remount_proc() need to unshare() ? Or can we have the
   clone() that calls __ckpt_coordinator() clone with CLONE_NEWNS|CLONE_FS
   instead ?

   The problem with the unshare() in ckpt_remount_proc() is that it
   creates an extra level in cgroup hierarchy (see below) after restart.
   So applications expecting the cgroup hierarchy before chckpoint will
   be surprised.

2. When --mount-pty (or --mntns) is specified, do we need to unshare() 
   in the parent process ? Considering only the full-container restart
   for now (ignore self-restart and subtree restart), can we just
   specify (CLONE_NEWNS|CLONE_FS) at the time of creating the first
   restarted process ?

Here is an example (using LXC) that shows the problems I am running into
Attached is a quick hack to point out the unshare() calls I am referring
to.
   
If I create a simple container with LXC

	$ lxc-execute --name foo --rcfile lxc-macvlan.conf -- /bin/sleep 1000

It creates the following three processes:

	PID   PPID  CMD

        3350  3239  lxc-execute --name foo -- /bin/sleep 1000
        3353  3350  /usr/local/libexec/lxc-init -- /bin/sleep 1000
        3357  3353  /bin/sleep 1000

A new cgroup is created named 'foo' (which is basically a user-space
rename of the pid of the lxc-init). This cgroup is in the root cgroup
directory and has two tasks (lxc-init, sleep)

        $ cat /cgroup/foo/tasks
        3353
        3357

When I checkpoint and restart this container (using the equivalent of
--pidns --pids --mount-pty options to /bin/restart). I get three
processes:

        3434  3375  ./lxc_restart --name bar --statefile=/root/foo.ckpt
        3436  3434  /usr/local/libexec/lxc-init -- /bin/sleep 1000
        3437  3436  /bin/sleep 1000

But the directory in /cgroup referring to lxc-init is 3 levels deep:

	ls /cgroup/3434/3436/1
	cgroup.procs  freezer.state  notify_on_release  tasks

Here is the complete hierarchy created after the restart:

        $ ls -R /cgroup/3434
        /cgroup/3434:
        3436  cgroup.procs  freezer.state  notify_on_release  tasks

        /cgroup/3434/3436:
        1  cgroup.procs  freezer.state  notify_on_release  tasks

        /cgroup/3434/3436/1:
        cgroup.procs  freezer.state  notify_on_release  tasks

        $ cat /cgroup/3434/tasks
        3434

        $ cat /cgroup/3434/3436/tasks   # empty

        $ cat /cgroup/3434/3436/1/tasks
        3436
        3437

I think we get the directory /cgroup/3434 due to the following unshare()

		/* private mounts namespace ? */
		if (args->mntns && unshare(CLONE_NEWNS | CLONE_FS) < 0) {
			ckpt_perror("unshare");
			exit(1);
		}

And we get the "3436/1" directory due to the unshare() in ckpt_remount_proc().

Following hack seems to fix both the levels and the lxc_restart command
correctly creates just the "/cgroup/3436" (which LXC renames to "/cgroup/bar"
cgroup).

---
From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date: Mon, 8 Mar 2010 12:03:46 -0800
Subject: [PATCH 1/1] Minimize unshare() calls

---
 restart.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/restart.c b/restart.c
index c82de21..6ac51e3 100644
--- a/restart.c
+++ b/restart.c
@@ -459,10 +459,12 @@ int app_restart(struct app_restart_args *args)
 		exit(1);
 
 	/* private mounts namespace ? */
+#if 0
 	if (args->mntns && unshare(CLONE_NEWNS | CLONE_FS) < 0) {
 		ckpt_perror("unshare");
 		exit(1);
 	}
+#endif
 
 	/* chroot ? */
 	if (args->root && chroot(args->root) < 0) {
@@ -717,10 +719,12 @@ static int ckpt_probe_child(pid_t pid, char *str)
  */
 static int ckpt_remount_proc(struct ckpt_ctx *ctx)
 {
+#if 0
 	if (unshare(CLONE_NEWNS | CLONE_FS) < 0) {
 		ckpt_perror("unshare");
 		return -1;
 	}
+#endif
 	/* this is unlikely, but we don't want to fail */
 	if (umount2("/proc", MNT_DETACH) < 0) {
 		if (ckpt_cond_fail(ctx, CKPT_COND_MNTPROC)) {
@@ -778,6 +782,7 @@ static int ckpt_coordinator_pidns(struct ckpt_ctx *ctx)
 	int copy, ret;
 	genstack stk;
 	void *sp;
+	unsigned long flags = SIGCHLD;
 
 	ckpt_dbg("forking coordinator in new pidns\n");
 
@@ -802,7 +807,9 @@ static int ckpt_coordinator_pidns(struct ckpt_ctx *ctx)
 	copy = ctx->args->copy_status;
 	ctx->args->copy_status = 1;
 
-	coord_pid = clone(__ckpt_coordinator, sp, CLONE_NEWPID|SIGCHLD, ctx);
+	flags |= CLONE_NEWPID|CLONE_NEWNS|CLONE_FS;
+
+	coord_pid = clone(__ckpt_coordinator, sp, flags, ctx);
 	genstack_release(stk);
 	if (coord_pid < 0) {
 		ckpt_perror("clone coordinator");
-- 
1.6.6.1

             reply	other threads:[~2010-03-08 20:13 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-08 20:13 Sukadev Bhattiprolu [this message]
     [not found] ` <20100308201338.GA16446-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-08 20:58   ` user-cr: Extra unshare() calls ? Serge E. Hallyn
     [not found]     ` <20100308205851.GB21490-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-08 21:11       ` Sukadev Bhattiprolu
     [not found]         ` <20100308211135.GB14607-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-12  1:58           ` Sukadev Bhattiprolu
     [not found]             ` <20100312015823.GB6444-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-12 14:44               ` Serge E. Hallyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100308201338.GA16446@us.ibm.com \
    --to=sukadev-23vcf4htsmix0ybbhkvfkdbpr1lh4cv8@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.