All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6][lxc][v3] Link LXC with USERCR
@ 2010-03-31  7:04 Sukadev Bhattiprolu
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-31  7:04 UTC (permalink / raw)
  To: dlezcano-NmTC/0ZBporQT0dZR+AlfA
  Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA,
	sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8


lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
Following set of patches, when applied to LXC and built with USERCR as
described below, enable enable lxc-checkpoint and lxc-restart of some
simple containers

TODO:
	- Determine if lxc_checkpoint needs a --container option (see
	  TODOs in src/lxc/checkpoint.c)
	- This patchset was tested using lxc-nonetns.conf. I ran into a problem
	  creating a bridge with lxc-veth.conf and lxc-macvlan.conf.  I have
	  not debugged the problem with VNC and lxc-macvlan.conf.
	- 'global_send_sigint' is still a global variable in USERCR. We need
	  to define a better interface to expose its functionality to callers
	  of app_restart().
	- Choose better names and API for USERCR :-)
	- Additional TODOs specific to checkpoint/restart are listed in their
	  specific patches.

Changelog[v3]:
	Changed following based on feedback from Michel Normand, Daniel 
	Lezcano and others.

	- Added --with-libcr configuration option to specify the path to
	  usercr (see usage below).
	- lxc-checkpoint now implicitly freezes before and unfreezes after
	  checkpoint.
	- Implemented the --pause options to lxc-checkpoint and lxc-restart
	  and the --kill option to lxc-checkpoint.
	- Ported to ckpt-v20-dev (required adding CHECKPOINT_NONETNS flag to
	  app_checkpoint() to work around)

Changelog[v2]: 

	(Based on feedback from Oren Laadan, Serge Hallyn, Daniel Lezcano
	and Cedric Le Goater)

	- Rather than drop --directory option to lxc_checkpoint/lxc_restart
	  add a new option (--image).
	- Integrate lxc_checkpoint to work with USERCR
	- USERCR renamed usercr.h to "app-checkpoint.h"
	- USERCR does not create/install libcheckpoint.a and usercr.h for now.
	  So link directly with app-checkpoint.h, restart.o and checkpoint.o
	- USERCR renames the interfaces to app_checkpoint() and app_restart()
	  'struct app_checkpoint_args' 'struct app_restart_args'.
	  
USAGE:

1. Build USERCR

	$ cd /root

	$ git-clone git://git.ncl.cs.columbia.edu/pub/git/user-cr.git user-cr

	$ cd user-cr

	$ git-checkout ckpt-v20-dev

	  	Tested with commit e275f77e4a82d228c1df14dbeb691342e32cdac2
		as HEAD.
	
	# Apply following two patches:

	https://lists.linux-foundation.org/pipermail/containers/2010-March/024037.html
	https://lists.linux-foundation.org/pipermail/containers/2010-March/024038.html

	$ cd /root/user-cr

	$ KERNELSRC=/root/linux-2.6/ make 

		Build USERCR by pointing to corresponding kernel-source.
		This should create restart.o and checkpoint.o needed by LXC.

2. Build/install LXC

	$ cd /root/lxc.git

	Apply attached patches to LXC (I tested with these patches applied
	to commit 9ea8066aa67b808f71f46e346bd7a215e2a355f3)

	$ autogen.sh

	$ ./configure --with-libcr=/root/user-cr

		This will fail if /root/user-cr does not container checkpoint.o,
		restart.o and app-checkpoint.h files 
	$ make

	$ make install

3. Checkpoint/restart a simple LXC container

	$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000

	$ lxc-checkpoint --name foo --image /root/lxc-foo.ckpt

	$ lxc-stop --name foo

	$ lxc-restart --name foo --image /root/lxc-foo.ckpt

4. Checkpoint/restart other LXC containers such as:

	- a file-io session (see run-fileio1 in cr-tests[1])

	- process-tree (see run-ptree1 in cr-tests[1])

	- A vi session inside a VNC Server using "twm". i.e

		$ cat /root/.vnc/xstartup
		#!/bin/sh

		xsetroot -solid grey
		vncconfig -iconic &
		xterm -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
		twm &

		$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- \
			/usr/bin/vncserver :1

		$ vncviewer :1

			# Open a vi session

		$ lxc-checkpoint --name foo  --statefile /root/vnc.ckpt

		$ lxc-stop --name foo

		$ lxc-restart --pause --name foo --statefile /root/vnc.ckpt

			# Leaves the server frozen due to --pause


		$ lxc-unfreeze --name foo

		$ vncviewer :1

			# Should bring up the old VNC session with vi window

[1]: cr-tests:	git://git.sr71.net/~hallyn/cr_tests.git

Signed-off-by: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/6][lxc][v3] Add --with-libcr configure option
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-03-31  7:06   ` Sukadev Bhattiprolu
       [not found]     ` <20100331070633.GA23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-03-31  7:07   ` [PATCH 2/6][lxc][v3] lxc_restart: Add --statefile option Sukadev Bhattiprolu
                     ` (9 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-31  7:06 UTC (permalink / raw)
  To: dlezcano-NmTC/0ZBporQT0dZR+AlfA; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA


From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date: Wed, 24 Mar 2010 17:26:44 -0700
Subject: [PATCH 1/6][lxc][v3] Add --with-libcr configure option

Add a configure option, --with-libcr=dir which would allow linking
with external (i.e USERCR) implementation  of checkpoint/restart.

For now, USERCR "publishes" a app-checkpoint.h, checkpoint.o and
restart.o files which implement the functions app_checkpoint() and
app_restart().

Usage:
	$ ./autogen.sh

	$ ./configure --help |grep libcr
	--with-libcr=dir     use the Checkpoint/Restart implementation in 'dir'

	$ ls /home/guest/user-cr/
	app-checkpoint.h    checkpoint.o    restart.o

	$ ./configure --with-libcr=/home/guest/user-cr

TODO:
	If names of interfaces in USERCR change, we may want to rename
	the config option too ?

	LIBCR_CFLAGS are only needed for src/lxc/{checkpoint.c,restart.c}
	but not sure if there is an easy way to define autoconf CFLAGS
	just for those two files.

Changelog[v2]:
	- Rename --with-usercr to --with-libcr
	- Add libeclone.a to the LIBCR_OBJS variable since functions in
	  libeclone.a will be used by checkpoint() and restart() functions.
	- Add -I${with_libcr}/include to LIBCR_CFLAGS to pick up
	  checkpoint_hdr.h, checkpoint.h etc.

Signed-off-by: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
 configure.ac        |   19 +++++++++++++++++++
 src/lxc/Makefile.am |   10 +++++++++-
 2 files changed, 28 insertions(+), 1 deletions(-)

diff --git a/configure.ac b/configure.ac
index f82e7df..fe6584c 100644
--- a/configure.ac
+++ b/configure.ac
@@ -12,6 +12,25 @@ AM_PROG_CC_C_O
 AC_GNU_SOURCE
 AC_CHECK_PROG(SETCAP, setcap, yes, no, $PATH$PATH_SEPARATOR/sbin)
 
+AC_ARG_WITH(libcr, [AS_HELP_STRING([--with-libcr=dir], \
+           [use the Checkpoint/Restart implementation in 'dir'])], [], \
+	   [with_libcr=no])
+
+if test "x$with_libcr" != "xno"; then
+       AS_AC_EXPAND(LIBCR_OBJS, "${with_libcr}/checkpoint.o ${with_libcr}/restart.o ${with_libcr}/libeclone.a")
+       AS_AC_EXPAND(LIBCR_CFLAGS, "-DLIBCR -I${with_libcr} -I$(with_libcr}/include")
+
+       AC_CHECK_FILE([$with_libcr/app-checkpoint.h], [], \
+               AC_MSG_ERROR([--with-libcr specified directory $with_libcr but $with_libcr/app-checkpoint.h was not found]))
+
+       AC_CHECK_FILE([${with_libcr}/checkpoint.o], [], \
+               AC_MSG_ERROR([--with-libcr specified directory $with_libcr but ${with_libcr}/checkpoint.o was not found]))
+
+       AC_CHECK_FILE([${with_libcr}/restart.o], [], \
+               AC_MSG_ERROR([--with-libcr specified directory $with_libcr but ${with_libcr}/restart.o was not found]))
+fi
+
+
 AC_ARG_ENABLE([doc],
 	[AC_HELP_STRING([--enable-doc], [make mans (require docbook2man installed) [default=auto]])],
 	[], [enable_doc=auto])
diff --git a/src/lxc/Makefile.am b/src/lxc/Makefile.am
index 890f706..699c355 100644
--- a/src/lxc/Makefile.am
+++ b/src/lxc/Makefile.am
@@ -46,12 +46,20 @@ liblxc_so_SOURCES = \
 	mainloop.c mainloop.h \
 	af_unix.c af_unix.h
 
-AM_CFLAGS=-I$(top_srcdir)/src
+# We only need $(LIBCR_CFLAGS) for lxc_checkpoint and lxc_restart files
+# but for now, just set it for all.
+AM_CFLAGS=-I$(top_srcdir)/src $(LIBCR_CFLAGS)
 
 liblxc_so_CFLAGS = -fPIC -DPIC $(AM_CFLAGS)
 
+# TODO: Adding $(LIBCR_OBJS) here ensures we don't have undefined references
+# 	when building liblxc.so, but this has the side-effect of putting the
+# 	app_checkpoint/restart functions in liblxc.so. Or alternatively,
+# 	we could remove src/lxc/{checkpoint.o,restart.o} from liblxc.so
+# 	and link lxc-checkpoint/lxc-restart with them directly.
 liblxc_so_LDFLAGS = \
 	-shared \
+	$(LIBCR_OBJS) \
 	-Wl,-soname,liblxc.so.$(firstword $(subst ., ,$(VERSION)))
 
 liblxc_so_LDADD = -lutil
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/6][lxc][v3] lxc_restart: Add --statefile option
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-03-31  7:06   ` [PATCH 1/6][lxc][v3] Add --with-libcr configure option Sukadev Bhattiprolu
@ 2010-03-31  7:07   ` Sukadev Bhattiprolu
       [not found]     ` <20100331070711.GB23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-03-31  7:07   ` [PATCH 3/6][lxc][v3] lxc_checkpoint: " Sukadev Bhattiprolu
                     ` (8 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-31  7:07 UTC (permalink / raw)
  To: dlezcano-NmTC/0ZBporQT0dZR+AlfA; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA


From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date: Sat, 27 Mar 2010 00:08:17 -0700
Subject: [PATCH 2/6][lxc][v3] lxc_restart: Add --statefile option

The existing --directory option to lxc_restart expects the checkpoint state
to be a directory. USERCR however uses a single regular file to store the
checkpoint image. So add a --statefile option to enable checkpointing and
restarting applications using USERCR.

Depending on how the application was checkpointed, users should specify
either --statefile=STATEFILE or the --directory=STATEFILE option (but not
both).

Signed-off-by: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
 src/lxc/lxc_restart.c |   34 +++++++++++++++++++++++++---------
 1 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/src/lxc/lxc_restart.c b/src/lxc/lxc_restart.c
index 7db1d85..de4b421 100644
--- a/src/lxc/lxc_restart.c
+++ b/src/lxc/lxc_restart.c
@@ -38,13 +38,21 @@
 lxc_log_define(lxc_restart_ui, lxc_restart);
 
 static struct lxc_list defines;
+static char *statedir;
 
 static int my_checker(const struct lxc_arguments* args)
 {
-	if (!args->statefile) {
-		lxc_error(args, "no statefile specified");
-		return -1;
-	}
+	int d, f;
+	
+	/* make them boolean */
+	d = !!(statedir);
+	f = !!(args->statefile);
+
+        if (!(d ^ f)) {
+                lxc_error(args, "Must specify exactly one of --directory "
+                                "and --statefile options");
+                return -1;
+        }
 
 	return 0;
 }
@@ -52,8 +60,9 @@ static int my_checker(const struct lxc_arguments* args)
 static int my_parser(struct lxc_arguments* args, int c, char* arg)
 {
 	switch (c) {
-	case 'd': args->statefile = arg; break;
+	case 'd': statedir = arg; break;
 	case 'f': args->rcfile = arg; break;
+	case 'S': args->statefile = arg; break;
 	case 'p': args->flags = LXC_FLAG_PAUSE; break;
 	case 's': return lxc_config_define_add(&defines, arg);
 	}
@@ -66,21 +75,24 @@ static const struct option my_longopts[] = {
 	{"rcfile", required_argument, 0, 'f'},
 	{"pause", no_argument, 0, 'p'},
 	{"define", required_argument, 0, 's'},
+	{"statefile", required_argument, 0, 'S'},
 	LXC_COMMON_OPTIONS
 };
 
 static struct lxc_arguments my_args = {
 	.progname = "lxc-restart",
 	.help     = "\
---name=NAME --directory STATEFILE\n\
+--name=NAME --directory STATEFILE (deprecated)\n\
+\tlxc_restart --name=NAME --statefile=STATEFILE\n\
 \n\
 lxc-restart restarts from STATEFILE the NAME container\n\
 \n\
 Options :\n\
   -n, --name=NAME      NAME for name of the container\n\
   -p, --pause          do not release the container after the restart\n\
-  -d, --directory=STATEFILE for name of statefile\n\
+  -d, --directory=STATEFILE for name of statefile (legacy mode, deprecated)\n\
   -f, --rcfile=FILE Load configuration file FILE\n\
+  -i, --statefile=STATEFILE Load the application state from STATEFILE (libcr mode)\n\
   -s, --define KEY=VAL Assign VAL to configuration variable KEY\n",
 	.options  = my_longopts,
 	.parser   = my_parser,
@@ -90,6 +102,7 @@ Options :\n\
 int main(int argc, char *argv[])
 {
 	char *rcfile = NULL;
+	const char *statefile;
 	struct lxc_conf *conf;
 
 	lxc_list_init(&defines);
@@ -131,6 +144,9 @@ int main(int argc, char *argv[])
 	if (lxc_config_define_load(&defines, conf))
 		return -1;
 
-	return lxc_restart(my_args.name, my_args.statefile, conf,
-			   my_args.flags);
+	statefile = my_args.statefile;
+	if (statedir)
+		statefile = statedir;
+
+	return lxc_restart(my_args.name, statefile, conf, my_args.flags);
 }
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/6][lxc][v3] lxc_checkpoint: Add --statefile option
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-03-31  7:06   ` [PATCH 1/6][lxc][v3] Add --with-libcr configure option Sukadev Bhattiprolu
  2010-03-31  7:07   ` [PATCH 2/6][lxc][v3] lxc_restart: Add --statefile option Sukadev Bhattiprolu
@ 2010-03-31  7:07   ` Sukadev Bhattiprolu
  2010-03-31  7:08   ` [PATCH 4/6][lxc][v3] Move get_init_pid() into checkpoint.c Sukadev Bhattiprolu
                     ` (7 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-31  7:07 UTC (permalink / raw)
  To: dlezcano-NmTC/0ZBporQT0dZR+AlfA; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA


From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date: Wed, 10 Mar 2010 22:24:17 -0800
Subject: [PATCH 3/6][lxc][v3] lxc_checkpoint: Add --statefile option

The existing --directory option to lxc_checkpoint expects to save the
checkpoint state in a directory. USERCR however uses a single regular
file to store the checkpoint image. So add a --statefile option to enable
checkpointing and restarting applications using USERCR.

Users should specify either --statefile or --directory option (but not both)
to select the file/directory where the application state will be stored.

Signed-off-by: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
 src/lxc/lxc_checkpoint.c |   40 +++++++++++++++++++++++++++++-----------
 1 files changed, 29 insertions(+), 11 deletions(-)

diff --git a/src/lxc/lxc_checkpoint.c b/src/lxc/lxc_checkpoint.c
index a8c74a9..5e8506f 100644
--- a/src/lxc/lxc_checkpoint.c
+++ b/src/lxc/lxc_checkpoint.c
@@ -37,12 +37,21 @@
 
 lxc_log_define(lxc_checkpoint_ui, lxc_checkpoint);
 
+static char *statedir;
+
 static int my_checker(const struct lxc_arguments* args)
 {
-	if (!args->statefile) {
-		lxc_error(args, "no statefile specified");
-		return -1;
-	}
+	int d, f;
+	
+	/* make them boolean */
+	d = !!(statedir);
+	f = !!(args->statefile);
+
+        if (!(d ^ f)) {
+                lxc_error(args, "Must specify exactly one of --directory "
+                                "and --statefile options");
+                return -1;
+        }
 
 	return 0;
 }
@@ -52,7 +61,8 @@ static int my_parser(struct lxc_arguments* args, int c, char* arg)
 	switch (c) {
 	case 'k': args->flags = LXC_FLAG_HALT; break;
 	case 'p': args->flags = LXC_FLAG_PAUSE; break;
-	case 'd': args->statefile = arg; break;
+	case 'd': statedir = arg; break;
+	case 'S': args->statefile = arg; break;
 	}
 	return 0;
 }
@@ -61,13 +71,15 @@ static const struct option my_longopts[] = {
 	{"kill", no_argument, 0, 'k'},
 	{"pause", no_argument, 0, 'p'},
 	{"directory", required_argument, 0, 'd'},
+	{"statefile", required_argument, 0, 'S'},
 	LXC_COMMON_OPTIONS
 };
 
 static struct lxc_arguments my_args = {
 	.progname = "lxc-checkpoint",
 	.help     = "\
---name=NAME --directory STATEFILE\n\
+--name=NAME --directory STATEFILE (deprecated)\n\
+\tlxc_checkpoint --name=NAME --statefile=STATEFILE\n\
 \n\
 lxc-checkpoint checkpoints in STATEFILE the NAME container\n\
 \n\
@@ -75,7 +87,8 @@ Options :\n\
   -n, --name=NAME      NAME for name of the container\n\
   -k, --kill           stop the container after checkpoint\n\
   -p, --pause          don't unfreeze the container after the checkpoint\n\
-  -d, --directory=STATEFILE where to store the statefile\n",
+  -d, --directory=STATEFILE where to store the statefile (deprecated)\n\
+  -i, --statefile=STATEFILE where to store the checkpoint-image (LIBCR mode)\n",
 
 	.options  = my_longopts,
 	.parser   = my_parser,
@@ -97,6 +110,7 @@ static int create_statefile(const char *dir)
 int main(int argc, char *argv[])
 {
 	int ret;
+	const char *statefile;
 
 	ret = lxc_arguments_parse(&my_args, argc, argv);
 	if (ret)
@@ -107,11 +121,15 @@ int main(int argc, char *argv[])
 	if (ret)
 		return ret;
 
-	ret = create_statefile(my_args.statefile);
-	if (ret)
-		return ret;
+	statefile = my_args.statefile;
+	if (statedir) {
+		statefile = statedir;
+		ret = create_statefile(statefile);
+		if (ret)
+			return ret;
+	}
 
-	ret = lxc_checkpoint(my_args.name, my_args.statefile, my_args.flags);
+	ret = lxc_checkpoint(my_args.name, statefile, my_args.flags);
 	if (ret)
 		ERROR("failed to checkpoint '%s'", my_args.name);
 	else
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/6][lxc][v3] Move get_init_pid() into checkpoint.c
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
                     ` (2 preceding siblings ...)
  2010-03-31  7:07   ` [PATCH 3/6][lxc][v3] lxc_checkpoint: " Sukadev Bhattiprolu
@ 2010-03-31  7:08   ` Sukadev Bhattiprolu
       [not found]     ` <20100331070848.GD23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-03-31  7:09   ` [PATCH 5/6][lxc][v3] Hook up lxc_restart() with app_restart() Sukadev Bhattiprolu
                     ` (6 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-31  7:08 UTC (permalink / raw)
  To: dlezcano-NmTC/0ZBporQT0dZR+AlfA; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA


From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date: Mon, 29 Mar 2010 23:53:55 -0700
Subject: [PATCH 4/6][lxc][v3] Move get_init_pid() into checkpoint.c

lxc_attach.c is currently not included in liblxc.so. In  afollowon
patch, checkpoint() function needs to also use the get_init_pid()
interface. So move the defintions into checkpoint.c - which would
then be accessible to both lxc_attach and lxc-checkpoint.

Signed-off-by: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
 src/lxc/checkpoint.c |   29 +++++++++++++++++++++++++++++
 src/lxc/lxc.h        |    6 ++++++
 src/lxc/lxc_attach.c |   29 +----------------------------
 3 files changed, 36 insertions(+), 28 deletions(-)

diff --git a/src/lxc/checkpoint.c b/src/lxc/checkpoint.c
index 7e8a93e..4e75cb6 100644
--- a/src/lxc/checkpoint.c
+++ b/src/lxc/checkpoint.c
@@ -22,9 +22,38 @@
  */
 #include <lxc/lxc.h>
 #include <lxc/log.h>
+#include <lxc/commands.h>
 
 lxc_log_define(lxc_checkpoint, lxc);
 
+pid_t get_init_pid(const char *name)
+{
+	struct lxc_command command = {
+		.request = { .type = LXC_COMMAND_PID },
+	};
+
+	int ret, stopped = 0;
+
+	ret = lxc_command(name, &command, &stopped);
+	if (ret < 0 && stopped) {
+		INFO("'%s' is already stopped", name);
+		return 0;
+	}
+
+	if (ret < 0) {
+		ERROR("failed to send command");
+		return -1;
+	}
+
+	if (command.answer.ret) {
+		ERROR("failed to retrieve the init pid: %s",
+		      strerror(-command.answer.ret));
+		return -1;
+	}
+
+	return command.answer.pid;
+}
+
 int lxc_checkpoint(const char *name, const char *statefile, int flags)
 {
 	return 0;
diff --git a/src/lxc/lxc.h b/src/lxc/lxc.h
index b0b9f4e..bd87bdb 100644
--- a/src/lxc/lxc.h
+++ b/src/lxc/lxc.h
@@ -27,6 +27,7 @@
 extern "C" {
 #endif
 
+#include <unistd.h>
 #include <stddef.h>
 #include <lxc/state.h>
 
@@ -56,6 +57,11 @@ extern int lxc_start(const char *name, char *const argv[], struct lxc_conf *);
 extern int lxc_stop(const char *name);
 
 /*
+ * Get the pid of the root application process tree in parent-pid namespace
+ */
+extern pid_t get_init_pid(const char *name);
+
+/*
  * Open the monitoring mechanism for a specific container
  * The function will return an fd corresponding to the events
  * Returns a file descriptor on success, < 0 otherwise
diff --git a/src/lxc/lxc_attach.c b/src/lxc/lxc_attach.c
index a012c2c..3d2cdd5 100644
--- a/src/lxc/lxc_attach.c
+++ b/src/lxc/lxc_attach.c
@@ -28,6 +28,7 @@
 #include "commands.h"
 #include "arguments.h"
 #include "namespace.h"
+#include "lxc.h"
 #include "log.h"
 
 lxc_log_define(lxc_attach_ui, lxc);
@@ -50,34 +51,6 @@ Options :\n\
 	.checker  = NULL,
 };
 
-pid_t get_init_pid(const char *name)
-{
-	struct lxc_command command = {
-		.request = { .type = LXC_COMMAND_PID },
-	};
-
-	int ret, stopped = 0;
-
-	ret = lxc_command(name, &command, &stopped);
-	if (ret < 0 && stopped) {
-		INFO("'%s' is already stopped", name);
-		return 0;
-	}
-
-	if (ret < 0) {
-		ERROR("failed to send command");
-		return -1;
-	}
-
-	if (command.answer.ret) {
-		ERROR("failed to retrieve the init pid: %s",
-		      strerror(-command.answer.ret));
-		return -1;
-	}
-
-	return command.answer.pid;
-}
-
 int main(int argc, char *argv[], char *envp[])
 {
 	int ret;
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/6][lxc][v3] Hook up lxc_restart() with app_restart()
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
                     ` (3 preceding siblings ...)
  2010-03-31  7:08   ` [PATCH 4/6][lxc][v3] Move get_init_pid() into checkpoint.c Sukadev Bhattiprolu
@ 2010-03-31  7:09   ` Sukadev Bhattiprolu
  2010-03-31  7:10   ` [PATCH 6/6][lxc][v3] Hook up lxc_checkpoint() with app_checkpoint() Sukadev Bhattiprolu
                     ` (5 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-31  7:09 UTC (permalink / raw)
  To: dlezcano-NmTC/0ZBporQT0dZR+AlfA; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA


From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date: Thu, 11 Mar 2010 20:47:56 -0800
Subject: [PATCH 5/6][lxc][v3] Hook up lxc_restart() with app_restart()

Have lxc_restart() call app_restart() implemented in the 'restart.o' from
USER-CR git tree.

Changelog[v3]:
	- (Daniel Lezcano) Remove unnecessary check for 'statefile' before
	  opening it
	- [Daniel Lezcano] Rebase to more recent version and use
	  lxc_check_inherited() instead of lxc_close_inherited_fd().
	- Have lxc_restart unfreeze the container if the --pause option
	  was not specified.
	- Use -D LIBCR to fix compile error when --with-libcr config is
	  not specified. Return ENOSYS if built with LIBCR undefined.
	- Implement the --pause option to set the RESTART_FROZEN flag.

Changelog[v2]:
	- Link with restart.o from usercr rather than libcheckpoint.a
	- rename 'struct restart_args' to 'struct app_restart_args'
	- Initialize the new field app_restart_args->uerrfd
	- (Oren Laadan)Remove ->send_sigint field from 'struct app_restart_args'
---
 src/lxc/restart.c |  129 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 128 insertions(+), 1 deletions(-)

diff --git a/src/lxc/restart.c b/src/lxc/restart.c
index 467489e..220acbf 100644
--- a/src/lxc/restart.c
+++ b/src/lxc/restart.c
@@ -22,11 +22,138 @@
  */
 #include <lxc/lxc.h>
 #include <lxc/log.h>
+#include <lxc/start.h>
+#include <lxc/namespace.h>
+#include <lxc/cgroup.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <signal.h>
+#include <sys/prctl.h>
+#include <sys/wait.h>
+#include "error.h"
+#ifdef LIBCR
+#include <app-checkpoint.h>
+#endif
 
 lxc_log_define(lxc_restart, lxc);
 
+#ifdef LIBCR
+
+struct lxc_restart_arg {
+	const char *name;
+	const char *statefile;
+	char *const argv;
+	struct lxc_handler *handler;
+	int lxc_flags;
+};
+
+static int do_restart(struct lxc_restart_arg *lxcarg)
+{
+	int pid;
+	int lxc_flags = lxcarg->lxc_flags;
+	struct lxc_handler *handler = lxcarg->handler;
+	const char *statefile = lxcarg->statefile;
+	struct app_restart_args restart_args;
+
+        if (sigprocmask(SIG_SETMASK, &handler->oldmask, NULL)) {
+                SYSERROR("failed to set sigprocmask");
+                return -1;
+        }
+
+        if (prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0)) {
+                SYSERROR("failed to set pdeath signal");
+                return -1;
+        }
+
+	memset(&restart_args, 0, sizeof(restart_args));
+
+	restart_args.infd = open(statefile, O_RDONLY, 0);
+	if (restart_args.infd < 0) {
+		SYSERROR("Failed to open statefile %s\n", statefile);
+		return -1;
+	}
+
+	restart_args.pids = 1;
+	restart_args.pidns = 1;
+	restart_args.mnt_pty = 1;
+	restart_args.mntns = 1;
+	restart_args.klogfd = -1;
+	restart_args.ulogfd = lxc_log_fd;
+	restart_args.uerrfd = fileno(stderr);
+	restart_args.debug = 1;
+	restart_args.wait = 0;
+
+	if (lxc_flags & LXC_FLAG_PAUSE)
+		restart_args.keep_frozen = 1;
+
+        pid = app_restart(&restart_args);
+
+        return pid;
+}
+
 int lxc_restart(const char *name, const char *statefile, struct lxc_conf *conf,
-		int flags)
+		int lxc_flags)
 {
+	int err;
+	int status;
+	struct lxc_handler *handler;
+	struct lxc_restart_arg lxcarg = {
+		.name = name,
+		.statefile = statefile,
+		.lxc_flags = lxc_flags,
+		.handler = NULL,
+	};
+
+	if (lxc_check_inherited())
+		return -1;
+
+	handler = lxc_init(name, conf);
+	if (!handler) {
+		ERROR("failed to initialize the container");
+		return -1;
+	}
+
+	lxcarg.handler = handler;
+	handler->pid = do_restart(&lxcarg);
+
+	INFO("do_restart(): returns pid %d\n", handler->pid);
+	lxc_rename_nsgroup(name, handler);
+
+	err = lxc_poll(name, handler);
+	if (err) {
+		ERROR("mainloop exited with an error");
+		goto out_abort;
+	}
+
+	while (waitpid(handler->pid, &status, 0) < 0 && errno == EINTR)
+		continue;
+
+	if (!(lxc_flags & LXC_FLAG_PAUSE)) {
+		err = lxc_unfreeze(name);
+		if (err) {
+			ERROR("lxc_restart(): Unable to unfreeze\n");
+			goto out_fini;
+		}
+	}
+
+	err =  lxc_error_set_and_log(handler->pid, status);
+
+out_fini:
+	lxc_fini(name, handler);
+	return err;
+
+out_abort:
+	lxc_abort(name, handler);
+	goto out_fini;
+
 	return 0;
 }
+#else
+int lxc_restart(const char *name, const char *statefile, struct lxc_conf *conf,
+		int lxc_flags)
+{
+	ERROR("'restart' function not configured");
+	ERROR("Try --with-libcr option to 'configure' script");
+	return -1;
+}
+#endif
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/6][lxc][v3] Hook up lxc_checkpoint() with app_checkpoint()
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
                     ` (4 preceding siblings ...)
  2010-03-31  7:09   ` [PATCH 5/6][lxc][v3] Hook up lxc_restart() with app_restart() Sukadev Bhattiprolu
@ 2010-03-31  7:10   ` Sukadev Bhattiprolu
       [not found]     ` <20100331071016.GF23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-03-31  9:29   ` [PATCH 0/6][lxc][v3] Link LXC with USERCR Michel Normand
                     ` (4 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-31  7:10 UTC (permalink / raw)
  To: dlezcano-NmTC/0ZBporQT0dZR+AlfA; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA


From: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Date: Thu, 11 Mar 2010 21:32:38 -0800
Subject: [PATCH 6/6][lxc][v3] Hook up lxc_checkpoint() with app_checkpoint()

Have lxc_checkpoint() call app_checkpoint() implemented in checkpoint.o
in the USER-CR git tree

TODO:
	- Map lxc_flags to flags in sys_checkpoint()
	- Initialize app_checkpoint_args.debug and other fields based on
	  command line options to lxc_checkpoint rather than hard-coding
	  them

Changelog:[v2]:
	- Drop find_cinit_pid() and use get_init_pid() from a recent checkin
	- Implement --pause and --kill options to lxc-checkpoint
	- Use -D LIBCR to fix compile error when --with-libcr config option
	  is not specified. Return ENOSYS when LIBCR is undefined.
	- Add CHECKPOINT_NONETNS to flags arg to app_checkpoint() which
	  is needed to checkpoint an application that is not in a private
	  netns (new in ckpt-v20-dev).

Signed-off-by: Sukadev Bhattiprolu <sukadev-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
---
 src/lxc/checkpoint.c |   88 ++++++++++++++++++++++++++++++++++++++++++++++++-
 src/lxc/state.c      |    1 -
 2 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/src/lxc/checkpoint.c b/src/lxc/checkpoint.c
index 4e75cb6..6e905ff 100644
--- a/src/lxc/checkpoint.c
+++ b/src/lxc/checkpoint.c
@@ -22,7 +22,17 @@
  */
 #include <lxc/lxc.h>
 #include <lxc/log.h>
-#include <lxc/commands.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <errno.h>
+#include <linux/checkpoint.h>
+
+#include "commands.h"
+#include "arguments.h"
+#include "namespace.h"
+#ifdef LIBCR
+#include "app-checkpoint.h"
+#endif
 
 lxc_log_define(lxc_checkpoint, lxc);
 
@@ -54,7 +64,81 @@ pid_t get_init_pid(const char *name)
 	return command.answer.pid;
 }
 
-int lxc_checkpoint(const char *name, const char *statefile, int flags)
+#ifdef LIBCR
+int lxc_checkpoint(const char *name, const char *statefile, int lxc_flags)
 {
+	int ret;
+	int pid;
+	int flags;
+	struct app_checkpoint_args crargs;
+
+	pid = get_init_pid(name);
+	if (pid < 0) {
+		ERROR("Unable to find cinit pid");
+		return -1;
+	}
+
+	ret = lxc_freeze(name);
+	if (ret < 0)
+		return ret;
+
+	memset(&crargs, 0, sizeof(crargs));
+
+	ret = open(statefile, O_CREAT|O_RDWR|O_EXCL, 0600);
+	if (ret < 0) {
+		ERROR("open(%s) failed, %s\n", statefile, strerror(errno));
+		return -1;
+	}
+
+	crargs.outfd = ret;
+	crargs.logfd = lxc_log_fd;
+	crargs.uerrfd = lxc_log_fd;
+	/*
+	 * TODO: Set this to 0 for now - otherwise we get an objhash leak
+	 * 	 due to mismatched references to current PTY which needs to
+	 * 	 be investigated.
+	 *
+	 * TODO: Map @lxc_flags to user-cr flags ?
+	 *
+	 * TODO: We can probably drop the ->container field since @flags
+	 * 	 can provide the same selection. 
+	 *
+	 * TODO: Do we may need a --container option to lxc_checkpoint or
+	 * 	 assume that we always work with full containers ?
+	 */
+	crargs.container = 0;
+
+	/*
+	 * TODO: Set the CHECKPOINT_NONETNS unconditionally for now. Otherwise
+	 * 	 it would require running the application in a private netns.
+	 * 	 Implement a command-line option to allow user selection.
+	 */
+	flags = CHECKPOINT_SUBTREE|CHECKPOINT_NONETNS;
+
+	ret = app_checkpoint(pid, flags, &crargs);
+	if (ret < 0) {
+		ERROR("checkpoint of %s (pid %d) failed\n", name, pid);
+		return -1;
+	}
+
+	if (lxc_flags & LXC_FLAG_HALT) {
+		ret = lxc_stop(name);
+		if (ret < 0)
+			return ret;
+
+		return lxc_unfreeze(name);
+	}
+
+	if (!(lxc_flags & LXC_FLAG_PAUSE))
+		return lxc_unfreeze(name);
+
 	return 0;
 }
+#else
+int lxc_checkpoint(const char *name, const char *statefile, int lxc_flags)
+{
+	ERROR("'checkpoint' not configured");
+	ERROR("Try --with-libcr option to 'configure' script");
+	return -1;
+}
+#endif
diff --git a/src/lxc/state.c b/src/lxc/state.c
index b29ae09..1e4c7e1 100644
--- a/src/lxc/state.c
+++ b/src/lxc/state.c
@@ -167,4 +167,3 @@ extern int lxc_state_callback(int fd, struct lxc_request *request,
 out:
 	return ret;
 }
-
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/6][lxc][v3] Hook up lxc_checkpoint() with app_checkpoint()
       [not found]     ` <20100331071016.GF23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-03-31  8:08       ` Michel Normand
  2010-03-31  8:18       ` Cedric Le Goater
  1 sibling, 0 replies; 28+ messages in thread
From: Michel Normand @ 2010-03-31  8:08 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Containers, clg-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	DLEZCANO-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

unable to apply without manual changes
you have blank characters on two comment lines.

---
Michel

Le mercredi 31 mars 2010 à 00:10 -0700, Sukadev Bhattiprolu a écrit :
> From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Date: Thu, 11 Mar 2010 21:32:38 -0800
> Subject: [PATCH 6/6][lxc][v3] Hook up lxc_checkpoint() with app_checkpoint()
> 
> Have lxc_checkpoint() call app_checkpoint() implemented in checkpoint.o
> in the USER-CR git tree
> 
> TODO:
> 	- Map lxc_flags to flags in sys_checkpoint()
> 	- Initialize app_checkpoint_args.debug and other fields based on
> 	  command line options to lxc_checkpoint rather than hard-coding
> 	  them
> 
> Changelog:[v2]:
> 	- Drop find_cinit_pid() and use get_init_pid() from a recent checkin
> 	- Implement --pause and --kill options to lxc-checkpoint
> 	- Use -D LIBCR to fix compile error when --with-libcr config option
> 	  is not specified. Return ENOSYS when LIBCR is undefined.
> 	- Add CHECKPOINT_NONETNS to flags arg to app_checkpoint() which
> 	  is needed to checkpoint an application that is not in a private
> 	  netns (new in ckpt-v20-dev).
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> ---
>  src/lxc/checkpoint.c |   88 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  src/lxc/state.c      |    1 -
>  2 files changed, 86 insertions(+), 3 deletions(-)
> 
> diff --git a/src/lxc/checkpoint.c b/src/lxc/checkpoint.c
> index 4e75cb6..6e905ff 100644
> --- a/src/lxc/checkpoint.c
> +++ b/src/lxc/checkpoint.c
> @@ -22,7 +22,17 @@
>   */
>  #include <lxc/lxc.h>
>  #include <lxc/log.h>
> -#include <lxc/commands.h>
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#include <errno.h>
> +#include <linux/checkpoint.h>
> +
> +#include "commands.h"
> +#include "arguments.h"
> +#include "namespace.h"
> +#ifdef LIBCR
> +#include "app-checkpoint.h"
> +#endif
>  
>  lxc_log_define(lxc_checkpoint, lxc);
>  
> @@ -54,7 +64,81 @@ pid_t get_init_pid(const char *name)
>  	return command.answer.pid;
>  }
>  
> -int lxc_checkpoint(const char *name, const char *statefile, int flags)
> +#ifdef LIBCR
> +int lxc_checkpoint(const char *name, const char *statefile, int lxc_flags)
>  {
> +	int ret;
> +	int pid;
> +	int flags;
> +	struct app_checkpoint_args crargs;
> +
> +	pid = get_init_pid(name);
> +	if (pid < 0) {
> +		ERROR("Unable to find cinit pid");
> +		return -1;
> +	}
> +
> +	ret = lxc_freeze(name);
> +	if (ret < 0)
> +		return ret;
> +
> +	memset(&crargs, 0, sizeof(crargs));
> +
> +	ret = open(statefile, O_CREAT|O_RDWR|O_EXCL, 0600);
> +	if (ret < 0) {
> +		ERROR("open(%s) failed, %s\n", statefile, strerror(errno));
> +		return -1;
> +	}
> +
> +	crargs.outfd = ret;
> +	crargs.logfd = lxc_log_fd;
> +	crargs.uerrfd = lxc_log_fd;
> +	/*
> +	 * TODO: Set this to 0 for now - otherwise we get an objhash leak
> +	 * 	 due to mismatched references to current PTY which needs to
> +	 * 	 be investigated.
> +	 *
> +	 * TODO: Map @lxc_flags to user-cr flags ?
> +	 *
> +	 * TODO: We can probably drop the ->container field since @flags
> +	 * 	 can provide the same selection. 
> +	 *
> +	 * TODO: Do we may need a --container option to lxc_checkpoint or
> +	 * 	 assume that we always work with full containers ?
> +	 */
> +	crargs.container = 0;
> +
> +	/*
> +	 * TODO: Set the CHECKPOINT_NONETNS unconditionally for now. Otherwise
> +	 * 	 it would require running the application in a private netns.
> +	 * 	 Implement a command-line option to allow user selection.
> +	 */
> +	flags = CHECKPOINT_SUBTREE|CHECKPOINT_NONETNS;
> +
> +	ret = app_checkpoint(pid, flags, &crargs);
> +	if (ret < 0) {
> +		ERROR("checkpoint of %s (pid %d) failed\n", name, pid);
> +		return -1;
> +	}
> +
> +	if (lxc_flags & LXC_FLAG_HALT) {
> +		ret = lxc_stop(name);
> +		if (ret < 0)
> +			return ret;
> +
> +		return lxc_unfreeze(name);
> +	}
> +
> +	if (!(lxc_flags & LXC_FLAG_PAUSE))
> +		return lxc_unfreeze(name);
> +
>  	return 0;
>  }
> +#else
> +int lxc_checkpoint(const char *name, const char *statefile, int lxc_flags)
> +{
> +	ERROR("'checkpoint' not configured");
> +	ERROR("Try --with-libcr option to 'configure' script");
> +	return -1;
> +}
> +#endif
> diff --git a/src/lxc/state.c b/src/lxc/state.c
> index b29ae09..1e4c7e1 100644
> --- a/src/lxc/state.c
> +++ b/src/lxc/state.c
> @@ -167,4 +167,3 @@ extern int lxc_state_callback(int fd, struct lxc_request *request,
>  out:
>  	return ret;
>  }
> -


_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/6][lxc][v3] lxc_restart: Add --statefile option
       [not found]     ` <20100331070711.GB23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-03-31  8:10       ` Michel Normand
  0 siblings, 0 replies; 28+ messages in thread
From: Michel Normand @ 2010-03-31  8:10 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Containers, clg-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	DLEZCANO-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

unable to apply without manual changes
you have blank character on one line

---
Michel


Le mercredi 31 mars 2010 à 00:07 -0700, Sukadev Bhattiprolu a écrit :
> From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Date: Sat, 27 Mar 2010 00:08:17 -0700
> Subject: [PATCH 2/6][lxc][v3] lxc_restart: Add --statefile option
> 
> The existing --directory option to lxc_restart expects the checkpoint state
> to be a directory. USERCR however uses a single regular file to store the
> checkpoint image. So add a --statefile option to enable checkpointing and
> restarting applications using USERCR.
> 
> Depending on how the application was checkpointed, users should specify
> either --statefile=STATEFILE or the --directory=STATEFILE option (but not
> both).
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> ---
>  src/lxc/lxc_restart.c |   34 +++++++++++++++++++++++++---------
>  1 files changed, 25 insertions(+), 9 deletions(-)
> 
> diff --git a/src/lxc/lxc_restart.c b/src/lxc/lxc_restart.c
> index 7db1d85..de4b421 100644
> --- a/src/lxc/lxc_restart.c
> +++ b/src/lxc/lxc_restart.c
> @@ -38,13 +38,21 @@
>  lxc_log_define(lxc_restart_ui, lxc_restart);
>  
>  static struct lxc_list defines;
> +static char *statedir;
>  
>  static int my_checker(const struct lxc_arguments* args)
>  {
> -	if (!args->statefile) {
> -		lxc_error(args, "no statefile specified");
> -		return -1;
> -	}
> +	int d, f;
> +	
> +	/* make them boolean */
> +	d = !!(statedir);
> +	f = !!(args->statefile);
> +
> +        if (!(d ^ f)) {
> +                lxc_error(args, "Must specify exactly one of --directory "
> +                                "and --statefile options");
> +                return -1;
> +        }
>  
>  	return 0;
>  }
> @@ -52,8 +60,9 @@ static int my_checker(const struct lxc_arguments* args)
>  static int my_parser(struct lxc_arguments* args, int c, char* arg)
>  {
>  	switch (c) {
> -	case 'd': args->statefile = arg; break;
> +	case 'd': statedir = arg; break;
>  	case 'f': args->rcfile = arg; break;
> +	case 'S': args->statefile = arg; break;
>  	case 'p': args->flags = LXC_FLAG_PAUSE; break;
>  	case 's': return lxc_config_define_add(&defines, arg);
>  	}
> @@ -66,21 +75,24 @@ static const struct option my_longopts[] = {
>  	{"rcfile", required_argument, 0, 'f'},
>  	{"pause", no_argument, 0, 'p'},
>  	{"define", required_argument, 0, 's'},
> +	{"statefile", required_argument, 0, 'S'},
>  	LXC_COMMON_OPTIONS
>  };
>  
>  static struct lxc_arguments my_args = {
>  	.progname = "lxc-restart",
>  	.help     = "\
> ---name=NAME --directory STATEFILE\n\
> +--name=NAME --directory STATEFILE (deprecated)\n\
> +\tlxc_restart --name=NAME --statefile=STATEFILE\n\
>  \n\
>  lxc-restart restarts from STATEFILE the NAME container\n\
>  \n\
>  Options :\n\
>    -n, --name=NAME      NAME for name of the container\n\
>    -p, --pause          do not release the container after the restart\n\
> -  -d, --directory=STATEFILE for name of statefile\n\
> +  -d, --directory=STATEFILE for name of statefile (legacy mode, deprecated)\n\
>    -f, --rcfile=FILE Load configuration file FILE\n\
> +  -i, --statefile=STATEFILE Load the application state from STATEFILE (libcr mode)\n\
>    -s, --define KEY=VAL Assign VAL to configuration variable KEY\n",
>  	.options  = my_longopts,
>  	.parser   = my_parser,
> @@ -90,6 +102,7 @@ Options :\n\
>  int main(int argc, char *argv[])
>  {
>  	char *rcfile = NULL;
> +	const char *statefile;
>  	struct lxc_conf *conf;
>  
>  	lxc_list_init(&defines);
> @@ -131,6 +144,9 @@ int main(int argc, char *argv[])
>  	if (lxc_config_define_load(&defines, conf))
>  		return -1;
>  
> -	return lxc_restart(my_args.name, my_args.statefile, conf,
> -			   my_args.flags);
> +	statefile = my_args.statefile;
> +	if (statedir)
> +		statefile = statedir;
> +
> +	return lxc_restart(my_args.name, statefile, conf, my_args.flags);
>  }


_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6][lxc][v3] Add --with-libcr configure option
       [not found]     ` <20100331070633.GA23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-03-31  8:11       ` Michel Normand
  2010-03-31 17:21         ` Sukadev Bhattiprolu
  0 siblings, 1 reply; 28+ messages in thread
From: Michel Normand @ 2010-03-31  8:11 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Containers, clg-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	DLEZCANO-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

configure error because of typo below


Le mercredi 31 mars 2010 à 00:06 -0700, Sukadev Bhattiprolu a écrit :
> From: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Date: Wed, 24 Mar 2010 17:26:44 -0700
> Subject: [PATCH 1/6][lxc][v3] Add --with-libcr configure option
> 
> Add a configure option, --with-libcr=dir which would allow linking
> with external (i.e USERCR) implementation  of checkpoint/restart.
> 
> For now, USERCR "publishes" a app-checkpoint.h, checkpoint.o and
> restart.o files which implement the functions app_checkpoint() and
> app_restart().
> 
> Usage:
> 	$ ./autogen.sh
> 
> 	$ ./configure --help |grep libcr
> 	--with-libcr=dir     use the Checkpoint/Restart implementation in 'dir'
> 
> 	$ ls /home/guest/user-cr/
> 	app-checkpoint.h    checkpoint.o    restart.o
> 
> 	$ ./configure --with-libcr=/home/guest/user-cr
> 
> TODO:
> 	If names of interfaces in USERCR change, we may want to rename
> 	the config option too ?
> 
> 	LIBCR_CFLAGS are only needed for src/lxc/{checkpoint.c,restart.c}
> 	but not sure if there is an easy way to define autoconf CFLAGS
> 	just for those two files.
> 
> Changelog[v2]:
> 	- Rename --with-usercr to --with-libcr
> 	- Add libeclone.a to the LIBCR_OBJS variable since functions in
> 	  libeclone.a will be used by checkpoint() and restart() functions.
> 	- Add -I${with_libcr}/include to LIBCR_CFLAGS to pick up
> 	  checkpoint_hdr.h, checkpoint.h etc.
> 
> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> ---
>  configure.ac        |   19 +++++++++++++++++++
>  src/lxc/Makefile.am |   10 +++++++++-
>  2 files changed, 28 insertions(+), 1 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index f82e7df..fe6584c 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -12,6 +12,25 @@ AM_PROG_CC_C_O
>  AC_GNU_SOURCE
>  AC_CHECK_PROG(SETCAP, setcap, yes, no, $PATH$PATH_SEPARATOR/sbin)
>  
> +AC_ARG_WITH(libcr, [AS_HELP_STRING([--with-libcr=dir], \
> +           [use the Checkpoint/Restart implementation in 'dir'])], [], \
> +	   [with_libcr=no])
> +
> +if test "x$with_libcr" != "xno"; then
> +       AS_AC_EXPAND(LIBCR_OBJS, "${with_libcr}/checkpoint.o ${with_libcr}/restart.o ${with_libcr}/libeclone.a")
> +       AS_AC_EXPAND(LIBCR_CFLAGS, "-DLIBCR -I${with_libcr} -I$(with_libcr}/include")

typo here $( to be replaced by ${

> +
> +       AC_CHECK_FILE([$with_libcr/app-checkpoint.h], [], \
> +               AC_MSG_ERROR([--with-libcr specified directory $with_libcr but $with_libcr/app-checkpoint.h was not found]))
> +
> +       AC_CHECK_FILE([${with_libcr}/checkpoint.o], [], \
> +               AC_MSG_ERROR([--with-libcr specified directory $with_libcr but ${with_libcr}/checkpoint.o was not found]))
> +
> +       AC_CHECK_FILE([${with_libcr}/restart.o], [], \
> +               AC_MSG_ERROR([--with-libcr specified directory $with_libcr but ${with_libcr}/restart.o was not found]))
> +fi
> +
> +
>  AC_ARG_ENABLE([doc],
>  	[AC_HELP_STRING([--enable-doc], [make mans (require docbook2man installed) [default=auto]])],
>  	[], [enable_doc=auto])
> diff --git a/src/lxc/Makefile.am b/src/lxc/Makefile.am
> index 890f706..699c355 100644
> --- a/src/lxc/Makefile.am
> +++ b/src/lxc/Makefile.am
> @@ -46,12 +46,20 @@ liblxc_so_SOURCES = \
>  	mainloop.c mainloop.h \
>  	af_unix.c af_unix.h
>  
> -AM_CFLAGS=-I$(top_srcdir)/src
> +# We only need $(LIBCR_CFLAGS) for lxc_checkpoint and lxc_restart files
> +# but for now, just set it for all.
> +AM_CFLAGS=-I$(top_srcdir)/src $(LIBCR_CFLAGS)
>  
>  liblxc_so_CFLAGS = -fPIC -DPIC $(AM_CFLAGS)
>  
> +# TODO: Adding $(LIBCR_OBJS) here ensures we don't have undefined references
> +# 	when building liblxc.so, but this has the side-effect of putting the
> +# 	app_checkpoint/restart functions in liblxc.so. Or alternatively,
> +# 	we could remove src/lxc/{checkpoint.o,restart.o} from liblxc.so
> +# 	and link lxc-checkpoint/lxc-restart with them directly.
>  liblxc_so_LDFLAGS = \
>  	-shared \
> +	$(LIBCR_OBJS) \
>  	-Wl,-soname,liblxc.so.$(firstword $(subst ., ,$(VERSION)))
>  
>  liblxc_so_LDADD = -lutil


_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/6][lxc][v3] Move get_init_pid() into checkpoint.c
       [not found]     ` <20100331070848.GD23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-03-31  8:17       ` Cedric Le Goater
  0 siblings, 0 replies; 28+ messages in thread
From: Cedric Le Goater @ 2010-03-31  8:17 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: dlezcano-NmTC/0ZBporQT0dZR+AlfA, Containers

On 03/31/2010 09:08 AM, Sukadev Bhattiprolu wrote:
> lxc_attach.c is currently not included in liblxc.so. In  afollowon
> patch, checkpoint() function needs to also use the get_init_pid()
> interface. So move the defintions into checkpoint.c - which would
> then be accessible to both lxc_attach and lxc-checkpoint.
>

couldn't we move that routine in src/lxc/utils.c ?

thanks,

C.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/6][lxc][v3] Hook up lxc_checkpoint() with app_checkpoint()
       [not found]     ` <20100331071016.GF23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-03-31  8:08       ` Michel Normand
@ 2010-03-31  8:18       ` Cedric Le Goater
  1 sibling, 0 replies; 28+ messages in thread
From: Cedric Le Goater @ 2010-03-31  8:18 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: dlezcano-NmTC/0ZBporQT0dZR+AlfA, Containers

On 03/31/2010 09:10 AM, Sukadev Bhattiprolu wrote:
> diff --git a/src/lxc/state.c b/src/lxc/state.c
> index b29ae09..1e4c7e1 100644
> --- a/src/lxc/state.c
> +++ b/src/lxc/state.c
> @@ -167,4 +167,3 @@ extern int lxc_state_callback(int fd, struct lxc_request *request,
>   out:
>   	return ret;
>   }
> -

that's noise.

C.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
                     ` (5 preceding siblings ...)
  2010-03-31  7:10   ` [PATCH 6/6][lxc][v3] Hook up lxc_checkpoint() with app_checkpoint() Sukadev Bhattiprolu
@ 2010-03-31  9:29   ` Michel Normand
  2010-03-31  9:38   ` Cedric Le Goater
                     ` (3 subsequent siblings)
  10 siblings, 0 replies; 28+ messages in thread
From: Michel Normand @ 2010-03-31  9:29 UTC (permalink / raw)
  To: Sukadev Bhattiprolu
  Cc: Containers, clg-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	DLEZCANO-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

Hi Suka, 
I tried to follow your how-to but failed to checkpoint 
your first example.

Le mercredi 31 mars 2010 à 00:04 -0700, Sukadev Bhattiprolu a écrit :
> lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
> Following set of patches, when applied to LXC and built with USERCR as
> described below, enable enable lxc-checkpoint and lxc-restart of some
> simple containers

> [CUT] ...

> 1. Build USERCR
> 
> 	$ cd /root
> 
> 	$ git-clone git://git.ncl.cs.columbia.edu/pub/git/user-cr.git user-cr
> 
> 	$ cd user-cr
> 
> 	$ git-checkout ckpt-v20-dev
> 
> 	  	Tested with commit e275f77e4a82d228c1df14dbeb691342e32cdac2
> 		as HEAD.
> 	
> 	# Apply following two patches:
> 
> 	https://lists.linux-foundation.org/pipermail/containers/2010-March/024037.html
> 	https://lists.linux-foundation.org/pipermail/containers/2010-March/024038.html
> 
> 	$ cd /root/user-cr
> 
> 	$ KERNELSRC=/root/linux-2.6/ make 
> 
> 		Build USERCR by pointing to corresponding kernel-source.
> 		This should create restart.o and checkpoint.o needed by LXC.

I assume we need a sudo make install, at least for ckptinfo, do we ?

> 2. Build/install LXC
> 
> 	$ cd /root/lxc.git
> 
> 	Apply attached patches to LXC (I tested with these patches applied
> 	to commit 9ea8066aa67b808f71f46e346bd7a215e2a355f3)
> 
> 	$ autogen.sh
> 
> 	$ ./configure --with-libcr=/root/user-cr
> 
> 		This will fail if /root/user-cr does not container checkpoint.o,
> 		restart.o and app-checkpoint.h files 
> 	$ make
> 
> 	$ make install
> 
> 3. Checkpoint/restart a simple LXC container
> 
> 	$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000
> 
> 	$ lxc-checkpoint --name foo --image /root/lxc-foo.ckpt

not --image but --statefile

* The command failed, but no details of the error.
Would it be possible to have displayed on stderr an error message
to explain the cause of the problem ?
===
$lxc-checkpoint -n foo --statefile sf
lxc-checkpoint: checkpoint of foo (pid 1554) failed

lxc-checkpoint: failed to checkpoint 'foo'
===

* If I am re-running the command adding -ofoo.log -lTRACE
I have one more clue with following two lines, but do not help.
could the message be improved ?
===
checkpoint: Invalid argument
(you may use 'ckptinfo -e' for more info)
===

* and doing a strace of the command. How to determine what is wrong ?
===
open("sf", O_RDWR|O_CREAT|O_EXCL, 0600) = 4
SYS_339(0x628, 0x4, 0x3, 0xffffffff, 0x1) = -1 EINVAL (Invalid argument)
===

---
Michel

> 
> 	$ lxc-stop --name foo
> 
> 	$ lxc-restart --name foo --image /root/lxc-foo.ckpt

> [CUT] ...

_______________________________________________
Containers mailing list
Containers@lists.linux-foundation.org
https://lists.linux-foundation.org/mailman/listinfo/containers

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
                     ` (6 preceding siblings ...)
  2010-03-31  9:29   ` [PATCH 0/6][lxc][v3] Link LXC with USERCR Michel Normand
@ 2010-03-31  9:38   ` Cedric Le Goater
       [not found]     ` <4BB31801.4000304-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
  2010-03-31 13:58   ` Daniel Lezcano
                     ` (2 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Cedric Le Goater @ 2010-03-31  9:38 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: dlezcano-NmTC/0ZBporQT0dZR+AlfA, Containers

On 03/31/2010 09:04 AM, Sukadev Bhattiprolu wrote:
> 3. Checkpoint/restart a simple LXC container
>
> 	$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000
>
> 	$ lxc-checkpoint --name foo --image /root/lxc-foo.ckpt
>
> 	$ lxc-stop --name foo
>
> 	$ lxc-restart --name foo --image /root/lxc-foo.ckpt

Here's a very simple program :

	# pi1.sh
	#
	# IBM Confidential
	#
	# OCO Source Materials
	#
	# P91223
	#
	# (C) Copyright IBM Corp. 2003, 2008
	#
	# The source code for this program is not published or otherwise
	# divested of its trade secrets, irrespective of what has been
	# deposited with the U.S. Copyright Office.

	#!/bin/sh

	me=`basename "$0" 2>/dev/null`
	ndigit=${1:-1000}

	echo $me - $ndigit digits
	echo "scale=$ndigit;a(1)*4" | bc -l

can you run in a terminal :

	$ lxc-execute -n pi1 -- ./pi1.sh 3000 &
	$ lxc-checkpoint -n pi1 -d ./ckpt -k
	$ lxc-restart -n pi1 -d ./ckpt

Thanks,

C.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]     ` <4BB31801.4000304-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
@ 2010-03-31 12:13       ` Cedric Le Goater
       [not found]         ` <4BB33C81.9070802-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: Cedric Le Goater @ 2010-03-31 12:13 UTC (permalink / raw)
  Cc: dlezcano-NmTC/0ZBporQT0dZR+AlfA, Sukadev Bhattiprolu, Containers

  
> Here's a very simple program :
>
> 	# pi1.sh
> 	#
> 	# IBM Confidential
> 	#
> 	# OCO Source Materials
> 	#
> 	# P91223
> 	#
> 	# (C) Copyright IBM Corp. 2003, 2008
> 	#
> 	# The source code for this program is not published or otherwise
> 	# divested of its trade secrets, irrespective of what has been
> 	# deposited with the U.S. Copyright Office.

that's from the example section of bc(1) manpage. nothing confidential
there :)

> 	#!/bin/sh
>
> 	me=`basename "$0" 2>/dev/null`
> 	ndigit=${1:-1000}
>
> 	echo $me - $ndigit digits
> 	echo "scale=$ndigit;a(1)*4" | bc -l
>
> can you run in a terminal :
>
> 	$ lxc-execute -n pi1 -- ./pi1.sh 3000&
> 	$ lxc-checkpoint -n pi1 -d ./ckpt -k
> 	$ lxc-restart -n pi1 -d ./ckpt

please try that suka, it's much more interesting than sleep which can
silently segfault after restart.

thanks,

C.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
                     ` (7 preceding siblings ...)
  2010-03-31  9:38   ` Cedric Le Goater
@ 2010-03-31 13:58   ` Daniel Lezcano
       [not found]     ` <4BB35519.8080500-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
  2010-03-31 16:31   ` Daniel Lezcano
  2010-03-31 19:58   ` Daniel Lezcano
  10 siblings, 1 reply; 28+ messages in thread
From: Daniel Lezcano @ 2010-03-31 13:58 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA

Sukadev Bhattiprolu wrote:
> lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
> Following set of patches, when applied to LXC and built with USERCR as
> described below, enable enable lxc-checkpoint and lxc-restart of some
> simple containers
> 

[ ... ]

> 1. Build USERCR
> 
> 	$ cd /root
> 
> 	$ git-clone git://git.ncl.cs.columbia.edu/pub/git/user-cr.git user-cr
> 
> 	$ cd user-cr
> 
> 	$ git-checkout ckpt-v20-dev
> 
> 	  	Tested with commit e275f77e4a82d228c1df14dbeb691342e32cdac2
> 		as HEAD.
> 	
> 	# Apply following two patches:
> 
> 	https://lists.linux-foundation.org/pipermail/containers/2010-March/024037.html
> 	https://lists.linux-foundation.org/pipermail/containers/2010-March/024038.html
> 
> 	$ cd /root/user-cr
> 
> 	$ KERNELSRC=/root/linux-2.6/ make 

Assuming the kernel source tree below is right,

   http://www.linux-cr.org/git/?p=linux-cr.git;a=summary

Shall I use ckpt-v20 or ckpt-v20-dev with your patchset ?

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
                     ` (8 preceding siblings ...)
  2010-03-31 13:58   ` Daniel Lezcano
@ 2010-03-31 16:31   ` Daniel Lezcano
  2010-03-31 19:58   ` Daniel Lezcano
  10 siblings, 0 replies; 28+ messages in thread
From: Daniel Lezcano @ 2010-03-31 16:31 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA

Sukadev Bhattiprolu wrote:
> lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
> Following set of patches, when applied to LXC and built with USERCR as
> described below, enable enable lxc-checkpoint and lxc-restart of some
> simple containers
>   

Hi Suka,

Thanks for the patchset. Before going further and comment the patchset 
and the TODOs, I would like to succeed to checkpoint something.

Despite your detailed informations, I was not able to checkpoint 
'sleep'. I am sure I am missing something or something is missing :)

[ ... ]

Here is the different steps I did:

 1 -  downloaded the git tree at git://www.linux-cr.org/pub/git/linux-cr
 2 - compiled the kernel making sure I have the following kernel config 
options

 > grep CHECKPOINT .config
CONFIG_CHECKPOINT_SUPPORT=y
CONFIG_SYSVIPC_CHECKPOINT=y
CONFIG_CHECKPOINT=y
CONFIG_CHECKPOINT_DEBUG=y

and of course the needed namespaces.

3 - The level of the code is the commit 
3522c57a9ec6f08a129a78322318abcb4467db28


> 1. Build USERCR
>
> 	$ cd /root
>
> 	$ git-clone git://git.ncl.cs.columbia.edu/pub/git/user-cr.git user-cr
>
> 	$ cd user-cr
>
> 	$ git-checkout ckpt-v20-dev
>
> 	  	Tested with commit e275f77e4a82d228c1df14dbeb691342e32cdac2
> 		as HEAD.
> 	
> 	# Apply following two patches:
>
> 	https://lists.linux-foundation.org/pipermail/containers/2010-March/024037.html
> 	https://lists.linux-foundation.org/pipermail/containers/2010-March/024038.html
>
> 	$ cd /root/user-cr
>
> 	$ KERNELSRC=/root/linux-2.6/ make 
>
> 		Build USERCR by pointing to corresponding kernel-source.
> 		This should create restart.o and checkpoint.o needed by LXC.
>   

4 - followed these steps ^^^^

I had to compile user-cr with the "-fPIC" flags in order to link lxc and 
user-cr.

/usr/lib64/gcc/x86_64-suse-linux/4.3/../../../../x86_64-suse-linux/bin/ld: 
/home/dlezcano/work/src/user-cr/checkpoint.o: relocation R_X86_64_32 
against `a local symbol' can not be used when making a shared object; 
recompile with -fPIC
/home/dlezcano/work/src/user-cr/checkpoint.o: could not read symbols: 
Bad value


> 2. Build/install LXC
>
> 	$ cd /root/lxc.git
>
> 	Apply attached patches to LXC (I tested with these patches applied
> 	to commit 9ea8066aa67b808f71f46e346bd7a215e2a355f3)
>   

Ok, I applied the patchset with quilt. It looks like Michel used git 
which is stricter than quilt.
I fixed the typo in configure.ac

> 3. Checkpoint/restart a simple LXC container
>
> 	$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000
>
> 	$ lxc-checkpoint --name foo --image /root/lxc-foo.ckpt
>   

Ok, I reach this point. But the checkpoint fails always with EINVAL and 
the statefile is empty.

Digging a bit in the userspace + kernel code, I was wondering if the 
user flags match ?

Thanks in advance.
  -- Daniel

ps : I noticed the application is not thawed in the error code path, so 
I have to resume it manually everytime the checkpoint fails.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 1/6][lxc][v3] Add --with-libcr configure option
  2010-03-31  8:11       ` Michel Normand
@ 2010-03-31 17:21         ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-31 17:21 UTC (permalink / raw)
  To: Michel Normand
  Cc: Containers, clg-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	dlezcano-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8

Michel Normand [normand-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org] wrote:
| configure error because of typo below
| 

Sorry, I had not committed after fixing it. Thanks for fixing it.

Did you also need the -fPIC to compile checkpoint.o ? 

Thanks,

Sukadev

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
                     ` (9 preceding siblings ...)
  2010-03-31 16:31   ` Daniel Lezcano
@ 2010-03-31 19:58   ` Daniel Lezcano
       [not found]     ` <4BB3A981.4020709-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
  10 siblings, 1 reply; 28+ messages in thread
From: Daniel Lezcano @ 2010-03-31 19:58 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA

Sukadev Bhattiprolu wrote:
> lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
> Following set of patches, when applied to LXC and built with USERCR as
> described below, enable enable lxc-checkpoint and lxc-restart of some
> simple containers

[ ... ]

> 3. Checkpoint/restart a simple LXC container
> 
> 	$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000
> 
> 	$ lxc-checkpoint --name foo --image /root/lxc-foo.ckpt
> 
> 	$ lxc-stop --name foo
> 
> 	$ lxc-restart --name foo --image /root/lxc-foo.ckpt

Finally, using ckpt-v20-dev, I succeeded to checkpoint sleep but when I 
restart, I got the error:

<4534>number of tasks: 2
<4534>number of vpids: 0
<4534>total tasks (including ghosts): 2
<4534>pid 2: inherit sid 0
<4534>pid 2: creator set to 1
<4534>====== TASKS
<4534>	[0] pid 1 ppid 0 sid 0 creator 0
<4534>	[1] pid 2 ppid 1 sid 0 creator 1
<4534>............
<4534>task[0].vidx = -1
<4534>task[1].vidx = -1
<4534>new pidns with init
<4534>forking child vpid 1 flags 0x321
<4534>task 1 forking with flags 20020011 numpids 1
<4534>task 1 pid[0]=0
<4535>====== PIDS ARRAY
<4535>[0] pid 1 ppid 0 sid 0 pgid 0
<4535>[1] pid 2 ppid 1 sid 0 pgid 0
<4535>............
Error: /dev/ptmx must be a link to /dev/pts/ptmx
<4534>forked child vpid 4536 (asked 1)
root task exited status 0

What can I do to prevent this error ?

Thanks
   -- Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]     ` <4BB3A981.4020709-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
@ 2010-03-31 20:12       ` Serge E. Hallyn
       [not found]         ` <20100331201240.GA26773-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: Serge E. Hallyn @ 2010-03-31 20:12 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Containers, Sukadev Bhattiprolu, clg-NmTC/0ZBporQT0dZR+AlfA

Quoting Daniel Lezcano (dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org):
> Sukadev Bhattiprolu wrote:
> >lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
> >Following set of patches, when applied to LXC and built with USERCR as
> >described below, enable enable lxc-checkpoint and lxc-restart of some
> >simple containers
> 
> [ ... ]
> 
> >3. Checkpoint/restart a simple LXC container
> >
> >	$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000
> >
> >	$ lxc-checkpoint --name foo --image /root/lxc-foo.ckpt
> >
> >	$ lxc-stop --name foo
> >
> >	$ lxc-restart --name foo --image /root/lxc-foo.ckpt
> 
> Finally, using ckpt-v20-dev, I succeeded to checkpoint sleep but
> when I restart, I got the error:
> 
> <4534>number of tasks: 2
> <4534>number of vpids: 0
> <4534>total tasks (including ghosts): 2
> <4534>pid 2: inherit sid 0
> <4534>pid 2: creator set to 1
> <4534>====== TASKS
> <4534>	[0] pid 1 ppid 0 sid 0 creator 0
> <4534>	[1] pid 2 ppid 1 sid 0 creator 1
> <4534>............
> <4534>task[0].vidx = -1
> <4534>task[1].vidx = -1
> <4534>new pidns with init
> <4534>forking child vpid 1 flags 0x321
> <4534>task 1 forking with flags 20020011 numpids 1
> <4534>task 1 pid[0]=0
> <4535>====== PIDS ARRAY
> <4535>[0] pid 1 ppid 0 sid 0 pgid 0
> <4535>[1] pid 2 ppid 1 sid 0 pgid 0
> <4535>............
> Error: /dev/ptmx must be a link to /dev/pts/ptmx
> <4534>forked child vpid 4536 (asked 1)
> root task exited status 0
> 
> What can I do to prevent this error ?

test -e /dev/pts/ptmx || (echo "Don't go through with the rest of this" && exit)
rm -f /dev/ptmx
ln -s /dev/pts/ptmx /dev/ptmx
chmod 666 /dev/ptmx

Unfortunately with udev i tend to always have to do this after every
mount (So I do it in the same script that mounts freezer)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]         ` <20100331201240.GA26773-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-03-31 20:22           ` Daniel Lezcano
  2010-03-31 21:00           ` Daniel Lezcano
  1 sibling, 0 replies; 28+ messages in thread
From: Daniel Lezcano @ 2010-03-31 20:22 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Containers, Sukadev Bhattiprolu, clg-NmTC/0ZBporQT0dZR+AlfA

Serge E. Hallyn wrote:
> Quoting Daniel Lezcano (dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org):
>> Sukadev Bhattiprolu wrote:
>>> lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
>>> Following set of patches, when applied to LXC and built with USERCR as
>>> described below, enable enable lxc-checkpoint and lxc-restart of some
>>> simple containers
>> [ ... ]
>>
>>> 3. Checkpoint/restart a simple LXC container
>>>
>>> 	$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000
>>>
>>> 	$ lxc-checkpoint --name foo --image /root/lxc-foo.ckpt
>>>
>>> 	$ lxc-stop --name foo
>>>
>>> 	$ lxc-restart --name foo --image /root/lxc-foo.ckpt
>> Finally, using ckpt-v20-dev, I succeeded to checkpoint sleep but
>> when I restart, I got the error:
>>
>> <4534>number of tasks: 2
>> <4534>number of vpids: 0
>> <4534>total tasks (including ghosts): 2
>> <4534>pid 2: inherit sid 0
>> <4534>pid 2: creator set to 1
>> <4534>====== TASKS
>> <4534>	[0] pid 1 ppid 0 sid 0 creator 0
>> <4534>	[1] pid 2 ppid 1 sid 0 creator 1
>> <4534>............
>> <4534>task[0].vidx = -1
>> <4534>task[1].vidx = -1
>> <4534>new pidns with init
>> <4534>forking child vpid 1 flags 0x321
>> <4534>task 1 forking with flags 20020011 numpids 1
>> <4534>task 1 pid[0]=0
>> <4535>====== PIDS ARRAY
>> <4535>[0] pid 1 ppid 0 sid 0 pgid 0
>> <4535>[1] pid 2 ppid 1 sid 0 pgid 0
>> <4535>............
>> Error: /dev/ptmx must be a link to /dev/pts/ptmx
>> <4534>forked child vpid 4536 (asked 1)
>> root task exited status 0
>>
>> What can I do to prevent this error ?
> 
> test -e /dev/pts/ptmx || (echo "Don't go through with the rest of this" && exit)
> rm -f /dev/ptmx
> ln -s /dev/pts/ptmx /dev/ptmx
> chmod 666 /dev/ptmx

I was able to restart. I will play a bit with it :)

Thanks Serge.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]         ` <20100331201240.GA26773-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-03-31 20:22           ` Daniel Lezcano
@ 2010-03-31 21:00           ` Daniel Lezcano
       [not found]             ` <4BB3B7E1.8080608-GANU6spQydw@public.gmane.org>
  1 sibling, 1 reply; 28+ messages in thread
From: Daniel Lezcano @ 2010-03-31 21:00 UTC (permalink / raw)
  To: Serge E. Hallyn
  Cc: Containers, Sukadev Bhattiprolu, clg-NmTC/0ZBporQT0dZR+AlfA

Serge E. Hallyn wrote:
> Quoting Daniel Lezcano (dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org):
>> Sukadev Bhattiprolu wrote:
>>> lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
>>> Following set of patches, when applied to LXC and built with USERCR as
>>> described below, enable enable lxc-checkpoint and lxc-restart of some
>>> simple containers
>> [ ... ]
>>
>>> 3. Checkpoint/restart a simple LXC container
>>>
>>> 	$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000
>>>
>>> 	$ lxc-checkpoint --name foo --image /root/lxc-foo.ckpt
>>>
>>> 	$ lxc-stop --name foo
>>>
>>> 	$ lxc-restart --name foo --image /root/lxc-foo.ckpt
>> Finally, using ckpt-v20-dev, I succeeded to checkpoint sleep but
>> when I restart, I got the error:
>>
>> <4534>number of tasks: 2
>> <4534>number of vpids: 0
>> <4534>total tasks (including ghosts): 2
>> <4534>pid 2: inherit sid 0
>> <4534>pid 2: creator set to 1
>> <4534>====== TASKS
>> <4534>	[0] pid 1 ppid 0 sid 0 creator 0
>> <4534>	[1] pid 2 ppid 1 sid 0 creator 1
>> <4534>............
>> <4534>task[0].vidx = -1
>> <4534>task[1].vidx = -1
>> <4534>new pidns with init
>> <4534>forking child vpid 1 flags 0x321
>> <4534>task 1 forking with flags 20020011 numpids 1
>> <4534>task 1 pid[0]=0
>> <4535>====== PIDS ARRAY
>> <4535>[0] pid 1 ppid 0 sid 0 pgid 0
>> <4535>[1] pid 2 ppid 1 sid 0 pgid 0
>> <4535>............
>> Error: /dev/ptmx must be a link to /dev/pts/ptmx
>> <4534>forked child vpid 4536 (asked 1)
>> root task exited status 0
>>
>> What can I do to prevent this error ?
> 
> test -e /dev/pts/ptmx || (echo "Don't go through with the rest of this" && exit)
> rm -f /dev/ptmx
> ln -s /dev/pts/ptmx /dev/ptmx
> chmod 666 /dev/ptmx

I was able to CR a simple program like sleep.

But most of the simple test programs I run, exit right after the restart 
was marked successful instead of continuing their execution.

In the kernel I see the traces:

[26108:3:c/r:restore_debug_free:145] active pid was 3, ctx->errno 0
[26108:3:c/r:restore_debug_free:147] kflags 6 uflags 0 oflags 3
[26108:3:c/r:restore_debug_free:149] task[0] to run 1
[26108:3:c/r:restore_debug_free:149] task[1] to run 2
[26108:3:c/r:restore_debug_free:149] task[2] to run 3
[26108:3:c/r:restore_debug_free:174] pid 26104 type Coord state Success
[26108:3:c/r:restore_debug_free:174] pid 26106 type Root state Success
[26108:3:c/r:restore_debug_free:174] pid 26107 type Task state Success
[26108:3:c/r:restore_debug_free:174] pid 26108 type Task state Success
[26108:3:c/r:pgarr_release_pages:102] total pages 0
[26108:3:c/r:do_restart:1446] sys_restart returns -516

What does mean -516 ? an error ?

I am running on a x86_64.

Thanks
   -- Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]             ` <4BB3B7E1.8080608-GANU6spQydw@public.gmane.org>
@ 2010-03-31 21:23               ` Sukadev Bhattiprolu
       [not found]                 ` <20100331212359.GA18934-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2010-04-01  5:43               ` Oren Laadan
  1 sibling, 1 reply; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-03-31 21:23 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA

Daniel Lezcano [daniel.lezcano-GANU6spQydw@public.gmane.org] wrote:
> But most of the simple test programs I run, exit right after the restart  
> was marked successful instead of continuing their execution.
>
> In the kernel I see the traces:
>
> [26108:3:c/r:restore_debug_free:145] active pid was 3, ctx->errno 0
> [26108:3:c/r:restore_debug_free:147] kflags 6 uflags 0 oflags 3
> [26108:3:c/r:restore_debug_free:149] task[0] to run 1
> [26108:3:c/r:restore_debug_free:149] task[1] to run 2
> [26108:3:c/r:restore_debug_free:149] task[2] to run 3
> [26108:3:c/r:restore_debug_free:174] pid 26104 type Coord state Success
> [26108:3:c/r:restore_debug_free:174] pid 26106 type Root state Success
> [26108:3:c/r:restore_debug_free:174] pid 26107 type Task state Success
> [26108:3:c/r:restore_debug_free:174] pid 26108 type Task state Success
> [26108:3:c/r:pgarr_release_pages:102] total pages 0
> [26108:3:c/r:do_restart:1446] sys_restart returns -516
>
> What does mean -516 ? an error ?

Could it be ERESTART_RESTARTBLOCK ? Also, can you let us know what application
causes this ? Are any signals generated ?

Thanks,

Sukadev
>
> I am running on a x86_64.
>
> Thanks
>   -- Daniel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]                 ` <20100331212359.GA18934-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2010-03-31 21:30                   ` Daniel Lezcano
       [not found]                     ` <4BB3BF02.7060402-GANU6spQydw@public.gmane.org>
  0 siblings, 1 reply; 28+ messages in thread
From: Daniel Lezcano @ 2010-03-31 21:30 UTC (permalink / raw)
  To: Sukadev Bhattiprolu; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA

Sukadev Bhattiprolu wrote:
> Daniel Lezcano [daniel.lezcano-GANU6spQydw@public.gmane.org] wrote:
>   
>> But most of the simple test programs I run, exit right after the restart  
>> was marked successful instead of continuing their execution.
>>
>> In the kernel I see the traces:
>>
>> [26108:3:c/r:restore_debug_free:145] active pid was 3, ctx->errno 0
>> [26108:3:c/r:restore_debug_free:147] kflags 6 uflags 0 oflags 3
>> [26108:3:c/r:restore_debug_free:149] task[0] to run 1
>> [26108:3:c/r:restore_debug_free:149] task[1] to run 2
>> [26108:3:c/r:restore_debug_free:149] task[2] to run 3
>> [26108:3:c/r:restore_debug_free:174] pid 26104 type Coord state Success
>> [26108:3:c/r:restore_debug_free:174] pid 26106 type Root state Success
>> [26108:3:c/r:restore_debug_free:174] pid 26107 type Task state Success
>> [26108:3:c/r:restore_debug_free:174] pid 26108 type Task state Success
>> [26108:3:c/r:pgarr_release_pages:102] total pages 0
>> [26108:3:c/r:do_restart:1446] sys_restart returns -516
>>
>> What does mean -516 ? an error ?
>>     
>
> Could it be ERESTART_RESTARTBLOCK ? Also, can you let us know what application
> causes this ? Are any signals generated ?
>   
That happens with sleep.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]         ` <4BB33C81.9070802-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
@ 2010-04-01  5:03           ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-04-01  5:03 UTC (permalink / raw)
  To: Cedric Le Goater; +Cc: dlezcano-NmTC/0ZBporQT0dZR+AlfA, Containers

Cedric Le Goater [clg-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org] wrote:
>> 	#!/bin/sh
>>
>> 	me=`basename "$0" 2>/dev/null`
>> 	ndigit=${1:-1000}
>>
>> 	echo $me - $ndigit digits
>> 	echo "scale=$ndigit;a(1)*4" | bc -l
>>
>> can you run in a terminal :
>>
>> 	$ lxc-execute -n pi1 -- ./pi1.sh 3000&
>> 	$ lxc-checkpoint -n pi1 -d ./ckpt -k
>> 	$ lxc-restart -n pi1 -d ./ckpt
>
> please try that suka, it's much more interesting than sleep which can
> silently segfault after restart.

Agree.  BTW, after the sleep I tested C/R of a VNC server with a vi session
running inside.

I tried the above program, which is definitely more interesting, but am
having trouble with stdout after restart - restart seems to succeed, but
nothing shows up on stdout. If I redirect the output of "bc -l" to a file,
the C/R (with lxc) works fine.

I am investigating the stdout issue though.

Thanks,

Sukadev

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]     ` <4BB35519.8080500-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
@ 2010-04-01  5:37       ` Oren Laadan
  0 siblings, 0 replies; 28+ messages in thread
From: Oren Laadan @ 2010-04-01  5:37 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Containers, Sukadev Bhattiprolu, clg-NmTC/0ZBporQT0dZR+AlfA



Daniel Lezcano wrote:
> Sukadev Bhattiprolu wrote:
>> lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
>> Following set of patches, when applied to LXC and built with USERCR as
>> described below, enable enable lxc-checkpoint and lxc-restart of some
>> simple containers
>>
> 
> [ ... ]
> 
>> 1. Build USERCR
>>
>> 	$ cd /root
>>
>> 	$ git-clone git://git.ncl.cs.columbia.edu/pub/git/user-cr.git user-cr
>>
>> 	$ cd user-cr
>>
>> 	$ git-checkout ckpt-v20-dev
>>
>> 	  	Tested with commit e275f77e4a82d228c1df14dbeb691342e32cdac2
>> 		as HEAD.
>> 	
>> 	# Apply following two patches:
>>
>> 	https://lists.linux-foundation.org/pipermail/containers/2010-March/024037.html
>> 	https://lists.linux-foundation.org/pipermail/containers/2010-March/024038.html
>>
>> 	$ cd /root/user-cr
>>
>> 	$ KERNELSRC=/root/linux-2.6/ make 
> 
> Assuming the kernel source tree below is right,
> 
>    http://www.linux-cr.org/git/?p=linux-cr.git;a=summary
> 
> Shall I use ckpt-v20 or ckpt-v20-dev with your patchset ?

It's in rcX state and hence not made the primary branch, but you
should probably try ckpt-v21-rc2 (kernel) and ckpt-v20-dev (for
user-cr) instead.

Oren.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]             ` <4BB3B7E1.8080608-GANU6spQydw@public.gmane.org>
  2010-03-31 21:23               ` Sukadev Bhattiprolu
@ 2010-04-01  5:43               ` Oren Laadan
  1 sibling, 0 replies; 28+ messages in thread
From: Oren Laadan @ 2010-04-01  5:43 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Containers, Sukadev Bhattiprolu, clg-NmTC/0ZBporQT0dZR+AlfA



Daniel Lezcano wrote:
> Serge E. Hallyn wrote:
>> Quoting Daniel Lezcano (dlezcano-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org):
>>> Sukadev Bhattiprolu wrote:
>>>> lxc-checkpoint, lxc-restart in the LXC source tree are currently stubs.
>>>> Following set of patches, when applied to LXC and built with USERCR as
>>>> described below, enable enable lxc-checkpoint and lxc-restart of some
>>>> simple containers
>>> [ ... ]
>>>
>>>> 3. Checkpoint/restart a simple LXC container
>>>>
>>>> 	$ lxc-execute --name foo --rcfile lxc-no-netns.conf -- /bin/sleep 1000
>>>>
>>>> 	$ lxc-checkpoint --name foo --image /root/lxc-foo.ckpt
>>>>
>>>> 	$ lxc-stop --name foo
>>>>
>>>> 	$ lxc-restart --name foo --image /root/lxc-foo.ckpt
>>> Finally, using ckpt-v20-dev, I succeeded to checkpoint sleep but
>>> when I restart, I got the error:
>>>
>>> <4534>number of tasks: 2
>>> <4534>number of vpids: 0
>>> <4534>total tasks (including ghosts): 2
>>> <4534>pid 2: inherit sid 0
>>> <4534>pid 2: creator set to 1
>>> <4534>====== TASKS
>>> <4534>	[0] pid 1 ppid 0 sid 0 creator 0
>>> <4534>	[1] pid 2 ppid 1 sid 0 creator 1
>>> <4534>............
>>> <4534>task[0].vidx = -1
>>> <4534>task[1].vidx = -1
>>> <4534>new pidns with init
>>> <4534>forking child vpid 1 flags 0x321
>>> <4534>task 1 forking with flags 20020011 numpids 1
>>> <4534>task 1 pid[0]=0
>>> <4535>====== PIDS ARRAY
>>> <4535>[0] pid 1 ppid 0 sid 0 pgid 0
>>> <4535>[1] pid 2 ppid 1 sid 0 pgid 0
>>> <4535>............
>>> Error: /dev/ptmx must be a link to /dev/pts/ptmx
>>> <4534>forked child vpid 4536 (asked 1)
>>> root task exited status 0
>>>
>>> What can I do to prevent this error ?
>> test -e /dev/pts/ptmx || (echo "Don't go through with the rest of this" && exit)
>> rm -f /dev/ptmx
>> ln -s /dev/pts/ptmx /dev/ptmx
>> chmod 666 /dev/ptmx
> 
> I was able to CR a simple program like sleep.
> 
> But most of the simple test programs I run, exit right after the restart 
> was marked successful instead of continuing their execution.
> 
> In the kernel I see the traces:
> 
> [26108:3:c/r:restore_debug_free:145] active pid was 3, ctx->errno 0
> [26108:3:c/r:restore_debug_free:147] kflags 6 uflags 0 oflags 3
> [26108:3:c/r:restore_debug_free:149] task[0] to run 1
> [26108:3:c/r:restore_debug_free:149] task[1] to run 2
> [26108:3:c/r:restore_debug_free:149] task[2] to run 3
> [26108:3:c/r:restore_debug_free:174] pid 26104 type Coord state Success
> [26108:3:c/r:restore_debug_free:174] pid 26106 type Root state Success
> [26108:3:c/r:restore_debug_free:174] pid 26107 type Task state Success
> [26108:3:c/r:restore_debug_free:174] pid 26108 type Task state Success
> [26108:3:c/r:pgarr_release_pages:102] total pages 0
> [26108:3:c/r:do_restart:1446] sys_restart returns -516
> 
> What does mean -516 ? an error ?

It means ERESTART_RESTARTBLOCK - it's the way sys_restart tells
the kernel to resume the previous sleep by reusing the exact same
mechanism in the kernel to resume a previous sleep after a signal
or freeze.

You can request that the application be frozen when restart is
complete, and then attach with a debugger and single step it to
see what's happening. See the --freezer=  option of 'restart'.

Oren.

> 
> I am running on a x86_64.
> 
> Thanks
>    -- Daniel
> 
> _______________________________________________
> Containers mailing list
> Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
> https://lists.linux-foundation.org/mailman/listinfo/containers
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 0/6][lxc][v3] Link LXC with USERCR
       [not found]                     ` <4BB3BF02.7060402-GANU6spQydw@public.gmane.org>
@ 2010-04-02  5:54                       ` Sukadev Bhattiprolu
  0 siblings, 0 replies; 28+ messages in thread
From: Sukadev Bhattiprolu @ 2010-04-02  5:54 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: Containers, clg-NmTC/0ZBporQT0dZR+AlfA

Daniel Lezcano [daniel.lezcano-GANU6spQydw@public.gmane.org] wrote:
> Sukadev Bhattiprolu wrote:
>> Daniel Lezcano [daniel.lezcano-GANU6spQydw@public.gmane.org] wrote:
>>   
>>> But most of the simple test programs I run, exit right after the 
>>> restart  was marked successful instead of continuing their execution.
>>>
>>> In the kernel I see the traces:
>>>
>>> [26108:3:c/r:restore_debug_free:145] active pid was 3, ctx->errno 0
>>> [26108:3:c/r:restore_debug_free:147] kflags 6 uflags 0 oflags 3
>>> [26108:3:c/r:restore_debug_free:149] task[0] to run 1
>>> [26108:3:c/r:restore_debug_free:149] task[1] to run 2
>>> [26108:3:c/r:restore_debug_free:149] task[2] to run 3
>>> [26108:3:c/r:restore_debug_free:174] pid 26104 type Coord state Success
>>> [26108:3:c/r:restore_debug_free:174] pid 26106 type Root state Success
>>> [26108:3:c/r:restore_debug_free:174] pid 26107 type Task state Success
>>> [26108:3:c/r:restore_debug_free:174] pid 26108 type Task state Success
>>> [26108:3:c/r:pgarr_release_pages:102] total pages 0
>>> [26108:3:c/r:do_restart:1446] sys_restart returns -516
>>>
>>> What does mean -516 ? an error ?
>>>     
>>
>> Could it be ERESTART_RESTARTBLOCK ? Also, can you let us know what application
>> causes this ? Are any signals generated ?
>>   
> That happens with sleep.

Oh, I misread earlier and thought both checkpoint and restat of sleep worked.

Anyway, when I run C/R a simple program with sleep(), I see the above errors
too, but I think they are expected if the checkpoint happened during sleep -
the system call returned prematurely and after restart the syscall returns
-ERESTART_RESTARTBLOCK which I think causes libc to repeat the syscall.

I get the above ERESTART* error in dmesg, when I lxc-checkpoint/lxc-restart
the following simple program, but, application restarts correctly and
continues to write to /tmp/foo.

If fd == 1, however, the writes to stdout do not show up on stdout
even though the application continues to run (you can strace and
see that 'i' continues to get incremented). I am chasing the stdout
problem.

Sukadev
---
#include <stdio.h>
#include <string.h>
#include <sys/fcntl.h>

main()
{
        int i, n, fd;
        char buf[256];

        fd = open("/tmp/foo", O_CREAT|O_RDWR|O_TRUNC, 0666);
        if (fd < 0) {
                perror("open()");
                exit(1);
        }

        for (i = 0; i < 1000; i++) {
                sprintf(buf, "i %d\n", i);
                n = write(fd, buf, strlen(buf));
                if (n != strlen(buf))
                        perror("write()");
                sleep(1);
        }
}

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2010-04-02  5:54 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-03-31  7:04 [PATCH 0/6][lxc][v3] Link LXC with USERCR Sukadev Bhattiprolu
     [not found] ` <20100331070440.GA21570-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-31  7:06   ` [PATCH 1/6][lxc][v3] Add --with-libcr configure option Sukadev Bhattiprolu
     [not found]     ` <20100331070633.GA23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-31  8:11       ` Michel Normand
2010-03-31 17:21         ` Sukadev Bhattiprolu
2010-03-31  7:07   ` [PATCH 2/6][lxc][v3] lxc_restart: Add --statefile option Sukadev Bhattiprolu
     [not found]     ` <20100331070711.GB23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-31  8:10       ` Michel Normand
2010-03-31  7:07   ` [PATCH 3/6][lxc][v3] lxc_checkpoint: " Sukadev Bhattiprolu
2010-03-31  7:08   ` [PATCH 4/6][lxc][v3] Move get_init_pid() into checkpoint.c Sukadev Bhattiprolu
     [not found]     ` <20100331070848.GD23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-31  8:17       ` Cedric Le Goater
2010-03-31  7:09   ` [PATCH 5/6][lxc][v3] Hook up lxc_restart() with app_restart() Sukadev Bhattiprolu
2010-03-31  7:10   ` [PATCH 6/6][lxc][v3] Hook up lxc_checkpoint() with app_checkpoint() Sukadev Bhattiprolu
     [not found]     ` <20100331071016.GF23567-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-31  8:08       ` Michel Normand
2010-03-31  8:18       ` Cedric Le Goater
2010-03-31  9:29   ` [PATCH 0/6][lxc][v3] Link LXC with USERCR Michel Normand
2010-03-31  9:38   ` Cedric Le Goater
     [not found]     ` <4BB31801.4000304-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2010-03-31 12:13       ` Cedric Le Goater
     [not found]         ` <4BB33C81.9070802-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2010-04-01  5:03           ` Sukadev Bhattiprolu
2010-03-31 13:58   ` Daniel Lezcano
     [not found]     ` <4BB35519.8080500-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2010-04-01  5:37       ` Oren Laadan
2010-03-31 16:31   ` Daniel Lezcano
2010-03-31 19:58   ` Daniel Lezcano
     [not found]     ` <4BB3A981.4020709-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2010-03-31 20:12       ` Serge E. Hallyn
     [not found]         ` <20100331201240.GA26773-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-31 20:22           ` Daniel Lezcano
2010-03-31 21:00           ` Daniel Lezcano
     [not found]             ` <4BB3B7E1.8080608-GANU6spQydw@public.gmane.org>
2010-03-31 21:23               ` Sukadev Bhattiprolu
     [not found]                 ` <20100331212359.GA18934-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2010-03-31 21:30                   ` Daniel Lezcano
     [not found]                     ` <4BB3BF02.7060402-GANU6spQydw@public.gmane.org>
2010-04-02  5:54                       ` Sukadev Bhattiprolu
2010-04-01  5:43               ` Oren Laadan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.