* [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work
@ 2008-10-09 19:04 Dave Hansen
2008-10-09 19:04 ` [RFC][PATCH 2/2] first callers of process_deny_checkpoint() Dave Hansen
2008-10-10 8:20 ` [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work Greg Kurz
0 siblings, 2 replies; 35+ messages in thread
From: Dave Hansen @ 2008-10-09 19:04 UTC (permalink / raw)
To: Oren Laadan; +Cc: arnd, containers, linux-kernel, Ingo Molnar, Dave Hansen
Suggested by Ingo.
Checkpoint/restart is going to be a long effort to get things working.
We're going to have a lot of things that we know just don't work for
a long time. That doesn't mean that it will be useless, it just means
that there's some complicated features that we are going to have to
work incrementally to fix.
This patch introduces a new mechanism to help the checkpoint/restart
developers. A new function pair: task/process_deny_checkpoint() is
created. When called, these tell the kernel that we *know* that the
process has performed some activity that will keep it from being
properly checkpointed.
The 'flag' is an atomic_t for now so that we can have some level
of atomicity and make sure to only warn once.
For now, this is a one-way trip. Once a process is no longer
'may_checkpoint' capable, neither it nor its children ever will be.
This can, of course, be fixed up in the future. We might want to
reset the flag when a new pid namespace is created, for instance.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
---
linux-2.6.git-dave/include/linux/checkpoint.h | 31 +++++++++++++++++++++++++-
linux-2.6.git-dave/include/linux/sched.h | 3 ++
linux-2.6.git-dave/kernel/fork.c | 10 ++++++++
3 files changed, 43 insertions(+), 1 deletion(-)
diff -puN include/linux/checkpoint.h~flag include/linux/checkpoint.h
--- linux-2.6.git/include/linux/checkpoint.h~flag 2008-10-09 11:56:57.000000000 -0700
+++ linux-2.6.git-dave/include/linux/checkpoint.h 2008-10-09 11:56:57.000000000 -0700
@@ -10,8 +10,11 @@
* distribution for more details.
*/
-#include <linux/path.h>
#include <linux/fs.h>
+#include <linux/path.h>
+#include <linux/sched.h>
+
+#ifdef CONFIG_CHECKPOINT_RESTART
#define CR_VERSION 2
@@ -94,4 +97,30 @@ extern void file_pos_write(struct file *
#define cr_debug(fmt, args...) \
pr_debug("[CR:%s] " fmt, __func__, ## args)
+static inline void __task_deny_checkpointing(struct task_struct *task,
+ char *file, int line)
+{
+ if(!atomic_dec_and_test(&task->may_checkpoint))
+ return;
+ printk(KERN_INFO "process performed an action that can not be "
+ "checkpointed at: %s:%d\n", file, line);
+ WARN_ON(1);
+}
+#define process_deny_checkpointing(p) __task_deny_checkpointing(p, __FILE__, __LINE__)
+
+/*
+ * For now, we're not going to have a distinction between
+ * tasks and processes for the purpose of c/r. But, allow
+ * these two calls anyway to make new users at least think
+ * about it.
+ */
+#define task_deny_checkpointing(p) __task_deny_checkpointing(p, __FILE__, __LINE__)
+
+#else
+
+static inline void task_deny_checkpointing(struct task_struct *task) {}
+static inline void process_deny_checkpointing(struct task_struct *task) {}
+
+#endif
+
#endif /* _CHECKPOINT_CKPT_H_ */
diff -puN include/linux/sched.h~flag include/linux/sched.h
--- linux-2.6.git/include/linux/sched.h~flag 2008-10-09 11:56:57.000000000 -0700
+++ linux-2.6.git-dave/include/linux/sched.h 2008-10-09 11:56:57.000000000 -0700
@@ -1301,6 +1301,9 @@ struct task_struct {
int latency_record_count;
struct latency_record latency_record[LT_SAVECOUNT];
#endif
+#ifdef CONFIG_CHECKPOINT_RESTART
+ atomic_t may_checkpoint;
+#endif
};
/*
diff -puN kernel/fork.c~flag kernel/fork.c
--- linux-2.6.git/kernel/fork.c~flag 2008-10-09 11:56:57.000000000 -0700
+++ linux-2.6.git-dave/kernel/fork.c 2008-10-09 11:56:57.000000000 -0700
@@ -194,6 +194,13 @@ void __init fork_init(unsigned long memp
init_task.signal->rlim[RLIMIT_NPROC].rlim_max = max_threads/2;
init_task.signal->rlim[RLIMIT_SIGPENDING] =
init_task.signal->rlim[RLIMIT_NPROC];
+
+#ifdef CONFIG_CHECKPOINT_RESTART
+ /*
+ * This probably won't stay set for long...
+ */
+ atomic_set(&init_task.may_checkpoint, 1);
+#endif
}
int __attribute__((weak)) arch_dup_task_struct(struct task_struct *dst,
@@ -244,6 +251,9 @@ static struct task_struct *dup_task_stru
tsk->btrace_seq = 0;
#endif
tsk->splice_pipe = NULL;
+#ifdef CONFIG_CHECKPOINT_RESTART
+ atomic_set(&tsk->may_checkpoint, atomic_read(&orig->may_checkpoint));
+#endif
return tsk;
out:
_
^ permalink raw reply [flat|nested] 35+ messages in thread* [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-09 19:04 [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work Dave Hansen @ 2008-10-09 19:04 ` Dave Hansen 2008-10-09 19:43 ` Serge E. Hallyn ` (2 more replies) 2008-10-10 8:20 ` [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work Greg Kurz 1 sibling, 3 replies; 35+ messages in thread From: Dave Hansen @ 2008-10-09 19:04 UTC (permalink / raw) To: Oren Laadan; +Cc: arnd, containers, linux-kernel, Ingo Molnar, Dave Hansen These are just a few simple examples of things we know we can't checkpoint now. There are plenty more, but this should give everyone an idea how this will look in practice. Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> --- linux-2.6.git-dave/fs/aio.c | 7 +++++++ linux-2.6.git-dave/ipc/mqueue.c | 3 +++ linux-2.6.git-dave/net/socket.c | 5 +++++ 3 files changed, 15 insertions(+) diff -puN fs/aio.c~no-checkpointing-for-sockets fs/aio.c --- linux-2.6.git/fs/aio.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 +++ linux-2.6.git-dave/fs/aio.c 2008-10-09 11:56:58.000000000 -0700 @@ -19,6 +19,7 @@ #define DEBUG 0 +#include <linux/checkpoint.h> #include <linux/sched.h> #include <linux/fs.h> #include <linux/file.h> @@ -1663,6 +1664,8 @@ asmlinkage long sys_io_submit(aio_contex if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(*iocbpp))))) return -EFAULT; + process_deny_checkpointing(current); + ctx = lookup_ioctx(ctx_id); if (unlikely(!ctx)) { pr_debug("EINVAL: io_submit: invalid context id\n"); @@ -1742,6 +1745,8 @@ asmlinkage long sys_io_cancel(aio_contex if (unlikely(!ctx)) return -EINVAL; + process_deny_checkpointing(current); + spin_lock_irq(&ctx->ctx_lock); ret = -EAGAIN; kiocb = lookup_kiocb(ctx, iocb, key); @@ -1796,6 +1801,8 @@ asmlinkage long sys_io_getevents(aio_con struct kioctx *ioctx = lookup_ioctx(ctx_id); long ret = -EINVAL; + process_deny_checkpointing(current); + if (likely(ioctx)) { if (likely(min_nr <= nr && min_nr >= 0 && nr >= 0)) ret = read_events(ioctx, min_nr, nr, events, timeout); diff -puN ipc/mqueue.c~no-checkpointing-for-sockets ipc/mqueue.c --- linux-2.6.git/ipc/mqueue.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 +++ linux-2.6.git-dave/ipc/mqueue.c 2008-10-09 11:56:58.000000000 -0700 @@ -14,6 +14,7 @@ */ #include <linux/capability.h> +#include <linux/checkpoint.h> #include <linux/init.h> #include <linux/pagemap.h> #include <linux/file.h> @@ -655,6 +656,8 @@ asmlinkage long sys_mq_open(const char _ char *name; int fd, error; + process_deny_checkpointing(current); + error = audit_mq_open(oflag, mode, u_attr); if (error != 0) return error; diff -puN net/socket.c~no-checkpointing-for-sockets net/socket.c --- linux-2.6.git/net/socket.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 +++ linux-2.6.git-dave/net/socket.c 2008-10-09 11:56:58.000000000 -0700 @@ -87,6 +87,7 @@ #include <linux/audit.h> #include <linux/wireless.h> #include <linux/nsproxy.h> +#include <linux/checkpoint.h> #include <asm/uaccess.h> #include <asm/unistd.h> @@ -1236,6 +1237,8 @@ asmlinkage long sys_socket(int family, i if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK)) flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK; + process_deny_checkpointing(current); + retval = sock_create(family, type, protocol, &sock); if (retval < 0) goto out; @@ -2130,6 +2133,8 @@ asmlinkage long sys_socketcall(int call, a0 = a[0]; a1 = a[1]; + process_deny_checkpointing(current); + switch (call) { case SYS_SOCKET: err = sys_socket(a0, a1, a[2]); _ ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-09 19:04 ` [RFC][PATCH 2/2] first callers of process_deny_checkpoint() Dave Hansen @ 2008-10-09 19:43 ` Serge E. Hallyn 2008-10-09 20:54 ` Dave Hansen 2008-10-10 8:41 ` Daniel Lezcano 2008-10-10 10:17 ` Cedric Le Goater 2 siblings, 1 reply; 35+ messages in thread From: Serge E. Hallyn @ 2008-10-09 19:43 UTC (permalink / raw) To: Dave Hansen; +Cc: Oren Laadan, containers, Ingo Molnar, linux-kernel, arnd Quoting Dave Hansen (dave@linux.vnet.ibm.com): > > These are just a few simple examples of things we know we can't > checkpoint now. There are plenty more, but this should give > everyone an idea how this will look in practice. > > > Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> > --- > > linux-2.6.git-dave/fs/aio.c | 7 +++++++ > linux-2.6.git-dave/ipc/mqueue.c | 3 +++ > linux-2.6.git-dave/net/socket.c | 5 +++++ > 3 files changed, 15 insertions(+) > > diff -puN fs/aio.c~no-checkpointing-for-sockets fs/aio.c > --- linux-2.6.git/fs/aio.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 > +++ linux-2.6.git-dave/fs/aio.c 2008-10-09 11:56:58.000000000 -0700 > @@ -19,6 +19,7 @@ > > #define DEBUG 0 > > +#include <linux/checkpoint.h> > #include <linux/sched.h> > #include <linux/fs.h> > #include <linux/file.h> > @@ -1663,6 +1664,8 @@ asmlinkage long sys_io_submit(aio_contex > if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(*iocbpp))))) > return -EFAULT; > > + process_deny_checkpointing(current); > + > ctx = lookup_ioctx(ctx_id); > if (unlikely(!ctx)) { > pr_debug("EINVAL: io_submit: invalid context id\n"); > @@ -1742,6 +1745,8 @@ asmlinkage long sys_io_cancel(aio_contex > if (unlikely(!ctx)) > return -EINVAL; > > + process_deny_checkpointing(current); > + > spin_lock_irq(&ctx->ctx_lock); > ret = -EAGAIN; > kiocb = lookup_kiocb(ctx, iocb, key); > @@ -1796,6 +1801,8 @@ asmlinkage long sys_io_getevents(aio_con > struct kioctx *ioctx = lookup_ioctx(ctx_id); > long ret = -EINVAL; > > + process_deny_checkpointing(current); Hmm, I don't know too much about aio, but is it possible to succeed with io_getevents if we didn't first do a submit? It looks like the contexts are looked up out of current->mm, so I don't think we need this call here. Otherwise, this is neat. -serge > + > if (likely(ioctx)) { > if (likely(min_nr <= nr && min_nr >= 0 && nr >= 0)) > ret = read_events(ioctx, min_nr, nr, events, timeout); > diff -puN ipc/mqueue.c~no-checkpointing-for-sockets ipc/mqueue.c > --- linux-2.6.git/ipc/mqueue.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 > +++ linux-2.6.git-dave/ipc/mqueue.c 2008-10-09 11:56:58.000000000 -0700 > @@ -14,6 +14,7 @@ > */ > > #include <linux/capability.h> > +#include <linux/checkpoint.h> > #include <linux/init.h> > #include <linux/pagemap.h> > #include <linux/file.h> > @@ -655,6 +656,8 @@ asmlinkage long sys_mq_open(const char _ > char *name; > int fd, error; > > + process_deny_checkpointing(current); > + > error = audit_mq_open(oflag, mode, u_attr); > if (error != 0) > return error; > diff -puN net/socket.c~no-checkpointing-for-sockets net/socket.c > --- linux-2.6.git/net/socket.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 > +++ linux-2.6.git-dave/net/socket.c 2008-10-09 11:56:58.000000000 -0700 > @@ -87,6 +87,7 @@ > #include <linux/audit.h> > #include <linux/wireless.h> > #include <linux/nsproxy.h> > +#include <linux/checkpoint.h> > > #include <asm/uaccess.h> > #include <asm/unistd.h> > @@ -1236,6 +1237,8 @@ asmlinkage long sys_socket(int family, i > if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK)) > flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK; > > + process_deny_checkpointing(current); > + > retval = sock_create(family, type, protocol, &sock); > if (retval < 0) > goto out; > @@ -2130,6 +2133,8 @@ asmlinkage long sys_socketcall(int call, > a0 = a[0]; > a1 = a[1]; > > + process_deny_checkpointing(current); > + > switch (call) { > case SYS_SOCKET: > err = sys_socket(a0, a1, a[2]); > _ > _______________________________________________ > Containers mailing list > Containers@lists.linux-foundation.org > https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-09 19:43 ` Serge E. Hallyn @ 2008-10-09 20:54 ` Dave Hansen 2008-10-10 8:46 ` Ingo Molnar 2008-10-10 10:27 ` Cedric Le Goater 0 siblings, 2 replies; 35+ messages in thread From: Dave Hansen @ 2008-10-09 20:54 UTC (permalink / raw) To: Serge E. Hallyn; +Cc: containers, Ingo Molnar, arnd, linux-kernel On Thu, 2008-10-09 at 14:43 -0500, Serge E. Hallyn wrote: > Hmm, I don't know too much about aio, but is it possible to succeed with > io_getevents if we didn't first do a submit? It looks like the contexts > are looked up out of current->mm, so I don't think we need this call > here. > > Otherwise, this is neat. Good question. I know nothing, either. :) My thought was that any process *trying* to do aio stuff of any kind is going to be really confused if it gets checkpointed. Or, it might try to submit an aio right after it checks the list of them. I thought it best to be cautious and say, if you screw with aio, no checkpointing for you! -- Dave ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-09 20:54 ` Dave Hansen @ 2008-10-10 8:46 ` Ingo Molnar 2008-10-10 13:17 ` Rafael J. Wysocki 2008-10-11 13:48 ` Pavel Machek 2008-10-10 10:27 ` Cedric Le Goater 1 sibling, 2 replies; 35+ messages in thread From: Ingo Molnar @ 2008-10-10 8:46 UTC (permalink / raw) To: Dave Hansen Cc: Serge E. Hallyn, containers, arnd, linux-kernel, Rafael J. Wysocki, Arjan van de Ven, Peter Zijlstra * Dave Hansen <dave@linux.vnet.ibm.com> wrote: > On Thu, 2008-10-09 at 14:43 -0500, Serge E. Hallyn wrote: > > Hmm, I don't know too much about aio, but is it possible to succeed with > > io_getevents if we didn't first do a submit? It looks like the contexts > > are looked up out of current->mm, so I don't think we need this call > > here. > > > > Otherwise, this is neat. > > Good question. I know nothing, either. :) > > My thought was that any process *trying* to do aio stuff of any kind > is going to be really confused if it gets checkpointed. Or, it might > try to submit an aio right after it checks the list of them. I > thought it best to be cautious and say, if you screw with aio, no > checkpointing for you! as long as there's total transparency and the transition from CR-capable to CR-disabled state is absolutely safe and race-free, that should be fine. I expect users to quickly cause enough pressure to reduce the NOCR areas of the kernel significantly ;-) In the long run, could we expect a (experimental) version of hibernation that would just use this checkpointing facility to hibernate? That would be way cool for users and for testing: we could do transparent kernel upgrades/downgrades via this form of hibernation, between CR-compatible kernels (!). distros could mark certain kernels as 'safe fallback' kernels, and if say a WARN_ON() or app breakage hits that is suspected to be kernel related, the user could hit a 'switch back to safe kernel' button - which would switch back _without losing any app state_. People could even try new versions of the kernel which would just fall back to the known-workin safe kernel if the bootup fails for example. Pie in the sky for sure, but way cool: it could propel Linux kernel testing to completely new areas - new kernels could be tried non-intrusively. (as long as a new kernel does not corrupt the CR data structures - so some good consistency and redundancy checking would be nice in the format!) Ingo ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 8:46 ` Ingo Molnar @ 2008-10-10 13:17 ` Rafael J. Wysocki 2008-10-10 14:54 ` Ingo Molnar 2008-10-11 13:48 ` Pavel Machek 1 sibling, 1 reply; 35+ messages in thread From: Rafael J. Wysocki @ 2008-10-10 13:17 UTC (permalink / raw) To: Ingo Molnar Cc: Dave Hansen, Serge E. Hallyn, containers, arnd, linux-kernel, Arjan van de Ven, Peter Zijlstra On Friday, 10 of October 2008, Ingo Molnar wrote: > > * Dave Hansen <dave@linux.vnet.ibm.com> wrote: > > > On Thu, 2008-10-09 at 14:43 -0500, Serge E. Hallyn wrote: > > > Hmm, I don't know too much about aio, but is it possible to succeed with > > > io_getevents if we didn't first do a submit? It looks like the contexts > > > are looked up out of current->mm, so I don't think we need this call > > > here. > > > > > > Otherwise, this is neat. > > > > Good question. I know nothing, either. :) > > > > My thought was that any process *trying* to do aio stuff of any kind > > is going to be really confused if it gets checkpointed. Or, it might > > try to submit an aio right after it checks the list of them. I > > thought it best to be cautious and say, if you screw with aio, no > > checkpointing for you! > > as long as there's total transparency and the transition from CR-capable > to CR-disabled state is absolutely safe and race-free, that should be > fine. > > I expect users to quickly cause enough pressure to reduce the NOCR areas > of the kernel significantly ;-) > > In the long run, could we expect a (experimental) version of hibernation > that would just use this checkpointing facility to hibernate? Surely not ACPI-compliant. Apart from this I don't see why not, but OTOH I'm not particularly interested in implementing that. Thanks, Rafael ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 13:17 ` Rafael J. Wysocki @ 2008-10-10 14:54 ` Ingo Molnar 2008-10-10 19:53 ` Rafael J. Wysocki 0 siblings, 1 reply; 35+ messages in thread From: Ingo Molnar @ 2008-10-10 14:54 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Dave Hansen, Serge E. Hallyn, containers, arnd, linux-kernel, Arjan van de Ven, Peter Zijlstra * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > In the long run, could we expect a (experimental) version of > > hibernation that would just use this checkpointing facility to > > hibernate? > > Surely not ACPI-compliant. what do you mean? Ingo ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 14:54 ` Ingo Molnar @ 2008-10-10 19:53 ` Rafael J. Wysocki 2008-10-10 19:53 ` Ingo Molnar 0 siblings, 1 reply; 35+ messages in thread From: Rafael J. Wysocki @ 2008-10-10 19:53 UTC (permalink / raw) To: Ingo Molnar Cc: Dave Hansen, Serge E. Hallyn, containers, arnd, linux-kernel, Arjan van de Ven, Peter Zijlstra, ACPI Devel Maling List On Friday, 10 of October 2008, Ingo Molnar wrote: > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > > In the long run, could we expect a (experimental) version of > > > hibernation that would just use this checkpointing facility to > > > hibernate? > > > > Surely not ACPI-compliant. > > what do you mean? The ACPI spec says quite specifically what should be done while entering hibernation and during resume from hibernation. We're not following that in the current code, but we can (gradually) update the code to become ACPI-compilant in that respect. However, if we go the checkpointing route, I don't think that will be possible any more. [In short, the problem is that ACPI regards the S4 state corresponding to hibernation as a sleep state of the system which is therefore fundamentally different from the soft power-off state and requires special handling.] This may be a theory etc. (I don't want to start the entire discussion about that once again), but clearly there's a choice to be made here. I'd prefer hibernation to be ACPI-compliant, but if people don't want that, I won't fight for it. Thanks, Rafael ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 19:53 ` Rafael J. Wysocki @ 2008-10-10 19:53 ` Ingo Molnar 2008-10-10 20:40 ` Len Brown 2008-10-10 22:57 ` Rafael J. Wysocki 0 siblings, 2 replies; 35+ messages in thread From: Ingo Molnar @ 2008-10-10 19:53 UTC (permalink / raw) To: Rafael J. Wysocki Cc: Dave Hansen, Serge E. Hallyn, containers, arnd, linux-kernel, Arjan van de Ven, Peter Zijlstra, ACPI Devel Maling List * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > Surely not ACPI-compliant. > > > > what do you mean? > > The ACPI spec says quite specifically what should be done while > entering hibernation and during resume from hibernation. We're not > following that in the current code, but we can (gradually) update the > code to become ACPI-compilant in that respect. However, if we go the > checkpointing route, I don't think that will be possible any more. ah, i see. I did not mean to utilize any ACPI paths but simple powerdown or reboot. If we checkpoint all apps to persistent disk areas (which the checkpoint patches in this thread are about), then we can just reboot the kernel and forget all its state. That capability can be used to build a really robust hibernation implementation IMO: we could "hibernate/kexec" over between different kernel versions transparently. (only a small delay will be noticed by the user - if we do it smartly with in-kernel modesetting then not even the screen contents will be changed over this.) Ingo ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 19:53 ` Ingo Molnar @ 2008-10-10 20:40 ` Len Brown 2008-10-10 22:57 ` Rafael J. Wysocki 1 sibling, 0 replies; 35+ messages in thread From: Len Brown @ 2008-10-10 20:40 UTC (permalink / raw) To: Ingo Molnar Cc: Rafael J. Wysocki, Dave Hansen, Serge E. Hallyn, containers, arnd, linux-kernel, Arjan van de Ven, Peter Zijlstra, ACPI Devel Maling List On Fri, 10 Oct 2008, Ingo Molnar wrote: > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > > > Surely not ACPI-compliant. > > > > > > what do you mean? > > > > The ACPI spec says quite specifically what should be done while > > entering hibernation and during resume from hibernation. We're not > > following that in the current code, but we can (gradually) update the > > code to become ACPI-compilant in that respect. However, if we go the > > checkpointing route, I don't think that will be possible any more. > > ah, i see. I did not mean to utilize any ACPI paths but simple powerdown > or reboot. If we don't enter ACPI S4, and instead poweroff, then we'll lose the capability to wake the system from devices that are capable of waking S4, but incapable of waking S5. ie. The power button will still work, but others may not. cheers, -Len > If we checkpoint all apps to persistent disk areas (which the checkpoint > patches in this thread are about), then we can just reboot the kernel > and forget all its state. > > That capability can be used to build a really robust hibernation > implementation IMO: we could "hibernate/kexec" over between different > kernel versions transparently. (only a small delay will be noticed by > the user - if we do it smartly with in-kernel modesetting then not even > the screen contents will be changed over this.) ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 19:53 ` Ingo Molnar 2008-10-10 20:40 ` Len Brown @ 2008-10-10 22:57 ` Rafael J. Wysocki 1 sibling, 0 replies; 35+ messages in thread From: Rafael J. Wysocki @ 2008-10-10 22:57 UTC (permalink / raw) To: Ingo Molnar Cc: Dave Hansen, Serge E. Hallyn, containers, arnd, linux-kernel, Arjan van de Ven, Peter Zijlstra, ACPI Devel Maling List On Friday, 10 of October 2008, Ingo Molnar wrote: > > * Rafael J. Wysocki <rjw@sisk.pl> wrote: > > > > > Surely not ACPI-compliant. > > > > > > what do you mean? > > > > The ACPI spec says quite specifically what should be done while > > entering hibernation and during resume from hibernation. We're not > > following that in the current code, but we can (gradually) update the > > code to become ACPI-compilant in that respect. However, if we go the > > checkpointing route, I don't think that will be possible any more. > > ah, i see. I did not mean to utilize any ACPI paths but simple powerdown > or reboot. > > If we checkpoint all apps to persistent disk areas (which the checkpoint > patches in this thread are about), then we can just reboot the kernel > and forget all its state. > > That capability can be used to build a really robust hibernation > implementation IMO: we could "hibernate/kexec" over between different > kernel versions transparently. (only a small delay will be noticed by > the user - if we do it smartly with in-kernel modesetting then not even > the screen contents will be changed over this.) That actually should be called a migration of VM IMO and would be a useful functionality. Sure. Hibernation, however, generally involves the restoration of the hardware and most importantly _platform_ state which IMO is impossible without the ACPI functionality, as well as wake-up, which may depend on ACPI too. Thanks, Rafael ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 8:46 ` Ingo Molnar 2008-10-10 13:17 ` Rafael J. Wysocki @ 2008-10-11 13:48 ` Pavel Machek 2008-10-11 15:00 ` Ingo Molnar 1 sibling, 1 reply; 35+ messages in thread From: Pavel Machek @ 2008-10-11 13:48 UTC (permalink / raw) To: Ingo Molnar Cc: Dave Hansen, Serge E. Hallyn, containers, arnd, linux-kernel, Rafael J. Wysocki, Arjan van de Ven, Peter Zijlstra Hi! > > > Hmm, I don't know too much about aio, but is it possible to succeed with > > > io_getevents if we didn't first do a submit? It looks like the contexts > > > are looked up out of current->mm, so I don't think we need this call > > > here. > > > > > > Otherwise, this is neat. > > > > Good question. I know nothing, either. :) > > > > My thought was that any process *trying* to do aio stuff of any kind > > is going to be really confused if it gets checkpointed. Or, it might > > try to submit an aio right after it checks the list of them. I > > thought it best to be cautious and say, if you screw with aio, no > > checkpointing for you! > > as long as there's total transparency and the transition from CR-capable > to CR-disabled state is absolutely safe and race-free, that should be > fine. > > I expect users to quickly cause enough pressure to reduce the NOCR areas > of the kernel significantly ;-) > > In the long run, could we expect a (experimental) version of hibernation > that would just use this checkpointing facility to hibernate? That would > be way cool for users and for testing: we could do transparent kernel > upgrades/downgrades via this form of hibernation, between CR-compatible > kernels (!). Well, if we could do that, I guess we could also use CR to 'hibernate' your desktop then continue on your notebook. And yes that sounds cool. > Pie in the sky for sure, but way cool: it could propel Linux kernel > testing to completely new areas - new kernels could be tried > non-intrusively. (as long as a new kernel does not corrupt the CR data > structures - so some good consistency and redundancy checking would be > nice in the format!) Well, for simple apps, it should not be that hard... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-11 13:48 ` Pavel Machek @ 2008-10-11 15:00 ` Ingo Molnar 0 siblings, 0 replies; 35+ messages in thread From: Ingo Molnar @ 2008-10-11 15:00 UTC (permalink / raw) To: Pavel Machek Cc: Dave Hansen, Serge E. Hallyn, containers, arnd, linux-kernel, Rafael J. Wysocki, Arjan van de Ven, Peter Zijlstra * Pavel Machek <pavel@suse.cz> wrote: > > In the long run, could we expect a (experimental) version of > > hibernation that would just use this checkpointing facility to > > hibernate? That would be way cool for users and for testing: we > > could do transparent kernel upgrades/downgrades via this form of > > hibernation, between CR-compatible kernels (!). > > Well, if we could do that, I guess we could also use CR to 'hibernate' > your desktop then continue on your notebook. And yes that sounds cool. yes. > > Pie in the sky for sure, but way cool: it could propel Linux kernel > > testing to completely new areas - new kernels could be tried > > non-intrusively. (as long as a new kernel does not corrupt the CR > > data structures - so some good consistency and redundancy checking > > would be nice in the format!) > > Well, for simple apps, it should not be that hard... Generally, if something works for simple apps already (in a robust, compatible and supportable way) and users find it "very cool", then support for more complex apps is not far in the future. but if you want to support more complex apps straight away, it takes forever and gets ugly. Ingo ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-09 20:54 ` Dave Hansen 2008-10-10 8:46 ` Ingo Molnar @ 2008-10-10 10:27 ` Cedric Le Goater 1 sibling, 0 replies; 35+ messages in thread From: Cedric Le Goater @ 2008-10-10 10:27 UTC (permalink / raw) To: Dave Hansen; +Cc: Serge E. Hallyn, containers, Ingo Molnar, arnd, linux-kernel Dave Hansen wrote: > On Thu, 2008-10-09 at 14:43 -0500, Serge E. Hallyn wrote: >> Hmm, I don't know too much about aio, but is it possible to succeed with >> io_getevents if we didn't first do a submit? It looks like the contexts >> are looked up out of current->mm, so I don't think we need this call >> here. >> >> Otherwise, this is neat. > > Good question. I know nothing, either. :) > > My thought was that any process *trying* to do aio stuff of any kind is > going to be really confused if it gets checkpointed. Or, it might try > to submit an aio right after it checks the list of them. I thought it > best to be cautious and say, if you screw with aio, no checkpointing for > you! IMO, the right path would be to deny checkpoint if you have aio requests pending but if you have not, aio's are not an issue for checkpoint anymore. but that's one easy one to track ... C. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-09 19:04 ` [RFC][PATCH 2/2] first callers of process_deny_checkpoint() Dave Hansen 2008-10-09 19:43 ` Serge E. Hallyn @ 2008-10-10 8:41 ` Daniel Lezcano 2008-10-10 10:17 ` Cedric Le Goater 2 siblings, 0 replies; 35+ messages in thread From: Daniel Lezcano @ 2008-10-10 8:41 UTC (permalink / raw) To: Dave Hansen; +Cc: Oren Laadan, containers, Ingo Molnar, linux-kernel, arnd Dave Hansen wrote: > These are just a few simple examples of things we know we can't > checkpoint now. There are plenty more, but this should give > everyone an idea how this will look in practice. > > > Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> > --- > > linux-2.6.git-dave/fs/aio.c | 7 +++++++ > linux-2.6.git-dave/ipc/mqueue.c | 3 +++ > linux-2.6.git-dave/net/socket.c | 5 +++++ > 3 files changed, 15 insertions(+) > > diff -puN fs/aio.c~no-checkpointing-for-sockets fs/aio.c > --- linux-2.6.git/fs/aio.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 > +++ linux-2.6.git-dave/fs/aio.c 2008-10-09 11:56:58.000000000 -0700 > @@ -19,6 +19,7 @@ > > #define DEBUG 0 > > +#include <linux/checkpoint.h> > #include <linux/sched.h> > #include <linux/fs.h> > #include <linux/file.h> > @@ -1663,6 +1664,8 @@ asmlinkage long sys_io_submit(aio_contex > if (unlikely(!access_ok(VERIFY_READ, iocbpp, (nr*sizeof(*iocbpp))))) > return -EFAULT; > > + process_deny_checkpointing(current); > + > ctx = lookup_ioctx(ctx_id); > if (unlikely(!ctx)) { > pr_debug("EINVAL: io_submit: invalid context id\n"); > @@ -1742,6 +1745,8 @@ asmlinkage long sys_io_cancel(aio_contex > if (unlikely(!ctx)) > return -EINVAL; > > + process_deny_checkpointing(current); > + > spin_lock_irq(&ctx->ctx_lock); > ret = -EAGAIN; > kiocb = lookup_kiocb(ctx, iocb, key); > @@ -1796,6 +1801,8 @@ asmlinkage long sys_io_getevents(aio_con > struct kioctx *ioctx = lookup_ioctx(ctx_id); > long ret = -EINVAL; > > + process_deny_checkpointing(current); > + > if (likely(ioctx)) { > if (likely(min_nr <= nr && min_nr >= 0 && nr >= 0)) > ret = read_events(ioctx, min_nr, nr, events, timeout); > diff -puN ipc/mqueue.c~no-checkpointing-for-sockets ipc/mqueue.c > --- linux-2.6.git/ipc/mqueue.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 > +++ linux-2.6.git-dave/ipc/mqueue.c 2008-10-09 11:56:58.000000000 -0700 > @@ -14,6 +14,7 @@ > */ > > #include <linux/capability.h> > +#include <linux/checkpoint.h> > #include <linux/init.h> > #include <linux/pagemap.h> > #include <linux/file.h> > @@ -655,6 +656,8 @@ asmlinkage long sys_mq_open(const char _ > char *name; > int fd, error; > > + process_deny_checkpointing(current); > + > error = audit_mq_open(oflag, mode, u_attr); > if (error != 0) > return error; > diff -puN net/socket.c~no-checkpointing-for-sockets net/socket.c > --- linux-2.6.git/net/socket.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 > +++ linux-2.6.git-dave/net/socket.c 2008-10-09 11:56:58.000000000 -0700 > @@ -87,6 +87,7 @@ > #include <linux/audit.h> > #include <linux/wireless.h> > #include <linux/nsproxy.h> > +#include <linux/checkpoint.h> > > #include <asm/uaccess.h> > #include <asm/unistd.h> > @@ -1236,6 +1237,8 @@ asmlinkage long sys_socket(int family, i > if (SOCK_NONBLOCK != O_NONBLOCK && (flags & SOCK_NONBLOCK)) > flags = (flags & ~SOCK_NONBLOCK) | O_NONBLOCK; > > + process_deny_checkpointing(current); > + > retval = sock_create(family, type, protocol, &sock); > if (retval < 0) > goto out; > @@ -2130,6 +2133,8 @@ asmlinkage long sys_socketcall(int call, > a0 = a[0]; > a1 = a[1]; > > + process_deny_checkpointing(current); > + > switch (call) { > case SYS_SOCKET: > err = sys_socket(a0, a1, a[2]); That seems to be a good idea. There isn't a risk of a big propagation of this function all around the kernel code ? Especially if there are partial support for a specific resource ? I mean we are able to checkpoint ipv4/tcp sockets, (yeah I like socket examples :) ) , but not other protocols so we have to remove from sys_socket the process_deny_checkpoint and add it to all others protocols. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-09 19:04 ` [RFC][PATCH 2/2] first callers of process_deny_checkpoint() Dave Hansen 2008-10-09 19:43 ` Serge E. Hallyn 2008-10-10 8:41 ` Daniel Lezcano @ 2008-10-10 10:17 ` Cedric Le Goater 2008-10-10 14:04 ` Serge E. Hallyn 2 siblings, 1 reply; 35+ messages in thread From: Cedric Le Goater @ 2008-10-10 10:17 UTC (permalink / raw) To: Dave Hansen; +Cc: Oren Laadan, containers, Ingo Molnar, linux-kernel, arnd > diff -puN ipc/mqueue.c~no-checkpointing-for-sockets ipc/mqueue.c > --- linux-2.6.git/ipc/mqueue.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 > +++ linux-2.6.git-dave/ipc/mqueue.c 2008-10-09 11:56:58.000000000 -0700 > @@ -14,6 +14,7 @@ > */ > > #include <linux/capability.h> > +#include <linux/checkpoint.h> > #include <linux/init.h> > #include <linux/pagemap.h> > #include <linux/file.h> > @@ -655,6 +656,8 @@ asmlinkage long sys_mq_open(const char _ > char *name; > int fd, error; > > + process_deny_checkpointing(current); > + mqueue being a file system, i would put the checks in the inode_operations. Also, you can't always deny ! I would expect some allow in sys_mq_unlink(). C. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 10:17 ` Cedric Le Goater @ 2008-10-10 14:04 ` Serge E. Hallyn 2008-10-10 16:45 ` Greg Kurz 0 siblings, 1 reply; 35+ messages in thread From: Serge E. Hallyn @ 2008-10-10 14:04 UTC (permalink / raw) To: Cedric Le Goater; +Cc: Dave Hansen, containers, Ingo Molnar, arnd, linux-kernel Quoting Cedric Le Goater (clg@fr.ibm.com): > > diff -puN ipc/mqueue.c~no-checkpointing-for-sockets ipc/mqueue.c > > --- linux-2.6.git/ipc/mqueue.c~no-checkpointing-for-sockets 2008-10-09 11:56:58.000000000 -0700 > > +++ linux-2.6.git-dave/ipc/mqueue.c 2008-10-09 11:56:58.000000000 -0700 > > @@ -14,6 +14,7 @@ > > */ > > > > #include <linux/capability.h> > > +#include <linux/checkpoint.h> > > #include <linux/init.h> > > #include <linux/pagemap.h> > > #include <linux/file.h> > > @@ -655,6 +656,8 @@ asmlinkage long sys_mq_open(const char _ > > char *name; > > int fd, error; > > > > + process_deny_checkpointing(current); > > + > > mqueue being a file system, i would put the checks in the inode_operations. > > Also, you can't always deny ! I would expect some allow in sys_mq_unlink(). Remember a part of Ingo's motivation is to push c/r developers to address the lacking features that users use most, earlier. So the warnings and subsequent email complaints are what we're after. Hence a single 'checkpointable or not' flag. Given the single flag, how do you know at sys_mq_unlink() whether the process also has an opensocket? Rather than make this tracking facility more complicated and intrusive, if people complain that they couldn't checkpoint bc of a warning about aio, then we implement aio c/r! We don't just try and reduce the amount of time that you can't checkpoint bc of lack of aio c/r support :) -serge ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 14:04 ` Serge E. Hallyn @ 2008-10-10 16:45 ` Greg Kurz 2008-10-10 17:13 ` Serge E. Hallyn 2008-10-10 17:28 ` Dave Hansen 0 siblings, 2 replies; 35+ messages in thread From: Greg Kurz @ 2008-10-10 16:45 UTC (permalink / raw) To: Serge E. Hallyn Cc: Cedric Le Goater, linux-kernel, containers, Ingo Molnar, arnd, Dave Hansen On Fri, 2008-10-10 at 09:04 -0500, Serge E. Hallyn wrote: > Remember a part of Ingo's motivation is to push c/r developers to > address the lacking features that users use most, earlier. So the > warnings and subsequent email complaints are what we're after. Hence a > single 'checkpointable or not' flag. > > Given the single flag, how do you know at sys_mq_unlink() whether the > process also has an opensocket? > > Rather than make this tracking facility more complicated and intrusive, > if people complain that they couldn't checkpoint bc of a warning about > aio, then we implement aio c/r! We don't just try and reduce the amount > of time that you can't checkpoint bc of lack of aio c/r support :) > > -serge Serge, It's exactly what I meant before, the tracking facility would be awfully complicated. It cannot be done that way. But there's also something awkward with the flag thing : can you provide right now an exhaustive list of all the places where you must raise it ? I'd rather do some heavy checking at checkpoint time. Greg. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 16:45 ` Greg Kurz @ 2008-10-10 17:13 ` Serge E. Hallyn 2008-10-10 17:28 ` Dave Hansen 1 sibling, 0 replies; 35+ messages in thread From: Serge E. Hallyn @ 2008-10-10 17:13 UTC (permalink / raw) To: Greg Kurz Cc: Cedric Le Goater, linux-kernel, containers, Ingo Molnar, arnd, Dave Hansen Quoting Greg Kurz (gkurz@fr.ibm.com): > On Fri, 2008-10-10 at 09:04 -0500, Serge E. Hallyn wrote: > > Remember a part of Ingo's motivation is to push c/r developers to > > address the lacking features that users use most, earlier. So the > > warnings and subsequent email complaints are what we're after. Hence a > > single 'checkpointable or not' flag. > > > > Given the single flag, how do you know at sys_mq_unlink() whether the > > process also has an opensocket? > > > > Rather than make this tracking facility more complicated and intrusive, > > if people complain that they couldn't checkpoint bc of a warning about > > aio, then we implement aio c/r! We don't just try and reduce the amount > > of time that you can't checkpoint bc of lack of aio c/r support :) > > > > -serge > > Serge, > > It's exactly what I meant before, the tracking facility would be awfully > complicated. It cannot be done that way. > But there's also something awkward with the flag thing : can you provide > right now an exhaustive list of all the places where you must raise it ? > > I'd rather do some heavy checking at checkpoint time. Noone is saying that we are not going to do that. -serge ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 16:45 ` Greg Kurz 2008-10-10 17:13 ` Serge E. Hallyn @ 2008-10-10 17:28 ` Dave Hansen 2008-10-13 8:20 ` Greg Kurz 1 sibling, 1 reply; 35+ messages in thread From: Dave Hansen @ 2008-10-10 17:28 UTC (permalink / raw) To: Greg Kurz Cc: Serge E. Hallyn, arnd, containers, linux-kernel, Cedric Le Goater, Ingo Molnar On Fri, 2008-10-10 at 18:45 +0200, Greg Kurz wrote: > It's exactly what I meant before, the tracking facility would be awfully > complicated. It cannot be done that way. > But there's also something awkward with the flag thing : can you provide > right now an exhaustive list of all the places where you must raise it ? Greg, that's just pure FUD. We don't say that spinlocks are a bad thing because we can't come up with an exhaustive list of places where we need locking. We'll do plenty of checks at checkpoint time. We'll do plenty of checks at runtime. Neither will work completely on its own, and neither will be exhaustive. The way this will work is just as Serge said: in true Linux style, we'll add more places users of process_deny_checkpoint() incrementally as we find them and as people complain. We'll also be incrementally removing them as we add functionality. -- Dave ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 2/2] first callers of process_deny_checkpoint() 2008-10-10 17:28 ` Dave Hansen @ 2008-10-13 8:20 ` Greg Kurz 0 siblings, 0 replies; 35+ messages in thread From: Greg Kurz @ 2008-10-13 8:20 UTC (permalink / raw) To: Dave Hansen Cc: Serge E. Hallyn, arnd, containers, linux-kernel, Cedric Le Goater, Ingo Molnar On Fri, 2008-10-10 at 10:28 -0700, Dave Hansen wrote: > On Fri, 2008-10-10 at 18:45 +0200, Greg Kurz wrote: > > It's exactly what I meant before, the tracking facility would be awfully > > complicated. It cannot be done that way. > > But there's also something awkward with the flag thing : can you provide > > right now an exhaustive list of all the places where you must raise it ? > > Greg, that's just pure FUD. We don't say that spinlocks are a bad thing > because we can't come up with an exhaustive list of places where we need > locking. > > We'll do plenty of checks at checkpoint time. > > We'll do plenty of checks at runtime. > > Neither will work completely on its own, and neither will be exhaustive. > The way this will work is just as Serge said: in true Linux style, we'll > add more places users of process_deny_checkpoint() incrementally as we > find them and as people complain. We'll also be incrementally removing > them as we add functionality. > > -- Dave > Well then I misunderstood the purpose your initial postings. Sorry. :) -- Gregory Kurz gkurz@fr.ibm.com Software Engineer @ IBM/Meiosys http://www.ibm.com Tel +33 (0)534 638 479 Fax +33 (0)561 400 420 "Anarchy is about taking complete responsibility for yourself." Alan Moore. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-09 19:04 [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work Dave Hansen 2008-10-09 19:04 ` [RFC][PATCH 2/2] first callers of process_deny_checkpoint() Dave Hansen @ 2008-10-10 8:20 ` Greg Kurz 2008-10-10 8:37 ` Daniel Lezcano 1 sibling, 1 reply; 35+ messages in thread From: Greg Kurz @ 2008-10-10 8:20 UTC (permalink / raw) To: Dave Hansen; +Cc: Oren Laadan, containers, Ingo Molnar, linux-kernel, arnd On Thu, 2008-10-09 at 12:04 -0700, Dave Hansen wrote: > Suggested by Ingo. > > Checkpoint/restart is going to be a long effort to get things working. > We're going to have a lot of things that we know just don't work for > a long time. That doesn't mean that it will be useless, it just means > that there's some complicated features that we are going to have to > work incrementally to fix. > > This patch introduces a new mechanism to help the checkpoint/restart > developers. A new function pair: task/process_deny_checkpoint() is > created. When called, these tell the kernel that we *know* that the > process has performed some activity that will keep it from being > properly checkpointed. > > The 'flag' is an atomic_t for now so that we can have some level > of atomicity and make sure to only warn once. > > For now, this is a one-way trip. Once a process is no longer > 'may_checkpoint' capable, neither it nor its children ever will be. > This can, of course, be fixed up in the future. We might want to > reset the flag when a new pid namespace is created, for instance. > Then this patch should be described as: Track in-kernel when we expect checkpoint/restart to fail. By the way, why don't you introduce the reverse operation ? > > Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> > --- > > linux-2.6.git-dave/include/linux/checkpoint.h | 31 +++++++++++++++++++++++++- > linux-2.6.git-dave/include/linux/sched.h | 3 ++ > linux-2.6.git-dave/kernel/fork.c | 10 ++++++++ > 3 files changed, 43 insertions(+), 1 deletion(-) > > diff -puN include/linux/checkpoint.h~flag include/linux/checkpoint.h > --- linux-2.6.git/include/linux/checkpoint.h~flag 2008-10-09 11:56:57.000000000 -0700 > +++ linux-2.6.git-dave/include/linux/checkpoint.h 2008-10-09 11:56:57.000000000 -0700 > @@ -10,8 +10,11 @@ > * distribution for more details. > */ > > -#include <linux/path.h> > #include <linux/fs.h> > +#include <linux/path.h> > +#include <linux/sched.h> > + > +#ifdef CONFIG_CHECKPOINT_RESTART > > #define CR_VERSION 2 > > @@ -94,4 +97,30 @@ extern void file_pos_write(struct file * > #define cr_debug(fmt, args...) \ > pr_debug("[CR:%s] " fmt, __func__, ## args) > > +static inline void __task_deny_checkpointing(struct task_struct *task, > + char *file, int line) > +{ > + if(!atomic_dec_and_test(&task->may_checkpoint)) > + return; > + printk(KERN_INFO "process performed an action that can not be " > + "checkpointed at: %s:%d\n", file, line); > + WARN_ON(1); > +} > +#define process_deny_checkpointing(p) __task_deny_checkpointing(p, __FILE__, __LINE__) > + > +/* > + * For now, we're not going to have a distinction between > + * tasks and processes for the purpose of c/r. But, allow > + * these two calls anyway to make new users at least think > + * about it. > + */ > +#define task_deny_checkpointing(p) __task_deny_checkpointing(p, __FILE__, __LINE__) > + > +#else > + > +static inline void task_deny_checkpointing(struct task_struct *task) {} > +static inline void process_deny_checkpointing(struct task_struct *task) {} > + > +#endif > + > #endif /* _CHECKPOINT_CKPT_H_ */ > diff -puN include/linux/sched.h~flag include/linux/sched.h > --- linux-2.6.git/include/linux/sched.h~flag 2008-10-09 11:56:57.000000000 -0700 > +++ linux-2.6.git-dave/include/linux/sched.h 2008-10-09 11:56:57.000000000 -0700 > @@ -1301,6 +1301,9 @@ struct task_struct { > int latency_record_count; > struct latency_record latency_record[LT_SAVECOUNT]; > #endif > +#ifdef CONFIG_CHECKPOINT_RESTART > + atomic_t may_checkpoint; > +#endif > }; > > /* > diff -puN kernel/fork.c~flag kernel/fork.c > --- linux-2.6.git/kernel/fork.c~flag 2008-10-09 11:56:57.000000000 -0700 > +++ linux-2.6.git-dave/kernel/fork.c 2008-10-09 11:56:57.000000000 -0700 > @@ -194,6 +194,13 @@ void __init fork_init(unsigned long memp > init_task.signal->rlim[RLIMIT_NPROC].rlim_max = max_threads/2; > init_task.signal->rlim[RLIMIT_SIGPENDING] = > init_task.signal->rlim[RLIMIT_NPROC]; > + > +#ifdef CONFIG_CHECKPOINT_RESTART > + /* > + * This probably won't stay set for long... > + */ > + atomic_set(&init_task.may_checkpoint, 1); > +#endif > } > > int __attribute__((weak)) arch_dup_task_struct(struct task_struct *dst, > @@ -244,6 +251,9 @@ static struct task_struct *dup_task_stru > tsk->btrace_seq = 0; > #endif > tsk->splice_pipe = NULL; > +#ifdef CONFIG_CHECKPOINT_RESTART > + atomic_set(&tsk->may_checkpoint, atomic_read(&orig->may_checkpoint)); > +#endif > return tsk; > > out: > _ > _______________________________________________ > Containers mailing list > Containers@lists.linux-foundation.org > https://lists.linux-foundation.org/mailman/listinfo/containers -- Gregory Kurz gkurz@fr.ibm.com Software Engineer @ IBM/Meiosys http://www.ibm.com Tel +33 (0)534 638 479 Fax +33 (0)561 400 420 "Anarchy is about taking complete responsibility for yourself." Alan Moore. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 8:20 ` [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work Greg Kurz @ 2008-10-10 8:37 ` Daniel Lezcano 2008-10-10 8:47 ` Greg Kurz ` (3 more replies) 0 siblings, 4 replies; 35+ messages in thread From: Daniel Lezcano @ 2008-10-10 8:37 UTC (permalink / raw) To: Greg Kurz; +Cc: Dave Hansen, containers, Ingo Molnar, arnd, linux-kernel Greg Kurz wrote: > On Thu, 2008-10-09 at 12:04 -0700, Dave Hansen wrote: >> Suggested by Ingo. >> >> Checkpoint/restart is going to be a long effort to get things working. >> We're going to have a lot of things that we know just don't work for >> a long time. That doesn't mean that it will be useless, it just means >> that there's some complicated features that we are going to have to >> work incrementally to fix. >> >> This patch introduces a new mechanism to help the checkpoint/restart >> developers. A new function pair: task/process_deny_checkpoint() is >> created. When called, these tell the kernel that we *know* that the >> process has performed some activity that will keep it from being >> properly checkpointed. >> >> The 'flag' is an atomic_t for now so that we can have some level >> of atomicity and make sure to only warn once. >> >> For now, this is a one-way trip. Once a process is no longer >> 'may_checkpoint' capable, neither it nor its children ever will be. >> This can, of course, be fixed up in the future. We might want to >> reset the flag when a new pid namespace is created, for instance. >> > > Then this patch should be described as: > > Track in-kernel when we expect checkpoint/restart to fail. > > By the way, why don't you introduce the reverse operation ? I think implementing the reverse operation will be a nightmare, IMHO it is safe to say we deny checkpointing for the process life-cycle either if the created resource was destroyed before we initiate the checkpoint. For example, you create a socket, the process becomes uncheckpointable, you close (via sys_close) the socket, you have to track this close to be related to the socket which made the process uncheckpointable in order to make the operation reversible. Let's imagine you implement this reverse operation anyway, you have a process which creates a TCP connection, writes data and close the socket (so you are again checkpointable), but in the namespace there is the orphan socket which is not checkpointable yet and you missed this case. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 8:37 ` Daniel Lezcano @ 2008-10-10 8:47 ` Greg Kurz 2008-10-10 10:11 ` Oren Laadan ` (2 subsequent siblings) 3 siblings, 0 replies; 35+ messages in thread From: Greg Kurz @ 2008-10-10 8:47 UTC (permalink / raw) To: Daniel Lezcano; +Cc: Dave Hansen, containers, Ingo Molnar, arnd, linux-kernel On Fri, 2008-10-10 at 10:37 +0200, Daniel Lezcano wrote: > I think implementing the reverse operation will be a nightmare, IMHO it > is safe to say we deny checkpointing for the process life-cycle either > if the created resource was destroyed before we initiate the checkpoint. > > For example, you create a socket, the process becomes uncheckpointable, > you close (via sys_close) the socket, you have to track this close to be > related to the socket which made the process uncheckpointable in order > to make the operation reversible. > > Let's imagine you implement this reverse operation anyway, you have a > process which creates a TCP connection, writes data and close the socket > (so you are again checkpointable), but in the namespace there is the > orphan socket which is not checkpointable yet and you missed this case. That's exactly what I wanted to read... Tracking only is inherently flawed. The valid way IMHO implies checks at checkpoint time. -- Gregory Kurz gkurz@fr.ibm.com Software Engineer @ IBM/Meiosys http://www.ibm.com Tel +33 (0)534 638 479 Fax +33 (0)561 400 420 "Anarchy is about taking complete responsibility for yourself." Alan Moore. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 8:37 ` Daniel Lezcano 2008-10-10 8:47 ` Greg Kurz @ 2008-10-10 10:11 ` Oren Laadan 2008-10-10 14:59 ` Ingo Molnar 2008-10-10 16:33 ` Dave Hansen 3 siblings, 0 replies; 35+ messages in thread From: Oren Laadan @ 2008-10-10 10:11 UTC (permalink / raw) To: Daniel Lezcano Cc: Greg Kurz, linux-kernel, containers, Ingo Molnar, arnd, Dave Hansen Daniel Lezcano wrote: > Greg Kurz wrote: >> On Thu, 2008-10-09 at 12:04 -0700, Dave Hansen wrote: >>> Suggested by Ingo. >>> >>> Checkpoint/restart is going to be a long effort to get things working. >>> We're going to have a lot of things that we know just don't work for >>> a long time. That doesn't mean that it will be useless, it just means >>> that there's some complicated features that we are going to have to >>> work incrementally to fix. >>> >>> This patch introduces a new mechanism to help the checkpoint/restart >>> developers. A new function pair: task/process_deny_checkpoint() is >>> created. When called, these tell the kernel that we *know* that the >>> process has performed some activity that will keep it from being >>> properly checkpointed. >>> >>> The 'flag' is an atomic_t for now so that we can have some level >>> of atomicity and make sure to only warn once. >>> >>> For now, this is a one-way trip. Once a process is no longer >>> 'may_checkpoint' capable, neither it nor its children ever will be. >>> This can, of course, be fixed up in the future. We might want to >>> reset the flag when a new pid namespace is created, for instance. >>> >> Then this patch should be described as: >> >> Track in-kernel when we expect checkpoint/restart to fail. >> >> By the way, why don't you introduce the reverse operation ? > > I think implementing the reverse operation will be a nightmare, IMHO it > is safe to say we deny checkpointing for the process life-cycle either > if the created resource was destroyed before we initiate the checkpoint. > > For example, you create a socket, the process becomes uncheckpointable, > you close (via sys_close) the socket, you have to track this close to be > related to the socket which made the process uncheckpointable in order > to make the operation reversible. I agree that it makes sense to only track transitions in one direction. Therefore at any given point in time all we'll know is that the process "may be non-checkpointable", instead of the clear-cut "uncheckpointable" (webster anyone ?). The distinction is important, because it may be that the process is, after all, checkpointable, so users/developers could still try to perform a checkpoint, should they wish too. The only thing is that it is not guaranteed to succeed. In fact, one way to transition back to the "checkpointable" state is by doing a dry-checkpoint, where no data is saved (/dev/null ?). No side effects will occur except for a short downtime due to the freeze period. If the dry-checkpoint completes successfully - we can reset the non-/un-/not-/a-/dis-checkpointable flag. > > Let's imagine you implement this reverse operation anyway, you have a > process which creates a TCP connection, writes data and close the socket > (so you are again checkpointable), but in the namespace there is the > orphan socket which is not checkpointable yet and you missed this case. > _______________________________________________ > Containers mailing list > Containers@lists.linux-foundation.org > https://lists.linux-foundation.org/mailman/listinfo/containers Oren. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 8:37 ` Daniel Lezcano 2008-10-10 8:47 ` Greg Kurz 2008-10-10 10:11 ` Oren Laadan @ 2008-10-10 14:59 ` Ingo Molnar 2008-10-10 15:17 ` Oren Laadan 2008-10-10 16:33 ` Dave Hansen 3 siblings, 1 reply; 35+ messages in thread From: Ingo Molnar @ 2008-10-10 14:59 UTC (permalink / raw) To: Daniel Lezcano; +Cc: Greg Kurz, Dave Hansen, containers, arnd, linux-kernel * Daniel Lezcano <dlezcano@fr.ibm.com> wrote: >> By the way, why don't you introduce the reverse operation ? > > I think implementing the reverse operation will be a nightmare, IMHO > it is safe to say we deny checkpointing for the process life-cycle > either if the created resource was destroyed before we initiate the > checkpoint. it's also a not too interesting case. The end goal is to just be able to checkpoint everything that matters - in the long run there simply wont be many places that are marked 'cannot checkpoint'. So the ability to deny a checkpoint is a transitional feature - a flexible CR todo list in essence - but also needed for applications/users that want to rely on CR being a dependable facility. It would be bad for most of the practical usecases of checkpointing to allow the checkpointing of an app, just to see it break on restore due to lost context. Ingo ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 14:59 ` Ingo Molnar @ 2008-10-10 15:17 ` Oren Laadan 2008-10-10 15:28 ` Ingo Molnar 2008-10-10 16:34 ` Greg Kurz 0 siblings, 2 replies; 35+ messages in thread From: Oren Laadan @ 2008-10-10 15:17 UTC (permalink / raw) To: Ingo Molnar; +Cc: Daniel Lezcano, containers, linux-kernel, arnd, Dave Hansen Ingo Molnar wrote: > * Daniel Lezcano <dlezcano@fr.ibm.com> wrote: > >>> By the way, why don't you introduce the reverse operation ? >> I think implementing the reverse operation will be a nightmare, IMHO >> it is safe to say we deny checkpointing for the process life-cycle >> either if the created resource was destroyed before we initiate the >> checkpoint. > > it's also a not too interesting case. The end goal is to just be able to > checkpoint everything that matters - in the long run there simply wont > be many places that are marked 'cannot checkpoint'. > > So the ability to deny a checkpoint is a transitional feature - a > flexible CR todo list in essence - but also needed for > applications/users that want to rely on CR being a dependable facility. > > It would be bad for most of the practical usecases of checkpointing to > allow the checkpointing of an app, just to see it break on restore due > to lost context. Actually it need not wait for restore to fail - it can fail during the checkpoint, as soon as the unsupported feature is encountered. Adding that flag of what you suggest will help make it more vocal and obvious that a feature isn't supported, even without the user actually trying to take a checkpoint. I like that I idea. Oren. > > Ingo > _______________________________________________ > Containers mailing list > Containers@lists.linux-foundation.org > https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 15:17 ` Oren Laadan @ 2008-10-10 15:28 ` Ingo Molnar 2008-10-10 16:34 ` Greg Kurz 1 sibling, 0 replies; 35+ messages in thread From: Ingo Molnar @ 2008-10-10 15:28 UTC (permalink / raw) To: Oren Laadan; +Cc: Daniel Lezcano, containers, linux-kernel, arnd, Dave Hansen * Oren Laadan <orenl@cs.columbia.edu> wrote: > > > Ingo Molnar wrote: > > * Daniel Lezcano <dlezcano@fr.ibm.com> wrote: > > > >>> By the way, why don't you introduce the reverse operation ? > >> I think implementing the reverse operation will be a nightmare, IMHO > >> it is safe to say we deny checkpointing for the process life-cycle > >> either if the created resource was destroyed before we initiate the > >> checkpoint. > > > > it's also a not too interesting case. The end goal is to just be able to > > checkpoint everything that matters - in the long run there simply wont > > be many places that are marked 'cannot checkpoint'. > > > > So the ability to deny a checkpoint is a transitional feature - a > > flexible CR todo list in essence - but also needed for > > applications/users that want to rely on CR being a dependable facility. > > > > It would be bad for most of the practical usecases of checkpointing to > > allow the checkpointing of an app, just to see it break on restore due > > to lost context. > > Actually it need not wait for restore to fail - it can fail during the > checkpoint, as soon as the unsupported feature is encountered. correct, that is the idea. Ingo ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 15:17 ` Oren Laadan 2008-10-10 15:28 ` Ingo Molnar @ 2008-10-10 16:34 ` Greg Kurz 2008-10-10 16:36 ` Dave Hansen 2008-10-10 17:18 ` Chris Friesen 1 sibling, 2 replies; 35+ messages in thread From: Greg Kurz @ 2008-10-10 16:34 UTC (permalink / raw) To: Oren Laadan Cc: Ingo Molnar, Daniel Lezcano, Dave Hansen, containers, arnd, linux-kernel On Fri, 2008-10-10 at 11:17 -0400, Oren Laadan wrote: > > Ingo Molnar wrote: > > * Daniel Lezcano <dlezcano@fr.ibm.com> wrote: > > > >>> By the way, why don't you introduce the reverse operation ? > >> I think implementing the reverse operation will be a nightmare, IMHO > >> it is safe to say we deny checkpointing for the process life-cycle > >> either if the created resource was destroyed before we initiate the > >> checkpoint. > > > > it's also a not too interesting case. The end goal is to just be able to > > checkpoint everything that matters - in the long run there simply wont > > be many places that are marked 'cannot checkpoint'. > > > > So the ability to deny a checkpoint is a transitional feature - a > > flexible CR todo list in essence - but also needed for > > applications/users that want to rely on CR being a dependable facility. > > > > It would be bad for most of the practical usecases of checkpointing to > > allow the checkpointing of an app, just to see it break on restore due > > to lost context. > > Actually it need not wait for restore to fail - it can fail during the > checkpoint, as soon as the unsupported feature is encountered. > Of course, bad things must be spotted at checkpoint time ! :) > Adding that flag of what you suggest will help make it more vocal and > obvious that a feature isn't supported, even without the user actually > trying to take a checkpoint. I like that I idea. > This flag is weak... testing it gives absolutly no hint whether the checkpoint may succeed or not. As it is designed now, a user can only be aware that checkpoint is *forever* denied. I agree that it's only useful as a "flexible CR todo list". In the long run, if there are still things that can prevent checkpoint from being consistent, they will have to be checked at checkpoint time. Greg. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 16:34 ` Greg Kurz @ 2008-10-10 16:36 ` Dave Hansen 2008-10-10 20:57 ` Daniel Lezcano 2008-10-10 17:18 ` Chris Friesen 1 sibling, 1 reply; 35+ messages in thread From: Dave Hansen @ 2008-10-10 16:36 UTC (permalink / raw) To: Greg Kurz Cc: Oren Laadan, Ingo Molnar, Daniel Lezcano, containers, arnd, linux-kernel On Fri, 2008-10-10 at 18:34 +0200, Greg Kurz wrote: > This flag is weak... testing it gives absolutly no hint whether the > checkpoint may succeed or not. As it is designed now, a user can only be > aware that checkpoint is *forever* denied. I agree that it's only useful > as a "flexible CR todo list". Cool, so everyone agrees the patch is useful! -- Dave ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 16:36 ` Dave Hansen @ 2008-10-10 20:57 ` Daniel Lezcano 0 siblings, 0 replies; 35+ messages in thread From: Daniel Lezcano @ 2008-10-10 20:57 UTC (permalink / raw) To: Dave Hansen Cc: Greg Kurz, Oren Laadan, Ingo Molnar, containers, arnd, linux-kernel Dave Hansen wrote: > On Fri, 2008-10-10 at 18:34 +0200, Greg Kurz wrote: >> This flag is weak... testing it gives absolutly no hint whether the >> checkpoint may succeed or not. As it is designed now, a user can only be >> aware that checkpoint is *forever* denied. I agree that it's only useful >> as a "flexible CR todo list". > > Cool, so everyone agrees the patch is useful! Yes, definitively. It is a good idea. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 16:34 ` Greg Kurz 2008-10-10 16:36 ` Dave Hansen @ 2008-10-10 17:18 ` Chris Friesen 2008-10-13 8:18 ` Greg Kurz 1 sibling, 1 reply; 35+ messages in thread From: Chris Friesen @ 2008-10-10 17:18 UTC (permalink / raw) To: Greg Kurz Cc: Oren Laadan, Ingo Molnar, Daniel Lezcano, Dave Hansen, containers, arnd, linux-kernel Greg Kurz wrote: > This flag is weak... testing it gives absolutly no hint whether the > checkpoint may succeed or not. As it is designed now, a user can only be > aware that checkpoint is *forever* denied. I agree that it's only useful > as a "flexible CR todo list". I don't think it's true that it gives "absolutly no hint". If the flag is not set, then checkpoint will succeed, right? Whereas if the flag is set, then it's an indication that checkpoint could fail (but may still succeed if whatever condition caused the flag to be set is no longer true). Chris ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 17:18 ` Chris Friesen @ 2008-10-13 8:18 ` Greg Kurz 2008-10-13 16:46 ` Serge E. Hallyn 0 siblings, 1 reply; 35+ messages in thread From: Greg Kurz @ 2008-10-13 8:18 UTC (permalink / raw) To: Chris Friesen Cc: Oren Laadan, Ingo Molnar, Daniel Lezcano, Dave Hansen, containers, arnd, linux-kernel On Fri, 2008-10-10 at 11:18 -0600, Chris Friesen wrote: > Greg Kurz wrote: > > > This flag is weak... testing it gives absolutly no hint whether the > > checkpoint may succeed or not. As it is designed now, a user can only be > > aware that checkpoint is *forever* denied. I agree that it's only useful > > as a "flexible CR todo list". > > I don't think it's true that it gives "absolutly no hint". > > If the flag is not set, then checkpoint will succeed, right? Whereas if Wrong. Unless you test_and_checkpoint atomically, the flag doesn't help. > the flag is set, then it's an indication that checkpoint could fail (but > may still succeed if whatever condition caused the flag to be set is no > longer true). > > Chris > -- Gregory Kurz gkurz@fr.ibm.com Software Engineer @ IBM/Meiosys http://www.ibm.com Tel +33 (0)534 638 479 Fax +33 (0)561 400 420 "Anarchy is about taking complete responsibility for yourself." Alan Moore. ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-13 8:18 ` Greg Kurz @ 2008-10-13 16:46 ` Serge E. Hallyn 0 siblings, 0 replies; 35+ messages in thread From: Serge E. Hallyn @ 2008-10-13 16:46 UTC (permalink / raw) To: Greg Kurz Cc: Chris Friesen, arnd, containers, linux-kernel, Dave Hansen, Ingo Molnar, Daniel Lezcano Quoting Greg Kurz (gkurz@fr.ibm.com): > On Fri, 2008-10-10 at 11:18 -0600, Chris Friesen wrote: > > Greg Kurz wrote: > > > > > This flag is weak... testing it gives absolutly no hint whether the > > > checkpoint may succeed or not. As it is designed now, a user can only be > > > aware that checkpoint is *forever* denied. I agree that it's only useful > > > as a "flexible CR todo list". > > > > I don't think it's true that it gives "absolutly no hint". > > > > If the flag is not set, then checkpoint will succeed, right? Whereas if > > Wrong. Unless you test_and_checkpoint atomically, the flag doesn't help. Atomically wrt what? Presumably you test and checkpoint while the container is frozen... > > the flag is set, then it's an indication that checkpoint could fail (but > > may still succeed if whatever condition caused the flag to be set is no > > longer true). > > > > Chris > > > -- > Gregory Kurz gkurz@fr.ibm.com > Software Engineer @ IBM/Meiosys http://www.ibm.com > Tel +33 (0)534 638 479 Fax +33 (0)561 400 420 > > "Anarchy is about taking complete responsibility for yourself." > Alan Moore. > > _______________________________________________ > Containers mailing list > Containers@lists.linux-foundation.org > https://lists.linux-foundation.org/mailman/listinfo/containers ^ permalink raw reply [flat|nested] 35+ messages in thread
* Re: [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work 2008-10-10 8:37 ` Daniel Lezcano ` (2 preceding siblings ...) 2008-10-10 14:59 ` Ingo Molnar @ 2008-10-10 16:33 ` Dave Hansen 3 siblings, 0 replies; 35+ messages in thread From: Dave Hansen @ 2008-10-10 16:33 UTC (permalink / raw) To: Daniel Lezcano; +Cc: Greg Kurz, linux-kernel, containers, Ingo Molnar, arnd On Fri, 2008-10-10 at 10:37 +0200, Daniel Lezcano wrote: > For example, you create a socket, the process becomes uncheckpointable, > you close (via sys_close) the socket, you have to track this close to be > related to the socket which made the process uncheckpointable in order > to make the operation reversible. Challenging, yes. Not quite a nightmare, though. It's basically the same problem we have with r/o bind mounts individual users in the kernel need to check some global state. There are temporary (like the duration of the syscall) and more chronic (like your sockets) users. The hard part is making sure that everything gets properly balanced with the chronic ones. But, we did manage to accomplish that with the r/o bind mounts. I haven't seen a bug in *hours*! ;) In any case, we can worry about that later. -- Dave ^ permalink raw reply [flat|nested] 35+ messages in thread
end of thread, other threads:[~2008-10-13 16:46 UTC | newest] Thread overview: 35+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-10-09 19:04 [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work Dave Hansen 2008-10-09 19:04 ` [RFC][PATCH 2/2] first callers of process_deny_checkpoint() Dave Hansen 2008-10-09 19:43 ` Serge E. Hallyn 2008-10-09 20:54 ` Dave Hansen 2008-10-10 8:46 ` Ingo Molnar 2008-10-10 13:17 ` Rafael J. Wysocki 2008-10-10 14:54 ` Ingo Molnar 2008-10-10 19:53 ` Rafael J. Wysocki 2008-10-10 19:53 ` Ingo Molnar 2008-10-10 20:40 ` Len Brown 2008-10-10 22:57 ` Rafael J. Wysocki 2008-10-11 13:48 ` Pavel Machek 2008-10-11 15:00 ` Ingo Molnar 2008-10-10 10:27 ` Cedric Le Goater 2008-10-10 8:41 ` Daniel Lezcano 2008-10-10 10:17 ` Cedric Le Goater 2008-10-10 14:04 ` Serge E. Hallyn 2008-10-10 16:45 ` Greg Kurz 2008-10-10 17:13 ` Serge E. Hallyn 2008-10-10 17:28 ` Dave Hansen 2008-10-13 8:20 ` Greg Kurz 2008-10-10 8:20 ` [RFC][PATCH 1/2] Track in-kernel when we expect checkpoint/restart to work Greg Kurz 2008-10-10 8:37 ` Daniel Lezcano 2008-10-10 8:47 ` Greg Kurz 2008-10-10 10:11 ` Oren Laadan 2008-10-10 14:59 ` Ingo Molnar 2008-10-10 15:17 ` Oren Laadan 2008-10-10 15:28 ` Ingo Molnar 2008-10-10 16:34 ` Greg Kurz 2008-10-10 16:36 ` Dave Hansen 2008-10-10 20:57 ` Daniel Lezcano 2008-10-10 17:18 ` Chris Friesen 2008-10-13 8:18 ` Greg Kurz 2008-10-13 16:46 ` Serge E. Hallyn 2008-10-10 16:33 ` Dave Hansen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox