All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] c/r: Add s390 support
@ 2009-02-25 18:12 Dan Smith
       [not found] ` <1235585529-806-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Dan Smith @ 2009-02-25 18:12 UTC (permalink / raw)
  To: containers-qjLDD68F18O7TbgM5vRIOg

This version includes changes to the s390 c/r patch as suggested by Dave
and Serge, and changes to the CR_COPY() patch suggested by Serge.  It also
adds a patch to pull the magic 3 in the core s390 code out to a constant.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/3] s390: Expose a constant for the number of words representing the CRs
@ 2009-02-25 18:12     ` Dan Smith
  0 siblings, 0 replies; 12+ messages in thread
From: Dan Smith @ 2009-02-25 18:12 UTC (permalink / raw)
  To: containers-qjLDD68F18O7TbgM5vRIOg; +Cc: linux-s390-u79uwXL29TY76Z2rM5mHXA

We need to use this value in the checkpoint/restart code and would like to
have a constant instead of a magic '3'.

Signed-off-by: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Cc: linux-s390-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
---
 arch/s390/include/asm/ptrace.h   |    2 ++
 arch/s390/kernel/compat_ptrace.h |    3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/s390/include/asm/ptrace.h b/arch/s390/include/asm/ptrace.h
index 5396f9f..d5b0781 100644
--- a/arch/s390/include/asm/ptrace.h
+++ b/arch/s390/include/asm/ptrace.h
@@ -172,6 +172,8 @@
 #define NUM_CRS		16
 #define NUM_ACRS	16
 
+#define NUM_CR_WORDS	3
+
 #define FPR_SIZE	8
 #define FPC_SIZE	4
 #define FPC_PAD_SIZE	4 /* gcc insists on aligning the fpregs */
diff --git a/arch/s390/kernel/compat_ptrace.h b/arch/s390/kernel/compat_ptrace.h
index a2be3a9..123dd66 100644
--- a/arch/s390/kernel/compat_ptrace.h
+++ b/arch/s390/kernel/compat_ptrace.h
@@ -1,10 +1,11 @@
 #ifndef _PTRACE32_H
 #define _PTRACE32_H
 
+#include <asm/ptrace.h>    /* needed for NUM_CR_WORDS */
 #include "compat_linux.h"  /* needed for psw_compat_t */
 
 typedef struct {
-	__u32 cr[3];
+	__u32 cr[NUM_CR_WORDS];
 } per_cr_words32;
 
 typedef struct {
-- 
1.6.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 1/3] s390: Expose a constant for the number of words representing the CRs
@ 2009-02-25 18:12     ` Dan Smith
  0 siblings, 0 replies; 12+ messages in thread
From: Dan Smith @ 2009-02-25 18:12 UTC (permalink / raw)
  To: linux-s390

We need to use this value in the checkpoint/restart code and would like to
have a constant instead of a magic '3'.

Signed-off-by: Dan Smith <danms@us.ibm.com>
Cc: linux-s390@vger.kernel.org
---
 arch/s390/include/asm/ptrace.h   |    2 ++
 arch/s390/kernel/compat_ptrace.h |    3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/arch/s390/include/asm/ptrace.h b/arch/s390/include/asm/ptrace.h
index 5396f9f..d5b0781 100644
--- a/arch/s390/include/asm/ptrace.h
+++ b/arch/s390/include/asm/ptrace.h
@@ -172,6 +172,8 @@
 #define NUM_CRS		16
 #define NUM_ACRS	16
 
+#define NUM_CR_WORDS	3
+
 #define FPR_SIZE	8
 #define FPC_SIZE	4
 #define FPC_PAD_SIZE	4 /* gcc insists on aligning the fpregs */
diff --git a/arch/s390/kernel/compat_ptrace.h b/arch/s390/kernel/compat_ptrace.h
index a2be3a9..123dd66 100644
--- a/arch/s390/kernel/compat_ptrace.h
+++ b/arch/s390/kernel/compat_ptrace.h
@@ -1,10 +1,11 @@
 #ifndef _PTRACE32_H
 #define _PTRACE32_H
 
+#include <asm/ptrace.h>    /* needed for NUM_CR_WORDS */
 #include "compat_linux.h"  /* needed for psw_compat_t */
 
 typedef struct {
-	__u32 cr[3];
+	__u32 cr[NUM_CR_WORDS];
 } per_cr_words32;
 
 typedef struct {
-- 
1.6.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/3] c/r: Add CR_COPY() macro (v2)
       [not found] ` <1235585529-806-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2009-02-25 18:12     ` Dan Smith
@ 2009-02-25 18:12   ` Dan Smith
       [not found]     ` <1235585529-806-3-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2009-02-25 18:12   ` [PATCH 3/3] c/r: define s390-specific checkpoint-restart code (v6) Dan Smith
  2 siblings, 1 reply; 12+ messages in thread
From: Dan Smith @ 2009-02-25 18:12 UTC (permalink / raw)
  To: containers-qjLDD68F18O7TbgM5vRIOg

As suggested by Dave, this provides us a way to make the copy-in and
copy-out processes symmetric.  I also added CR_COPY_BIT() to use with
the s390 bitfields, since we can't memcpy() those.

Changelog:
    Feb 25:
            . Changed WARN_ON() to BUILD_BUG_ON()

Signed-off-by: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
 include/linux/checkpoint.h |   20 ++++++++++++++++++++
 1 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/include/linux/checkpoint.h b/include/linux/checkpoint.h
index 217cf6e..3add90e 100644
--- a/include/linux/checkpoint.h
+++ b/include/linux/checkpoint.h
@@ -149,4 +149,24 @@ static inline void process_deny_checkpointing(struct task_struct *task) {}
 
 #endif
 
+#define CR_CPT 1
+#define CR_RST 2
+
+#define CR_COPY(op, a, b)				\
+	do {						\
+		BUILD_BUG_ON(sizeof(a) != sizeof(b));	\
+		if (op == CR_CPT)			\
+			memcpy(&a, &b, sizeof(a));	\
+		else					\
+			memcpy(&b, &a, sizeof(a));	\
+	} while (0);
+
+#define CR_COPY_BIT(op, a, b)				\
+	do {						\
+		if (op == CR_CPT)			\
+			a = b;				\
+		else					\
+			b = a;				\
+	} while (0);
+
 #endif /* _CHECKPOINT_CKPT_H_ */
-- 
1.6.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/3] c/r: define s390-specific checkpoint-restart code (v6)
       [not found] ` <1235585529-806-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2009-02-25 18:12     ` Dan Smith
  2009-02-25 18:12   ` [PATCH 2/3] c/r: Add CR_COPY() macro (v2) Dan Smith
@ 2009-02-25 18:12   ` Dan Smith
       [not found]     ` <1235585529-806-4-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
  2 siblings, 1 reply; 12+ messages in thread
From: Dan Smith @ 2009-02-25 18:12 UTC (permalink / raw)
  To: containers-qjLDD68F18O7TbgM5vRIOg

Implement the s390 arch-specific checkpoint/restart helpers.  This
is on top of Oren Laadan's c/r code.

With these, I am able to checkpoint and restart simple programs as per
Oren's patch intro.  While on x86 I never had to freeze a single task
to checkpoint it, on s390 I do need to.  That is a prereq for consistent
snapshots (esp with multiple processes) anyway so I don't see that as
a problem.

Changelog:
    Feb 25:
            . Make checkpoint_hdr.h safe for inclusion in userspace
            . Replace comment about vsdo code
            . Add comment about restoring access registers
            . Write and read an empty cr_hdr_head_arch record to appease
              code (mktree) that expects it to be there
            . Utilize NUM_CR_WORDS in checkpoint_hdr.h
    Feb 24:
            . Use CR_COPY() to unify the un/loading of cpu and mm state
            . Fix fprs definition in cr_hdr_cpu
            . Remove debug WARN_ON() from checkpoint.c
    Feb 23:
            . Macro-ize the un/packing of trace flags
            . Fix the crash when externally-linked
            . Break out the restart functions into restart.c
            . Remove unneeded s390_enable_sie() call
    Jan 30:
            . Switched types in cr_hdr_cpu to __u64 etc.
              (Per Oren suggestion)
            . Replaced direct inclusion of structs in
              cr_hdr_cpu with the struct members.
              (Per Oren suggestion)
            . Also ended up adding a bunch of new things
              into restart (mm_segment, ksp, etc) in vain
              attempt to get code using fpu to not segfault
              after restart.

Signed-off-by: Serge E. Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
Signed-off-by: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
---
 arch/s390/include/asm/checkpoint_hdr.h |   88 ++++++++++++++++++++++++
 arch/s390/include/asm/unistd.h         |    4 +-
 arch/s390/kernel/compat_wrapper.S      |   12 +++
 arch/s390/kernel/syscalls.S            |    2 +
 arch/s390/mm/Makefile                  |    1 +
 arch/s390/mm/checkpoint.c              |  118 ++++++++++++++++++++++++++++++++
 arch/s390/mm/restart.c                 |   85 +++++++++++++++++++++++
 checkpoint/Kconfig                     |    2 +-
 8 files changed, 310 insertions(+), 2 deletions(-)
 create mode 100644 arch/s390/include/asm/checkpoint_hdr.h
 create mode 100644 arch/s390/mm/checkpoint.c
 create mode 100644 arch/s390/mm/restart.c

diff --git a/arch/s390/include/asm/checkpoint_hdr.h b/arch/s390/include/asm/checkpoint_hdr.h
new file mode 100644
index 0000000..0a405c2
--- /dev/null
+++ b/arch/s390/include/asm/checkpoint_hdr.h
@@ -0,0 +1,88 @@
+#ifndef __ASM_S390_CKPT_HDR_H
+#define __ASM_S390_CKPT_HDR_H
+/*
+ *  Checkpoint/restart - architecture specific headers s/390
+ *
+ *  Copyright IBM Corp. 2009
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+#ifdef __KERNEL__
+#include <asm/processor.h>
+#else
+#include <sys/user.h>
+#endif
+
+#ifdef __s390x__
+
+/*
+ * Notes
+ * NUM_GPRS defined in <asm/ptrace.h> to be 16
+ * NUM_FPRS defined in <asm/ptrace.h> to be 16
+ * NUM_APRS defined in <asm/ptrace.h> to be 16
+ * NUM_CR_WORDS defined in <asm/ptrace.h> to be 3
+ */
+struct cr_hdr_cpu {
+	__u64 args[1];
+	__u64 gprs[NUM_GPRS];
+	__u64 orig_gpr2;
+	__u16 svcnr;
+	__u16 ilc;
+	__u32 acrs[NUM_ACRS];
+	__u64 ieee_instruction_pointer;
+
+	/* psw_t */
+	__u64 psw_t_mask;
+	__u64 psw_t_addr;
+
+	/* s390_fp_regs_t */
+	__u32 fpc;
+	union {
+		float f;
+		double d;
+		__u64 ui;
+		struct {
+			__u32 fp_hi;
+			__u32 fp_lo;
+		} fp;
+	} fprs[NUM_FPRS];
+
+	/* per_struct */
+	__u64 per_control_regs[NUM_CR_WORDS];
+	__u64 starting_addr;
+	__u64 ending_addr;
+	__u64 address;
+	__u16 perc_atmid;
+	__u8 access_id;
+	__u8 single_step;
+	__u8 instruction_fetch;
+};
+
+struct cr_hdr_mm_context {
+	unsigned long vdso_base;
+	int noexec;
+	int has_pgste;
+	int alloc_pgste;
+	unsigned long asce_bits;
+	unsigned long asce_limit;
+};
+
+struct cr_hdr_head_arch {
+};
+
+#ifdef __KERNEL__
+/* Functions for copying to/from the header structs */
+extern void cr_s390_regs(int op, struct cr_hdr_cpu *hh, struct task_struct *t);
+extern void cr_s390_mm(int op, struct cr_hdr_mm_context *hh,
+		       struct mm_struct *mm);
+#endif
+
+#endif /* __s390x__ */
+
+#endif /* __ASM_S390_CKPT_HDR__H */
diff --git a/arch/s390/include/asm/unistd.h b/arch/s390/include/asm/unistd.h
index c8ad350..ffe64a0 100644
--- a/arch/s390/include/asm/unistd.h
+++ b/arch/s390/include/asm/unistd.h
@@ -265,7 +265,9 @@
 #define __NR_pipe2		325
 #define __NR_dup3		326
 #define __NR_epoll_create1	327
-#define NR_syscalls 328
+#define __NR_checkpoint		328
+#define __NR_restart		329
+#define NR_syscalls 330
 
 /* 
  * There are some system calls that are not present on 64 bit, some
diff --git a/arch/s390/kernel/compat_wrapper.S b/arch/s390/kernel/compat_wrapper.S
index fc2c971..9546a81 100644
--- a/arch/s390/kernel/compat_wrapper.S
+++ b/arch/s390/kernel/compat_wrapper.S
@@ -1767,3 +1767,15 @@ sys_dup3_wrapper:
 sys_epoll_create1_wrapper:
 	lgfr	%r2,%r2			# int
 	jg	sys_epoll_create1	# branch to system call
+
+	.globl sys_checkpoint_wrapper
+sys_checkpoint_wrapper:
+	lgfr	%r2,%r2			# pid_t
+	lgfr	%r3,%r3			# int
+	llgfr	%r4,%r4			# unsigned long
+
+	.globl sys_restart_wrapper
+sys_restart_wrapper:
+	lgfr	%r2,%r2			# int
+	lgfr	%r3,%r3			# int
+	llgfr	%r4,%r4			# unsigned long
diff --git a/arch/s390/kernel/syscalls.S b/arch/s390/kernel/syscalls.S
index 2d61787..54316c8 100644
--- a/arch/s390/kernel/syscalls.S
+++ b/arch/s390/kernel/syscalls.S
@@ -336,3 +336,5 @@ SYSCALL(sys_inotify_init1,sys_inotify_init1,sys_inotify_init1_wrapper)
 SYSCALL(sys_pipe2,sys_pipe2,sys_pipe2_wrapper) /* 325 */
 SYSCALL(sys_dup3,sys_dup3,sys_dup3_wrapper)
 SYSCALL(sys_epoll_create1,sys_epoll_create1,sys_epoll_create1_wrapper)
+SYSCALL(sys_checkpoint,sys_checkpoint,sys_checkpoint_wrapper)
+SYSCALL(sys_restart,sys_restart,sys_restart_wrapper)
diff --git a/arch/s390/mm/Makefile b/arch/s390/mm/Makefile
index 2a74581..040fbb7 100644
--- a/arch/s390/mm/Makefile
+++ b/arch/s390/mm/Makefile
@@ -6,3 +6,4 @@ obj-y	 := init.o fault.o extmem.o mmap.o vmem.o pgtable.o
 obj-$(CONFIG_CMM) += cmm.o
 obj-$(CONFIG_HUGETLB_PAGE) += hugetlbpage.o
 obj-$(CONFIG_PAGE_STATES) += page-states.o
+obj-$(CONFIG_CHECKPOINT_RESTART) += checkpoint.o restart.o
diff --git a/arch/s390/mm/checkpoint.c b/arch/s390/mm/checkpoint.c
new file mode 100644
index 0000000..09932f1
--- /dev/null
+++ b/arch/s390/mm/checkpoint.c
@@ -0,0 +1,118 @@
+/*
+ *  Checkpoint/restart - architecture specific support for s390
+ *
+ *  Copyright IBM Corp. 2009
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/checkpoint.h>
+#include <linux/checkpoint_hdr.h>
+#include <linux/kernel.h>
+#include <asm/system.h>
+#include <asm/pgtable.h>
+
+void cr_s390_regs(int op, struct cr_hdr_cpu *hh, struct task_struct *t)
+{
+	struct pt_regs *regs = task_pt_regs(t);
+	struct thread_struct *thr = &t->thread;
+
+	CR_COPY(op, hh->fpc, thr->fp_regs.fpc);
+	CR_COPY(op, hh->fprs, thr->fp_regs.fprs);
+	CR_COPY(op, hh->acrs, thr->acrs);
+	CR_COPY(op, hh->psw_t_mask, regs->psw.mask);
+	CR_COPY(op, hh->psw_t_addr, regs->psw.addr);
+	CR_COPY(op, hh->args, regs->args);
+	CR_COPY(op, hh->svcnr, regs->svcnr);
+	CR_COPY(op, hh->ilc, regs->ilc);
+	CR_COPY(op, hh->gprs, regs->gprs);
+	CR_COPY(op, hh->orig_gpr2, regs->orig_gpr2);
+	CR_COPY(op, hh->per_control_regs, thr->per_info.control_regs.words);
+	CR_COPY(op, hh->starting_addr, thr->per_info.starting_addr);
+	CR_COPY(op, hh->ending_addr, thr->per_info.ending_addr);
+	CR_COPY(op, hh->perc_atmid, thr->per_info.lowcore.words.perc_atmid);
+	CR_COPY(op, hh->address, thr->per_info.lowcore.words.address);
+	CR_COPY(op, hh->access_id, thr->per_info.lowcore.words.access_id);
+	CR_COPY(op, hh->ieee_instruction_pointer,
+		thr->ieee_instruction_pointer);
+
+	CR_COPY_BIT(op, hh->single_step, thr->per_info.single_step);
+	CR_COPY_BIT(op, hh->instruction_fetch,
+		    thr->per_info.instruction_fetch);
+}
+
+void cr_s390_mm(int op, struct cr_hdr_mm_context *hh, struct mm_struct *mm)
+{
+#if 0
+	/* Oren's v13 is on an older kernel which has no vdso_base
+	 * on newer kernel, we'll have to enable this
+	 */
+	CR_COPY(op, hh->vdso_base, mm->context.vdso_base);
+#endif
+	CR_COPY(op, hh->noexec, mm->context.noexec);
+	CR_COPY(op, hh->has_pgste, mm->context.has_pgste);
+	CR_COPY(op, hh->alloc_pgste, mm->context.alloc_pgste);
+	CR_COPY(op, hh->asce_bits, mm->context.asce_bits);
+	CR_COPY(op, hh->asce_limit, mm->context.asce_limit);
+}
+
+int cr_write_thread(struct cr_ctx *ctx, struct task_struct *t)
+{
+	return 0;
+}
+
+/* dump the cpu state and registers of a given task */
+int cr_write_cpu(struct cr_ctx *ctx, struct task_struct *t)
+{
+	struct cr_hdr h;
+	struct cr_hdr_cpu *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int ret;
+
+	h.type = CR_HDR_CPU;
+	h.len = sizeof(*hh);
+	h.parent = task_pid_vnr(t);
+
+	cr_s390_regs(CR_CPT, hh, t);
+
+	ret = cr_write_obj(ctx, &h, hh);
+	cr_hbuf_put(ctx, sizeof(*hh));
+
+	return ret;
+}
+
+/* Write an empty header since it is assumed to be there */
+int cr_write_head_arch(struct cr_ctx *ctx)
+{
+	struct cr_hdr h;
+	struct cr_hdr_head_arch *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int ret;
+
+	h.type = CR_HDR_HEAD_ARCH;
+	h.len = sizeof(*hh);
+	h.parent = 0;
+
+	ret = cr_write_obj(ctx, &h, &hh);
+	cr_hbuf_put(ctx, sizeof(*hh));
+
+	return ret;
+}
+
+int cr_write_mm_context(struct cr_ctx *ctx, struct mm_struct *mm, int parent)
+{
+	struct cr_hdr h;
+	struct cr_hdr_mm_context *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int ret;
+
+	h.type = CR_HDR_MM_CONTEXT;
+	h.len = sizeof(*hh);
+	h.parent = parent;
+
+	cr_s390_mm(CR_CPT, hh, mm);
+
+	ret = cr_write_obj(ctx, &h, hh);
+	cr_hbuf_put(ctx, sizeof(*hh));
+
+	return ret;
+}
diff --git a/arch/s390/mm/restart.c b/arch/s390/mm/restart.c
new file mode 100644
index 0000000..18229f1
--- /dev/null
+++ b/arch/s390/mm/restart.c
@@ -0,0 +1,85 @@
+/*
+ *  Checkpoint/restart - architecture specific support for s390
+ *
+ *  Copyright IBM Corp. 2009
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/checkpoint.h>
+#include <linux/checkpoint_hdr.h>
+#include <linux/kernel.h>
+#include <asm/system.h>
+#include <asm/pgtable.h>
+
+extern void cr_s390_regs(int op, struct cr_hdr_cpu *hh, struct task_struct *t);
+extern void cr_s390_mm(int op, struct cr_hdr_mm_context *hh,
+		       struct mm_struct *mm);
+
+int cr_read_thread(struct cr_ctx *ctx)
+{
+	return 0;
+}
+
+int cr_read_cpu(struct cr_ctx *ctx)
+{
+	struct cr_hdr_cpu *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	struct pt_regs *regs = task_pt_regs(current);
+	int parent, ret;
+
+	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_CPU);
+	if  (parent < 0) {
+		ret = parent;
+		goto out;
+	}
+	ret = 0;
+
+	regs->psw.addr &= ~PSW_ADDR_INSN;
+	cr_s390_regs(CR_RST, hh, current);
+
+	/* s390 does not restore the access registers after a syscall,
+	 * but does on a task switch.  Since we're switching tasks (in
+	 * a way), we need to replicate that behavior here.
+	 */
+	restore_access_regs(hh->acrs);
+out:
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
+
+int cr_read_head_arch(struct cr_ctx *ctx)
+{
+	struct cr_hdr_head_arch *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int parent, ret = 0;
+
+	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_HEAD_ARCH);
+	if (parent < 0)
+		ret = parent;
+
+	cr_hbuf_put(ctx, sizeof(*hh));
+
+	return ret;
+}
+
+
+int cr_read_mm_context(struct cr_ctx *ctx, struct mm_struct *mm, int rparent)
+{
+	struct cr_hdr_mm_context *hh = cr_hbuf_get(ctx, sizeof(*hh));
+	int parent, ret = -EINVAL;
+
+	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_MM_CONTEXT);
+	if (parent < 0) {
+		ret = parent;
+		goto out;
+	}
+	if (parent != rparent)
+		goto out;
+
+	cr_s390_mm(CR_RST, hh, mm);
+	ret = 0;
+ out:
+	cr_hbuf_put(ctx, sizeof(*hh));
+	return ret;
+}
diff --git a/checkpoint/Kconfig b/checkpoint/Kconfig
index ffaa635..cb1d29d 100644
--- a/checkpoint/Kconfig
+++ b/checkpoint/Kconfig
@@ -1,7 +1,7 @@
 config CHECKPOINT_RESTART
 	prompt "Enable checkpoint/restart (EXPERIMENTAL)"
 	def_bool n
-	depends on X86_32 && EXPERIMENTAL
+	depends on (X86_32 || (S390 && 64BIT)) && EXPERIMENTAL
 	help
 	  Application checkpoint/restart is the ability to save the
 	  state of a running application so that it can later resume
-- 
1.6.1

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] c/r: Add CR_COPY() macro (v2)
       [not found]     ` <1235585529-806-3-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2009-02-25 22:08       ` Nathan Lynch
       [not found]         ` <20090225160841.4d727144-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
  2009-02-25 22:23         ` Dan Smith
  0 siblings, 2 replies; 12+ messages in thread
From: Nathan Lynch @ 2009-02-25 22:08 UTC (permalink / raw)
  To: Dan Smith; +Cc: containers-qjLDD68F18O7TbgM5vRIOg

On Wed, 25 Feb 2009 13:12:08 -0500
Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:

> As suggested by Dave, this provides us a way to make the copy-in and
> copy-out processes symmetric.  I also added CR_COPY_BIT() to use with
> the s390 bitfields, since we can't memcpy() those.
> 
> Changelog:
>     Feb 25:
>             . Changed WARN_ON() to BUILD_BUG_ON()
> 
> Signed-off-by: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> ---
>  include/linux/checkpoint.h |   20 ++++++++++++++++++++
>  1 files changed, 20 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/checkpoint.h b/include/linux/checkpoint.h
> index 217cf6e..3add90e 100644
> --- a/include/linux/checkpoint.h
> +++ b/include/linux/checkpoint.h
> @@ -149,4 +149,24 @@ static inline void process_deny_checkpointing(struct task_struct *task) {}
>  
>  #endif
>  
> +#define CR_CPT 1
> +#define CR_RST 2
> +
> +#define CR_COPY(op, a, b)				\
> +	do {						\
> +		BUILD_BUG_ON(sizeof(a) != sizeof(b));	\
> +		if (op == CR_CPT)			\
> +			memcpy(&a, &b, sizeof(a));	\
> +		else					\
> +			memcpy(&b, &a, sizeof(a));	\
> +	} while (0);
> +
> +#define CR_COPY_BIT(op, a, b)				\
> +	do {						\
> +		if (op == CR_CPT)			\
> +			a = b;				\
> +		else					\
> +			b = a;				\
> +	} while (0);
> +
>  #endif /* _CHECKPOINT_CKPT_H_ */

No typechecking.  Multiple expansion of arguments (probably harmless
for the current use case, but still).  Generates a memcpy where,
depending on the arguments, simple assignment would be sufficient and
preferred.  Not useful for targets where fields in the checkpoint image
are larger than the register state they reflect (32-bit variants of x86
and powerpc).

Anyway, checkpoint and restart should not be "symmetric" -- the restart
code has to validate certain values, such as privileged registers, in
the image before committing them.

And while these macros may reduce initial development effort, the
reading effort incurred is non-trivial.  It doesn't seem worth it.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] c/r: Add CR_COPY() macro (v2)
       [not found]         ` <20090225160841.4d727144-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
@ 2009-02-25 22:21           ` Serge E. Hallyn
  0 siblings, 0 replies; 12+ messages in thread
From: Serge E. Hallyn @ 2009-02-25 22:21 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: containers-qjLDD68F18O7TbgM5vRIOg, Dan Smith

Quoting Nathan Lynch (ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org):
> On Wed, 25 Feb 2009 13:12:08 -0500
> Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> 
> > As suggested by Dave, this provides us a way to make the copy-in and
> > copy-out processes symmetric.  I also added CR_COPY_BIT() to use with
> > the s390 bitfields, since we can't memcpy() those.
> > 
> > Changelog:
> >     Feb 25:
> >             . Changed WARN_ON() to BUILD_BUG_ON()
> > 
> > Signed-off-by: Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> > ---
> >  include/linux/checkpoint.h |   20 ++++++++++++++++++++
> >  1 files changed, 20 insertions(+), 0 deletions(-)
> > 
> > diff --git a/include/linux/checkpoint.h b/include/linux/checkpoint.h
> > index 217cf6e..3add90e 100644
> > --- a/include/linux/checkpoint.h
> > +++ b/include/linux/checkpoint.h
> > @@ -149,4 +149,24 @@ static inline void process_deny_checkpointing(struct task_struct *task) {}
> >  
> >  #endif
> >  
> > +#define CR_CPT 1
> > +#define CR_RST 2
> > +
> > +#define CR_COPY(op, a, b)				\
> > +	do {						\
> > +		BUILD_BUG_ON(sizeof(a) != sizeof(b));	\
> > +		if (op == CR_CPT)			\
> > +			memcpy(&a, &b, sizeof(a));	\
> > +		else					\
> > +			memcpy(&b, &a, sizeof(a));	\
> > +	} while (0);
> > +
> > +#define CR_COPY_BIT(op, a, b)				\
> > +	do {						\
> > +		if (op == CR_CPT)			\
> > +			a = b;				\
> > +		else					\
> > +			b = a;				\
> > +	} while (0);
> > +
> >  #endif /* _CHECKPOINT_CKPT_H_ */
> 
> No typechecking.  Multiple expansion of arguments (probably harmless
> for the current use case, but still).  Generates a memcpy where,
> depending on the arguments, simple assignment would be sufficient and

well is there any case where we must do a memcpy?  Or, if the headers
in the checkpoint image also give a bounded array size, can we just
copy that array using 'a = b'?

In any case we guarantee the sizes are correct, which may be all that
we want anyway.

> preferred.  Not useful for targets where fields in the checkpoint image
> are larger than the register state they reflect (32-bit variants of x86
> and powerpc).
> 
> Anyway, checkpoint and restart should not be "symmetric" -- the restart
> code has to validate certain values, such as privileged registers, in
> the image before committing them.

The argument in favor is that using the CR_COPY() macros in a single fn
for the common code allows us to

	1. make sure that none of the common code is accidentally
		overlooked (which does in fact make it more
		maintainable)
	2. lets any code which must be specially done at checkpoint
		or restart stand out.  Like resetting access registers
		in Dan's version.

> And while these macros may reduce initial development effort, the
> reading effort incurred is non-trivial.  It doesn't seem worth it.

-serge

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/3] c/r: Add CR_COPY() macro (v2)
  2009-02-25 22:08       ` Nathan Lynch
       [not found]         ` <20090225160841.4d727144-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
@ 2009-02-25 22:23         ` Dan Smith
  1 sibling, 0 replies; 12+ messages in thread
From: Dan Smith @ 2009-02-25 22:23 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: containers-qjLDD68F18O7TbgM5vRIOg

NL> No typechecking. 

I think I can make this a little better by doing what Serge suggested:

  sizeof(typeof(a)) != sizeof(typeof(b));

I was focused on the CR_COPY_BIT() variant at the time, so I didn't
think to apply it to CR_COPY.

NL> Generates a memcpy where, depending on the arguments, simple
NL> assignment would be sufficient and preferred.

The implementation that uses these in a common function to copy in
either direction could certainly apply them only where appropriate.
Further, we could have a CR_COPY() and CR_COPY_MULTI() which would use
assignment and memcpy() respectively.

NL> Anyway, checkpoint and restart should not be "symmetric" -- the
NL> restart code has to validate certain values, such as privileged
NL> registers, in the image before committing them.

As Serge (just) said, I think that it makes it pretty clear where the
special cases are.  There's no reason (that I can think of) why you
can't check all of your values before you call the symmetric copy
function during restore.  You've got to check and then copy anyway.

-- 
Dan Smith
IBM Linux Technology Center
email: danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] c/r: define s390-specific checkpoint-restart code (v6)
       [not found]     ` <1235585529-806-4-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
@ 2009-02-25 22:28       ` Nathan Lynch
       [not found]         ` <20090225162817.2003383c-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Nathan Lynch @ 2009-02-25 22:28 UTC (permalink / raw)
  To: Dan Smith; +Cc: containers-qjLDD68F18O7TbgM5vRIOg

On Wed, 25 Feb 2009 13:12:09 -0500
Dan Smith <danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org> wrote:
> +void cr_s390_regs(int op, struct cr_hdr_cpu *hh, struct task_struct *t)
> +{
> +	struct pt_regs *regs = task_pt_regs(t);
> +	struct thread_struct *thr = &t->thread;
> +
> +	CR_COPY(op, hh->fpc, thr->fp_regs.fpc);
> +	CR_COPY(op, hh->fprs, thr->fp_regs.fprs);
> +	CR_COPY(op, hh->acrs, thr->acrs);
> +	CR_COPY(op, hh->psw_t_mask, regs->psw.mask);
> +	CR_COPY(op, hh->psw_t_addr, regs->psw.addr);
> +	CR_COPY(op, hh->args, regs->args);
> +	CR_COPY(op, hh->svcnr, regs->svcnr);
> +	CR_COPY(op, hh->ilc, regs->ilc);
> +	CR_COPY(op, hh->gprs, regs->gprs);
> +	CR_COPY(op, hh->orig_gpr2, regs->orig_gpr2);
> +	CR_COPY(op, hh->per_control_regs, thr->per_info.control_regs.words);
> +	CR_COPY(op, hh->starting_addr, thr->per_info.starting_addr);
> +	CR_COPY(op, hh->ending_addr, thr->per_info.ending_addr);
> +	CR_COPY(op, hh->perc_atmid, thr->per_info.lowcore.words.perc_atmid);
> +	CR_COPY(op, hh->address, thr->per_info.lowcore.words.address);
> +	CR_COPY(op, hh->access_id, thr->per_info.lowcore.words.access_id);
> +	CR_COPY(op, hh->ieee_instruction_pointer,
> +		thr->ieee_instruction_pointer);
> +
> +	CR_COPY_BIT(op, hh->single_step, thr->per_info.single_step);
> +	CR_COPY_BIT(op, hh->instruction_fetch,
> +		    thr->per_info.instruction_fetch);
> +}

No comments here except that I dislike the macros (see response to
patch #2); I'm quoting this because I have a question about one of its
call sites below.


> +
> +void cr_s390_mm(int op, struct cr_hdr_mm_context *hh, struct mm_struct *mm)
> +{
> +#if 0
> +	/* Oren's v13 is on an older kernel which has no vdso_base
> +	 * on newer kernel, we'll have to enable this
> +	 */
> +	CR_COPY(op, hh->vdso_base, mm->context.vdso_base);
> +#endif

During restart, does this replace the current task's VDSO contents, and
if so, is that wise?  VDSO areas contain things like timestamps for
gettimeofday()...


> +/* Write an empty header since it is assumed to be there */
> +int cr_write_head_arch(struct cr_ctx *ctx)
> +{
> +	struct cr_hdr h;
> +	struct cr_hdr_head_arch *hh = cr_hbuf_get(ctx, sizeof(*hh));
> +	int ret;
> +
> +	h.type = CR_HDR_HEAD_ARCH;
> +	h.len = sizeof(*hh);
> +	h.parent = 0;
> +
> +	ret = cr_write_obj(ctx, &h, &hh);
> +	cr_hbuf_put(ctx, sizeof(*hh));
> +
> +	return ret;
> +}

In the powerpc implementation I was able to get away with returning
zero, without writing dummy headers, for cases like this.


> diff --git a/arch/s390/mm/restart.c b/arch/s390/mm/restart.c
> new file mode 100644
> index 0000000..18229f1
> --- /dev/null
> +++ b/arch/s390/mm/restart.c
> @@ -0,0 +1,85 @@
> +/*
> + *  Checkpoint/restart - architecture specific support for s390
> + *
> + *  Copyright IBM Corp. 2009
> + *
> + *  This file is subject to the terms and conditions of the GNU General Public
> + *  License.  See the file COPYING in the main directory of the Linux
> + *  distribution for more details.
> + */
> +
> +#include <linux/checkpoint.h>
> +#include <linux/checkpoint_hdr.h>
> +#include <linux/kernel.h>
> +#include <asm/system.h>
> +#include <asm/pgtable.h>
> +
> +extern void cr_s390_regs(int op, struct cr_hdr_cpu *hh, struct task_struct *t);
> +extern void cr_s390_mm(int op, struct cr_hdr_mm_context *hh,
> +		       struct mm_struct *mm);

These belong in a header, please...


> +int cr_read_cpu(struct cr_ctx *ctx)
> +{
> +	struct cr_hdr_cpu *hh = cr_hbuf_get(ctx, sizeof(*hh));
> +	struct pt_regs *regs = task_pt_regs(current);
> +	int parent, ret;
> +
> +	parent = cr_read_obj_type(ctx, hh, sizeof(*hh), CR_HDR_CPU);
> +	if  (parent < 0) {
> +		ret = parent;
> +		goto out;
> +	}
> +	ret = 0;
> +
> +	regs->psw.addr &= ~PSW_ADDR_INSN;
> +	cr_s390_regs(CR_RST, hh, current);

The PSW_ADDR_INSN bit in regs->psw.addr is cleared, and then
regs->psw.addr is overwritten by cr_s390_regs?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] c/r: define s390-specific checkpoint-restart code (v6)
       [not found]         ` <20090225162817.2003383c-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
@ 2009-02-25 22:37           ` Dan Smith
  2009-02-25 23:34           ` Nathan Lynch
  2009-02-25 23:34           ` Serge E. Hallyn
  2 siblings, 0 replies; 12+ messages in thread
From: Dan Smith @ 2009-02-25 22:37 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: containers-qjLDD68F18O7TbgM5vRIOg

NL> In the powerpc implementation I was able to get away with
NL> returning zero, without writing dummy headers, for cases like
NL> this.

Right, as did we.  However, mktree.c expects to read a header in this
part of the stream.  The kernel expects what the kernel has written,
which is easy.  However, when writing something that needs to
interpret any platform's stream, I think it's easier if you don't have
a bunch of "optional" headers that you need to test for and maybe
handle (especially in the case of reading the stream over a socket or
the like).

>> +extern void cr_s390_regs(int op, struct cr_hdr_cpu *hh, struct task_struct *t);
>> +extern void cr_s390_mm(int op, struct cr_hdr_mm_context *hh,
>> +		       struct mm_struct *mm);

NL> These belong in a header, please...

Actually, I was hoping that Dave would stir up some conversation about
moving all of this back into a single file since we cut the line count
down so much with our evil macros :)

>> +	regs->psw.addr &= ~PSW_ADDR_INSN;
>> +	cr_s390_regs(CR_RST, hh, current);

NL> The PSW_ADDR_INSN bit in regs->psw.addr is cleared, and then
NL> regs-> psw.addr is overwritten by cr_s390_regs?

Yes, this is broken, thanks.  It is also an example of where the
macros won't work as nicely for us.  This is left over from Serge's
original code, where I believe he was attempting to avoid loading
anything into the PSW other than the instruction pointer bit.  A
trivial change to cr_s390_regs() will correct this.

Thanks!

-- 
Dan Smith
IBM Linux Technology Center
email: danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] c/r: define s390-specific checkpoint-restart code (v6)
       [not found]         ` <20090225162817.2003383c-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
  2009-02-25 22:37           ` Dan Smith
@ 2009-02-25 23:34           ` Nathan Lynch
  2009-02-25 23:34           ` Serge E. Hallyn
  2 siblings, 0 replies; 12+ messages in thread
From: Nathan Lynch @ 2009-02-25 23:34 UTC (permalink / raw)
  To: Dan Smith; +Cc: containers-qjLDD68F18O7TbgM5vRIOg

On Wed, 25 Feb 2009 16:28:17 -0600
Nathan Lynch <ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org> wrote:
> > +
> > +void cr_s390_mm(int op, struct cr_hdr_mm_context *hh, struct mm_struct *mm)
> > +{
> > +#if 0
> > +	/* Oren's v13 is on an older kernel which has no vdso_base
> > +	 * on newer kernel, we'll have to enable this
> > +	 */
> > +	CR_COPY(op, hh->vdso_base, mm->context.vdso_base);
> > +#endif
> 
> During restart, does this replace the current task's VDSO contents, and
> if so, is that wise?  VDSO areas contain things like timestamps for
> gettimeofday()...

Okay, it doesn't blow away the VDSO contents, it merely copies
the value of vdso_base (something I would have been able to discern more
quickly without the macro, btw).

But you'll still have to ensure that vdso region is installed at the
address that the restarting task expects -- see
arch/s390/kernel/vdso.c::arch_setup_additional_pages.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 3/3] c/r: define s390-specific checkpoint-restart code (v6)
       [not found]         ` <20090225162817.2003383c-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
  2009-02-25 22:37           ` Dan Smith
  2009-02-25 23:34           ` Nathan Lynch
@ 2009-02-25 23:34           ` Serge E. Hallyn
  2 siblings, 0 replies; 12+ messages in thread
From: Serge E. Hallyn @ 2009-02-25 23:34 UTC (permalink / raw)
  To: Nathan Lynch; +Cc: containers-qjLDD68F18O7TbgM5vRIOg, Dan Smith

Quoting Nathan Lynch (ntl-e+AXbWqSrlAAvxtiuMwx3w@public.gmane.org):
> > +#if 0
> > +	/* Oren's v13 is on an older kernel which has no vdso_base
> > +	 * on newer kernel, we'll have to enable this
> > +	 */
> > +	CR_COPY(op, hh->vdso_base, mm->context.vdso_base);
> > +#endif
> 
> During restart, does this replace the current task's VDSO contents, and
> if so, is that wise?  VDSO areas contain things like timestamps for
> gettimeofday()...

(I've gone back and forth on this, and the following answer differs
from any interpretation i've had before :)

Well, arch_setup_additional_pages() uses get_unmapped_area() to get
a user mapping for the vdso pages.  So i would expect that in fact
the arch-independent code would set up a new vdso mapping at the
checkpointed location, and so mm->context.vdso_base does in fact
need to be changed.

But it's something to verify when we port the code to a newer
kernel.

-serge

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2009-02-25 23:34 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-02-25 18:12 [PATCH 0/3] c/r: Add s390 support Dan Smith
     [not found] ` <1235585529-806-1-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-02-25 18:12   ` [PATCH 1/3] s390: Expose a constant for the number of words representing the CRs Dan Smith
2009-02-25 18:12     ` Dan Smith
2009-02-25 18:12   ` [PATCH 2/3] c/r: Add CR_COPY() macro (v2) Dan Smith
     [not found]     ` <1235585529-806-3-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-02-25 22:08       ` Nathan Lynch
     [not found]         ` <20090225160841.4d727144-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
2009-02-25 22:21           ` Serge E. Hallyn
2009-02-25 22:23         ` Dan Smith
2009-02-25 18:12   ` [PATCH 3/3] c/r: define s390-specific checkpoint-restart code (v6) Dan Smith
     [not found]     ` <1235585529-806-4-git-send-email-danms-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-02-25 22:28       ` Nathan Lynch
     [not found]         ` <20090225162817.2003383c-4v5LP+xe+1byhTdZtsIeww@public.gmane.org>
2009-02-25 22:37           ` Dan Smith
2009-02-25 23:34           ` Nathan Lynch
2009-02-25 23:34           ` Serge E. Hallyn

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.