* [RFC][PATCH] user-cr: Extract kernel headers
@ 2009-08-17 15:24 Matt Helsley
       [not found] ` <20090817152403.GA11415-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Matt Helsley @ 2009-08-17 15:24 UTC
  To: Oren Laadan; +Cc: Containers

Using kernel headers directly from userspace is strongly discouraged.
This patch attempts to sanitize kernel headers for userspace by
extracting non-__KERNEL__ portions of the various checkpoint headers
and placing them in a similar organization of userspace headers.
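As a rough illustration of one part of that extraction: the script hides
#include lines inside C comments so that cpp does not try to resolve them,
and restores them afterwards. The sed expressions below follow the ones in
do_cpp in scripts/extract-headers.sh; the function names hide_includes and
restore_includes are made up for this sketch, and the cpp step that runs
between them is left out.

```shell
#!/bin/bash
# Sketch only: the function names are hypothetical. The sed expressions
# follow those used by do_cpp in scripts/extract-headers.sh.

# Wrap each #include directive in a C comment so cpp leaves it alone.
hide_includes() {
	sed -e 's/#[[:space:]]*include[[:space:]]*\([<"][^">]*[">]\)/\/*#include \1*\//g'
}

# Turn the commented-out directives back into real #include lines.
restore_includes() {
	sed -e 's|/\*#include \([<"][^">]*[">]\)\*/|#include \1|g'
}

# In the real script, cpp -U__KERNEL__ runs between these two steps.
printf '#include <linux/types.h>\n' | hide_includes | restore_includes
```

The round trip leaves the directive as it was, apart from normalizing the
whitespace between "include" and the header name to a single space.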

The script is run from the top level of the user-cr source tree like:

	./scripts/extract-headers.sh -s <path-to-kern-source> -o ./include


The patch includes a copy of the auto-generated headers and adjusts
the user-cr programs to use them.

Signed-off-by: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>

TODO: Builds on i386. Probably needs more testing, especially on
	other non-i386, non-32-bit platforms.

	Look at merging checkpoint_syscalls.h with checkpoint.h
	Or at least find a better, shorter name for checkpoint_syscalls.h

NOTES: The script is much larger (2.5x) than for cr_tests because cr_tests
	only required the syscall numbers and a few flags for the syscalls.

	The headers have a similar organization to the kernel headers
	because struct ckpt_hdr must be defined before the arch hdrs and 
	yet CKPT_ARCH_NSIG must be defined before the generic signal hdrs.
	Plus it's easier to avoid rewriting the paths within the include
	directories...

	checkpoint_syscalls.h is a multi-arch file with all the syscall
	numbers normally found in the arch's unistd.h. I chose to use a
	different name to avoid clashes with /usr/include headers.
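
To make the clash avoidance concrete, here is a small sketch of how a
"#define __NR_checkpoint N" line taken from an arch's unistd.h can be turned
into an #ifndef-guarded define, so that a definition already supplied by
/usr/include wins. The function name guard_syscall_defines is hypothetical,
and the sketch uses sed -E with printf instead of the GNU sed newline
escapes that the script relies on.

```shell
#!/bin/bash
# Sketch only: guard_syscall_defines is a made-up name, not part of the patch.

# Read unistd.h-style lines on stdin and emit an #ifndef-guarded define
# for each __NR_checkpoint or __NR_restart definition; ignore the rest.
guard_syscall_defines() {
	sed -n -E 's/^[[:space:]]*#[[:space:]]*define[[:space:]]+(__NR_(checkpoint|restart))[[:space:]]+([[:digit:]]+).*$/\1 \3/p' |
	while read -r name number; do
		printf '#\tifndef %s\n#\t\tdefine %s %s\n#\tendif\n' \
			"$name" "$name" "$number"
	done
}

# 338 is the i386 number that appears in the generated header in this patch.
printf '#define __NR_checkpoint 338\n' | guard_syscall_defines
```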
---
 Makefile                            |   31 +--
 ckpt.c                              |    1 +
 ckptinfo.c                          |    2 +
 ckptinfo.py                         |    2 +
 include/asm/checkpoint_hdr.h        |  205 ++++++++++++
 include/linux/checkpoint.h          |   26 ++
 include/linux/checkpoint_hdr.h      |  580 +++++++++++++++++++++++++++++++++++
 include/linux/checkpoint_syscalls.h |   30 ++
 mktree.c                            |    4 +
 rstr.c                              |    1 +
 scripts/extract-headers.sh          |  194 ++++++++++++
 self.c                              |    1 +
 12 files changed, 1056 insertions(+), 21 deletions(-)
 create mode 100644 include/asm/checkpoint_hdr.h
 create mode 100644 include/linux/checkpoint.h
 create mode 100644 include/linux/checkpoint_hdr.h
 create mode 100644 include/linux/checkpoint_syscalls.h
 create mode 100755 scripts/extract-headers.sh

diff --git a/Makefile b/Makefile
index afff3f5..509a2a4 100644
--- a/Makefile
+++ b/Makefile
@@ -1,24 +1,5 @@
-
-KERNELSRC ?= ../linux
-KERNELBUILD ?= ../linux
-
-# default with 'make headers_install'
-KERNELHDR ?= $(KERNELSRC)/usr/include
-
-ifneq "$(realpath $(KERNELHDR)/linux/checkpoint.h)" ""
-# if .../usr/include contains our headers
-CKPT_INCLUDE = -I$(KERNELHDR)
-CKPT_HEADERS = $(KERNELHDR)/linux/checkpoint_hdr.h \
-	       $(KERNELHDR)/asm/checkpoint_hdr.h
-else
-# else, usr the kernel source itself
-# but first, find linux architecure
-KERN_ARCH = $(shell readlink $(KERNELBUILD)/include/asm | sed 's/^asm-//')
-CKPT_INCLUDE = -I$(KERNELSRC)/include \
-	       -I$(KERNELSRC)/arch/$(KERN_ARCH)/include
-CKPT_HEADERS = $(KERNELSRC)/include/linux/checkpoint_hdr.h \
-	       $(KERNELSRC)/arch/$(KERN_ARCH)/include/asm/checkpoint_hdr.h
-endif
+CKPT_INCLUDE = -I./include
+CKPT_HEADERS = $(shell find ./include -name '*.h')
 
 # compile with debug ?
 DEBUG = -DCHECKPOINT_DEBUG
@@ -39,6 +20,8 @@ OTHER = ckptinfo_types.c
 
 LDLIBS = -lm
 
+.PHONY: all distclean clean headers install
+
 all: $(PROGS)
 	@make -C test
 
@@ -56,10 +39,16 @@ ckptinfo_types.c: $(CKPT_HEADERS) ckptinfo.py
 
 %.o:	%.c
 
+headers:
+	./scripts/extract-headers.sh -s ../linux-2.6.git
+
 install:
 	@echo /usr/bin/install -m 755 mktree ckpt rstr ckptinfo $(INSTALL_DIR)
 	@/usr/bin/install -m 755 mktree ckpt rstr ckptinfo $(INSTALL_DIR)
 
+distclean: clean
+	@rm -f $(CKPT_HEADERS)
+
 clean:
 	@rm -f $(PROGS) $(OTHER) *~ *.o
 	@make -C test clean
diff --git a/ckpt.c b/ckpt.c
index b3f8dea..fb853f9 100644
--- a/ckpt.c
+++ b/ckpt.c
@@ -16,6 +16,7 @@
 #include <unistd.h>
 #include <sys/syscall.h>
 
+#include <linux/checkpoint_syscalls.h>
 #include <linux/checkpoint.h>
 
 static char usage_str[] =
diff --git a/ckptinfo.c b/ckptinfo.c
index 962a62f..2b8aaef 100644
--- a/ckptinfo.c
+++ b/ckptinfo.c
@@ -17,6 +17,8 @@
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
+#include <sys/socket.h>
+#include <sys/un.h>
 
 #include <linux/checkpoint_hdr.h>
 #include <asm/checkpoint_hdr.h>
diff --git a/ckptinfo.py b/ckptinfo.py
index ae7e5da..e0959f1 100755
--- a/ckptinfo.py
+++ b/ckptinfo.py
@@ -78,6 +78,8 @@ print """
  * This file is auto-generated by ckptinfo.py
  */
 
+#include <sys/socket.h>
+#include <sys/un.h>
 #include <linux/checkpoint_hdr.h>
 """
 
diff --git a/include/asm/checkpoint_hdr.h b/include/asm/checkpoint_hdr.h
new file mode 100644
index 0000000..57f3635
--- /dev/null
+++ b/include/asm/checkpoint_hdr.h
@@ -0,0 +1,205 @@
+/*
+ * Generated by extract-headers.sh.
+ */
+#ifndef __ASM_CHECKPOINT_HDR_H_
+#define __ASM_CHECKPOINT_HDR_H_
+#include <sys/user.h>
+
+#if __s390x__
+
+/*
+ *  Checkpoint/restart - architecture specific headers s/390
+ *
+ *  Copyright IBM Corp. 2009
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/types.h>
+#include <asm/ptrace.h>
+
+/*
+ * Notes
+ * NUM_GPRS defined in <asm/ptrace.h> to be 16
+ * NUM_FPRS defined in <asm/ptrace.h> to be 16
+ * NUM_APRS defined in <asm/ptrace.h> to be 16
+ * NUM_CR_WORDS defined in <asm/ptrace.h> to be 3
+ */
+struct ckpt_hdr_cpu {
+	struct ckpt_hdr h;
+	__u64 args[1];
+	__u64 gprs[NUM_GPRS];
+	__u64 orig_gpr2;
+	__u16 svcnr;
+	__u16 ilc;
+	__u32 acrs[NUM_ACRS];
+	__u64 ieee_instruction_pointer;
+
+	/* psw_t */
+	__u64 psw_t_mask;
+	__u64 psw_t_addr;
+
+	/* s390_fp_regs_t */
+	__u32 fpc;
+	union {
+		float f;
+		double d;
+		__u64 ui;
+		struct {
+			__u32 fp_hi;
+			__u32 fp_lo;
+		} fp;
+	} fprs[NUM_FPRS];
+
+	/* per_struct */
+	__u64 per_control_regs[NUM_CR_WORDS];
+	__u64 starting_addr;
+	__u64 ending_addr;
+	__u64 address;
+	__u16 perc_atmid;
+	__u8 access_id;
+	__u8 single_step;
+	__u8 instruction_fetch;
+};
+
+struct ckpt_hdr_mm_context {
+	struct ckpt_hdr h;
+	unsigned long vdso_base;
+	int noexec;
+	int has_pgste;
+	int alloc_pgste;
+	unsigned long asce_bits;
+	unsigned long asce_limit;
+};
+
+#define CKPT_ARCH_NSIG 64
+
+struct ckpt_hdr_header_arch {
+	struct ckpt_hdr h;
+};
+
+
+#elif __i386__ || __x86_64__
+
+/*
+ *  Checkpoint/restart - architecture specific headers x86
+ *
+ *  Copyright (C) 2008-2009 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/types.h>
+
+/*
+ * To maintain compatibility between 32-bit and 64-bit architecture flavors,
+ * keep data 64-bit aligned: use padding for structure members, and use
+ * __attribute__((aligned (8))) for the entire structure.
+ *
+ * Quoting Arnd Bergmann:
+ *   "This structure has an odd multiple of 32-bit members, which means
+ *   that if you put it into a larger structure that also contains 64-bit
+ *   members, the larger structure may get different alignment on x86-32
+ *   and x86-64, which you might want to avoid. I can't tell if this is
+ *   an actual problem here. ... In this case, I'm pretty sure that
+ *   sizeof(ckpt_hdr_task) on x86-32 is different from x86-64, since it
+ *   will be 32-bit aligned on x86-32."
+ */
+
+/* i387 structure seen from kernel/userspace */
+
+/* arch dependent header types */
+enum {
+	CKPT_HDR_CPU_FPU = 201,
+	CKPT_HDR_MM_CONTEXT_LDT,
+};
+
+#define CKPT_ARCH_NSIG 64
+
+struct ckpt_hdr_header_arch {
+	struct ckpt_hdr h;
+	/* FIXME: add HAVE_HWFP */
+	__u16 has_fxsr;
+	__u16 has_xsave;
+	__u16 xstate_size;
+	__u16 _pading;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_thread {
+	struct ckpt_hdr h;
+	__u32 thread_info_flags;
+	__u16 gdt_entry_tls_entries;
+	__u16 sizeof_tls_array;
+} __attribute__((aligned(8)));
+
+/* designed to work for both x86_32 and x86_64 */
+struct ckpt_hdr_cpu {
+	struct ckpt_hdr h;
+	/* see struct pt_regs (x86_64) */
+	__u64 r15;
+	__u64 r14;
+	__u64 r13;
+	__u64 r12;
+	__u64 bp;
+	__u64 bx;
+	__u64 r11;
+	__u64 r10;
+	__u64 r9;
+	__u64 r8;
+	__u64 ax;
+	__u64 cx;
+	__u64 dx;
+	__u64 si;
+	__u64 di;
+	__u64 orig_ax;
+	__u64 ip;
+	__u64 sp;
+
+	__u64 flags;
+
+	/* segment registers */
+	__u64 fs;
+	__u64 gs;
+
+	__u16 fsindex;
+	__u16 gsindex;
+	__u16 cs;
+	__u16 ss;
+	__u16 ds;
+	__u16 es;
+
+	__u32 used_math;
+
+	/* debug registers */
+	__u64 debugreg0;
+	__u64 debugreg1;
+	__u64 debugreg2;
+	__u64 debugreg3;
+	__u64 debugreg6;
+	__u64 debugreg7;
+
+	/* thread_xstate contents follow (if used_math) */
+} __attribute__((aligned(8)));
+
+#define CKPT_X86_SEG_NULL 0
+#define CKPT_X86_SEG_USER32_CS 1
+#define CKPT_X86_SEG_USER32_DS 2
+#define CKPT_X86_SEG_TLS 0x4000 /* 0100 0000 0000 00xx */
+#define CKPT_X86_SEG_LDT 0x8000 /* 100x xxxx xxxx xxxx */
+
+struct ckpt_hdr_mm_context {
+	struct ckpt_hdr h;
+	__u64 vdso;
+	__u32 ldt_entry_size;
+	__u32 nldt;
+} __attribute__((aligned(8)));
+
+
+#else
+#error "Architecture does not have definitions needed for checkpoint images."
+#endif
+#endif /* __ASM_CHECKPOINT_HDR_H_ */
diff --git a/include/linux/checkpoint.h b/include/linux/checkpoint.h
new file mode 100644
index 0000000..db6c6a4
--- /dev/null
+++ b/include/linux/checkpoint.h
@@ -0,0 +1,26 @@
+/*
+ * Generated by extract-headers.sh.
+ */
+#ifndef _LINUX_CHECKPOINT_H_
+#define _LINUX_CHECKPOINT_H_
+/*
+ *  Generic checkpoint-restart
+ *
+ *  Copyright (C) 2008-2009 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#define CHECKPOINT_VERSION 1
+
+/* checkpoint user flags */
+#define CHECKPOINT_SUBTREE 0x1
+
+/* restart user flags */
+#define RESTART_TASKSELF 0x1
+#define RESTART_FROZEN 0x2
+
+
+#endif /* _LINUX_CHECKPOINT_H_ */
diff --git a/include/linux/checkpoint_hdr.h b/include/linux/checkpoint_hdr.h
new file mode 100644
index 0000000..a6f4a57
--- /dev/null
+++ b/include/linux/checkpoint_hdr.h
@@ -0,0 +1,580 @@
+/*
+ * Generated by extract-headers.sh.
+ */
+#ifndef _CHECKPOINT_CKPT_HDR_H_
+#define _CHECKPOINT_CKPT_HDR_H_
+/*
+ *  Generic container checkpoint-restart
+ *
+ *  Copyright (C) 2008-2009 Oren Laadan
+ *
+ *  This file is subject to the terms and conditions of the GNU General Public
+ *  License.  See the file COPYING in the main directory of the Linux
+ *  distribution for more details.
+ */
+
+#include <linux/types.h>
+#include <linux/utsname.h>
+
+/*
+ * To maintain compatibility between 32-bit and 64-bit architecture flavors,
+ * keep data 64-bit aligned: use padding for structure members, and use
+ * __attribute__((aligned (8))) for the entire structure.
+ *
+ * Quoting Arnd Bergmann:
+ *   "This structure has an odd multiple of 32-bit members, which means
+ *   that if you put it into a larger structure that also contains 64-bit
+ *   members, the larger structure may get different alignment on x86-32
+ *   and x86-64, which you might want to avoid. I can't tell if this is
+ *   an actual problem here. ... In this case, I'm pretty sure that
+ *   sizeof(ckpt_hdr_task) on x86-32 is different from x86-64, since it
+ *   will be 32-bit aligned on x86-32."
+ */
+
+/*
+ * header format: 'struct ckpt_hdr' must prefix all other headers. Therefore
+ * when a header is passed around, the information about it (type, size)
+ * is readily available.
+ */
+struct ckpt_hdr {
+	__u32 type;
+	__u32 len;
+} __attribute__((aligned(8)));
+
+#include <asm/checkpoint_hdr.h>
+
+/* header types */
+enum {
+	CKPT_HDR_HEADER = 1,
+	CKPT_HDR_HEADER_ARCH,
+	CKPT_HDR_BUFFER,
+	CKPT_HDR_STRING,
+	CKPT_HDR_OBJREF,
+
+	CKPT_HDR_TREE = 101,
+	CKPT_HDR_TASK,
+	CKPT_HDR_TASK_NS,
+	CKPT_HDR_TASK_OBJS,
+	CKPT_HDR_RESTART_BLOCK,
+	CKPT_HDR_THREAD,
+	CKPT_HDR_CPU,
+	CKPT_HDR_NS,
+	CKPT_HDR_UTS_NS,
+	CKPT_HDR_IPC_NS,
+	CKPT_HDR_CAPABILITIES,
+	CKPT_HDR_USER_NS,
+	CKPT_HDR_CRED,
+	CKPT_HDR_USER,
+	CKPT_HDR_GROUPINFO,
+	CKPT_HDR_TASK_CREDS,
+
+	/* 201-299: reserved for arch-dependent */
+
+	CKPT_HDR_FILE_TABLE = 301,
+	CKPT_HDR_FILE_DESC,
+	CKPT_HDR_FILE_NAME,
+	CKPT_HDR_FILE,
+	CKPT_HDR_PIPE_BUF,
+
+	CKPT_HDR_MM = 401,
+	CKPT_HDR_VMA,
+	CKPT_HDR_PGARR,
+	CKPT_HDR_MM_CONTEXT,
+
+	CKPT_HDR_IPC = 501,
+	CKPT_HDR_IPC_SHM,
+	CKPT_HDR_IPC_MSG,
+	CKPT_HDR_IPC_MSG_MSG,
+	CKPT_HDR_IPC_SEM,
+
+	CKPT_HDR_SIGHAND = 601,
+
+	CKPT_HDR_FD_SOCKET = 701,
+	CKPT_HDR_SOCKET,
+	CKPT_HDR_SOCKET_QUEUE,
+	CKPT_HDR_SOCKET_BUFFER,
+	CKPT_HDR_SOCKET_UNIX,
+
+	CKPT_HDR_TAIL = 9001,
+
+	CKPT_HDR_ERROR = 9999,
+};
+
+/* architecture */
+enum {
+	/* do not change order (will break ABI) */
+	CKPT_ARCH_X86_32 = 1,
+	CKPT_ARCH_S390X,
+};
+
+/* shared objects (objref) */
+struct ckpt_hdr_objref {
+	struct ckpt_hdr h;
+	__u32 objtype;
+	__s32 objref;
+} __attribute__((aligned(8)));
+
+/* shared objects types */
+enum obj_type {
+	CKPT_OBJ_IGNORE = 0,
+	CKPT_OBJ_INODE,
+	CKPT_OBJ_FILE_TABLE,
+	CKPT_OBJ_FILE,
+	CKPT_OBJ_MM,
+	CKPT_OBJ_SIGHAND,
+	CKPT_OBJ_NS,
+	CKPT_OBJ_UTS_NS,
+	CKPT_OBJ_IPC_NS,
+	CKPT_OBJ_USER_NS,
+	CKPT_OBJ_CRED,
+	CKPT_OBJ_USER,
+	CKPT_OBJ_GROUPINFO,
+	CKPT_OBJ_SOCK,
+	CKPT_OBJ_MAX
+};
+
+/* kernel constants */
+struct ckpt_hdr_const {
+	/* task */
+	__u16 task_comm_len;
+	/* mm */
+	__u16 mm_saved_auxv_len;
+	/* signal */
+	__u16 signal_nsig;
+	/* uts */
+	__u16 uts_sysname_len;
+	__u16 uts_nodename_len;
+	__u16 uts_release_len;
+	__u16 uts_version_len;
+	__u16 uts_machine_len;
+	__u16 uts_domainname_len;
+} __attribute__((aligned(8)));
+
+/* checkpoint image header */
+struct ckpt_hdr_header {
+	struct ckpt_hdr h;
+	__u64 magic;
+
+	__u16 arch_id;
+
+	__u16 major;
+	__u16 minor;
+	__u16 patch;
+	__u16 rev;
+
+	struct ckpt_hdr_const constants;
+
+	__u64 time;	/* when checkpoint taken */
+	__u64 uflags;	/* uflags from checkpoint */
+
+	/*
+	 * the header is followed by three strings:
+	 *   char release[const.uts_release_len];
+	 *   char version[const.uts_version_len];
+	 *   char machine[const.uts_machine_len];
+	 */
+} __attribute__((aligned(8)));
+
+/* checkpoint image trailer */
+struct ckpt_hdr_tail {
+	struct ckpt_hdr h;
+	__u64 magic;
+} __attribute__((aligned(8)));
+
+/* task tree */
+struct ckpt_hdr_tree {
+	struct ckpt_hdr h;
+	__s32 nr_tasks;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_pids {
+	__s32 vpid;
+	__s32 vppid;
+	__s32 vtgid;
+	__s32 vpgid;
+	__s32 vsid;
+} __attribute__((aligned(8)));
+
+/* task data */
+struct ckpt_hdr_task {
+	struct ckpt_hdr h;
+	__u32 state;
+	__u32 exit_state;
+	__u32 exit_code;
+	__u32 exit_signal;
+	__u32 pdeath_signal;
+
+	__u64 set_child_tid;
+	__u64 clear_child_tid;
+
+	__u32 compat_robust_futex_head_len;
+	__u32 compat_robust_futex_list; /* a compat __user ptr */
+	__u32 robust_futex_head_len;
+	__u64 robust_futex_list; /* a __user ptr */
+
+} __attribute__((aligned(8)));
+
+/* Posix capabilities */
+struct ckpt_capabilities {
+	__u32 cap_i_0, cap_i_1; /* inheritable set */
+	__u32 cap_p_0, cap_p_1; /* permitted set */
+	__u32 cap_e_0, cap_e_1; /* effective set */
+	__u32 cap_b_0, cap_b_1; /* bounding set */
+	__u32 securebits;
+	__u32 padding;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_task_creds {
+	struct ckpt_hdr h;
+	__s32 cred_ref;
+	__s32 ecred_ref;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_cred {
+	struct ckpt_hdr h;
+	__u32 uid, suid, euid, fsuid;
+	__u32 gid, sgid, egid, fsgid;
+	__s32 user_ref;
+	__s32 groupinfo_ref;
+	struct ckpt_capabilities cap_s;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_groupinfo {
+	struct ckpt_hdr h;
+	__u32 ngroups;
+	/*
+	 * This is followed by ngroups __u32s
+	 */
+	__u32 groups[0];
+} __attribute__((aligned(8)));
+
+/*
+ * todo - keyrings and LSM
+ * These may be better done with userspace help though
+ */
+struct ckpt_hdr_user_struct {
+	struct ckpt_hdr h;
+	__u32 uid;
+	__s32 userns_ref;
+} __attribute__((aligned(8)));
+
+/*
+ * The user-struct mostly tracks system resource usage.
+ * Most of its contents therefore will simply be set
+ * correctly as restart opens resources
+ */
+struct ckpt_hdr_user_ns {
+	struct ckpt_hdr h;
+	__s32 creator_ref;
+} __attribute__((aligned(8)));
+
+/* namespaces */
+struct ckpt_hdr_task_ns {
+	struct ckpt_hdr h;
+	__s32 ns_objref;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_ns {
+	struct ckpt_hdr h;
+	__s32 uts_objref;
+	__u32 ipc_objref;
+} __attribute__((aligned(8)));
+
+/* task's shared resources */
+struct ckpt_hdr_task_objs {
+	struct ckpt_hdr h;
+
+	__s32 files_objref;
+	__s32 mm_objref;
+	__s32 sighand_objref;
+} __attribute__((aligned(8)));
+
+/* restart blocks */
+struct ckpt_hdr_restart_block {
+	struct ckpt_hdr h;
+	__u64 function_type;
+	__u64 arg_0;
+	__u64 arg_1;
+	__u64 arg_2;
+	__u64 arg_3;
+	__u64 arg_4;
+} __attribute__((aligned(8)));
+
+enum restart_block_type {
+	CKPT_RESTART_BLOCK_NONE = 1,
+	CKPT_RESTART_BLOCK_HRTIMER_NANOSLEEP,
+	CKPT_RESTART_BLOCK_POSIX_CPU_NANOSLEEP,
+	CKPT_RESTART_BLOCK_COMPAT_NANOSLEEP,
+	CKPT_RESTART_BLOCK_COMPAT_CLOCK_NANOSLEEP,
+	CKPT_RESTART_BLOCK_POLL,
+	CKPT_RESTART_BLOCK_FUTEX
+};
+
+/* file system */
+struct ckpt_hdr_file_table {
+	struct ckpt_hdr h;
+	__s32 fdt_nfds;
+} __attribute__((aligned(8)));
+
+/* file descriptors */
+struct ckpt_hdr_file_desc {
+	struct ckpt_hdr h;
+	__s32 fd_objref;
+	__s32 fd_descriptor;
+	__u32 fd_close_on_exec;
+} __attribute__((aligned(8)));
+
+enum file_type {
+	CKPT_FILE_IGNORE = 0,
+	CKPT_FILE_GENERIC,
+	CKPT_FILE_PIPE,
+	CKPT_FILE_FIFO,
+	CKPT_FILE_SOCKET,
+	CKPT_FILE_MAX
+};
+
+/* file objects */
+struct ckpt_hdr_file {
+	struct ckpt_hdr h;
+	__u32 f_type;
+	__u32 f_mode;
+	__u32 f_flags;
+	__s32 f_credref;
+	__u64 f_pos;
+	__u64 f_version;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_file_generic {
+	struct ckpt_hdr_file common;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_file_pipe {
+	struct ckpt_hdr_file common;
+	__s32 pipe_objref;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_file_socket {
+	struct ckpt_hdr_file common;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_utsns {
+	struct ckpt_hdr h;
+	char sysname[__NEW_UTS_LEN + 1];
+	char nodename[__NEW_UTS_LEN + 1];
+	char release[__NEW_UTS_LEN + 1];
+	char version[__NEW_UTS_LEN + 1];
+	char machine[__NEW_UTS_LEN + 1];
+	char domainname[__NEW_UTS_LEN + 1];
+} __attribute__((aligned(8)));
+
+/* memory layout */
+struct ckpt_hdr_mm {
+	struct ckpt_hdr h;
+	__u32 map_count;
+	__s32 exe_objref;
+
+	__u64 def_flags;
+	__u64 flags;
+
+	__u64 start_code, end_code, start_data, end_data;
+	__u64 start_brk, brk, start_stack;
+	__u64 arg_start, arg_end, env_start, env_end;
+} __attribute__((aligned(8)));
+
+/* vma subtypes - index into restore_vma_dispatch[] */
+enum vma_type {
+	CKPT_VMA_IGNORE = 0,
+	CKPT_VMA_VDSO,		/* special vdso vma */
+	CKPT_VMA_ANON,		/* private anonymous */
+	CKPT_VMA_FILE,		/* private mapped file */
+	CKPT_VMA_SHM_ANON,	/* shared anonymous */
+	CKPT_VMA_SHM_ANON_SKIP,	/* shared anonymous (skip contents) */
+	CKPT_VMA_SHM_FILE,	/* shared mapped file, only msync */
+	CKPT_VMA_SHM_IPC,	/* shared sysvipc */
+	CKPT_VMA_SHM_IPC_SKIP,	/* shared sysvipc (skip contents) */
+	CKPT_VMA_MAX,
+};
+
+/* vma descriptor */
+struct ckpt_hdr_vma {
+	struct ckpt_hdr h;
+	__u32 vma_type;
+	__s32 vma_objref;	/* objref of backing file */
+	__s32 ino_objref;	/* objref of shared segment */
+	__u32 _padding;
+	__u64 ino_size;		/* size of shared segment */
+
+	__u64 vm_start;
+	__u64 vm_end;
+	__u64 vm_page_prot;
+	__u64 vm_flags;
+	__u64 vm_pgoff;
+} __attribute__((aligned(8)));
+
+/* page array */
+struct ckpt_hdr_pgarr {
+	struct ckpt_hdr h;
+	__u64 nr_pages;		/* number of pages saved */
+} __attribute__((aligned(8)));
+
+/* signals */
+struct ckpt_hdr_sigset {
+	__u8 sigset[CKPT_ARCH_NSIG / 8];
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_sigaction {
+	__u64 _sa_handler;
+	__u64 sa_flags;
+	__u64 sa_restorer;
+	struct ckpt_hdr_sigset sa_mask;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_sighand {
+	struct ckpt_hdr h;
+	struct ckpt_hdr_sigaction action[0];
+} __attribute__((aligned(8)));
+
+/* ipc commons */
+struct ckpt_hdr_ipcns {
+	struct ckpt_hdr h;
+	__u64 shm_ctlmax;
+	__u64 shm_ctlall;
+	__s32 shm_ctlmni;
+
+	__s32 msg_ctlmax;
+	__s32 msg_ctlmnb;
+	__s32 msg_ctlmni;
+
+	__s32 sem_ctl_msl;
+	__s32 sem_ctl_mns;
+	__s32 sem_ctl_opm;
+	__s32 sem_ctl_mni;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_ipc {
+	struct ckpt_hdr h;
+	__u32 ipc_type;
+	__u32 ipc_count;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_ipc_perms {
+	struct ckpt_hdr h;
+	__s32 id;
+	__u32 key;
+	__u32 uid;
+	__u32 gid;
+	__u32 cuid;
+	__u32 cgid;
+	__u32 mode;
+	__u32 _padding;
+	__u64 seq;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_ipc_shm {
+	struct ckpt_hdr h;
+	struct ckpt_hdr_ipc_perms perms;
+	__u64 shm_segsz;
+	__u64 shm_atim;
+	__u64 shm_dtim;
+	__u64 shm_ctim;
+	__s32 shm_cprid;
+	__s32 shm_lprid;
+	__u32 mlock_uid;
+	__u32 flags;
+	__u32 objref;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_ipc_msg {
+	struct ckpt_hdr h;
+	struct ckpt_hdr_ipc_perms perms;
+	__u64 q_stime;
+	__u64 q_rtime;
+	__u64 q_ctime;
+	__u64 q_cbytes;
+	__u64 q_qnum;
+	__u64 q_qbytes;
+	__s32 q_lspid;
+	__s32 q_lrpid;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_ipc_msg_msg {
+	struct ckpt_hdr h;
+	__s32 m_type;
+	__u32 m_ts;
+} __attribute__((aligned(8)));
+
+struct ckpt_hdr_ipc_sem {
+	struct ckpt_hdr h;
+	struct ckpt_hdr_ipc_perms perms;
+	__u64 sem_otime;
+	__u64 sem_ctime;
+	__u32 sem_nsems;
+} __attribute__((aligned(8)));
+
+#define CKPT_UNIX_LINKED 1
+struct ckpt_hdr_socket_unix {
+	struct ckpt_hdr h;
+	__s32 this;
+	__s32 peer;
+	__u32 flags;
+	__u32 laddr_len;
+	__u32 raddr_len;
+	struct sockaddr_un laddr;
+	struct sockaddr_un raddr;
+} __attribute__ ((aligned(8)));
+
+struct ckpt_hdr_socket {
+	struct ckpt_hdr h;
+
+	struct { /* struct socket */
+		__u64 flags;
+		__u8 state;
+	} socket __attribute__ ((aligned(8)));
+
+	struct { /* struct sock_common */
+		__u32 bound_dev_if;
+		__u32 reuse;
+		__u16 family;
+		__u8 state;
+	} sock_common __attribute__ ((aligned(8)));
+
+	struct { /* struct sock */
+		__s64 rcvlowat;
+		__u64 flags;
+
+		__u32 err;
+		__u32 err_soft;
+		__u32 priority;
+		__s32 rcvbuf;
+		__s32 sndbuf;
+		__u16 type;
+		__s16 backlog;
+
+		__u8 protocol;
+		__u8 state;
+		__u8 shutdown;
+		__u8 userlocks;
+		__u8 no_check;
+
+		struct linger linger;
+		struct timeval rcvtimeo;
+		struct timeval sndtimeo;
+
+	} sock __attribute__ ((aligned(8)));
+
+} __attribute__ ((aligned(8)));
+
+struct ckpt_hdr_socket_queue {
+	struct ckpt_hdr h;
+	__u32 skb_count;
+	__u32 total_bytes;
+} __attribute__ ((aligned(8)));
+
+#define CKPT_TST_OVERFLOW_16(a,b) ((sizeof(a) > sizeof(b)) && ((a) > SHORT_MAX))
+
+#define CKPT_TST_OVERFLOW_32(a,b) ((sizeof(a) > sizeof(b)) && ((a) > INT_MAX))
+
+#define CKPT_TST_OVERFLOW_64(a,b) ((sizeof(a) > sizeof(b)) && ((a) > LONG_MAX))
+
+
+#endif /* _CHECKPOINT_CKPT_HDR_H_ */
diff --git a/include/linux/checkpoint_syscalls.h b/include/linux/checkpoint_syscalls.h
new file mode 100644
index 0000000..2374bbc
--- /dev/null
+++ b/include/linux/checkpoint_syscalls.h
@@ -0,0 +1,30 @@
+#ifndef _CHECKPOINT_SYSCALLS_H_
+#define _CHECKPOINT_SYSCALLS_H_
+/*
+ * Generated by extract-headers.sh.
+ */
+
+#if __s390x__
+
+#	ifndef __NR_checkpoint
+#		define __NR_checkpoint 332
+#	endif
+
+#	ifndef __NR_restart
+#		define __NR_restart 333
+#	endif
+
+#elif __i386__
+
+#	ifndef __NR_checkpoint
+#		define __NR_checkpoint 338
+#	endif
+
+#	ifndef __NR_restart
+#		define __NR_restart 339
+#	endif
+
+#else
+#error "Architecture does not have definitions for __NR_(checkpoint|restart)"
+#endif
+#endif /* _CHECKPOINT_SYSCALLS_H_ */
diff --git a/mktree.c b/mktree.c
index e42407f..1f9e3d3 100644
--- a/mktree.c
+++ b/mktree.c
@@ -29,8 +29,12 @@
 #include <asm/unistd.h>
 #include <sys/syscall.h>
 #include <sys/prctl.h>
+#include <sys/socket.h>
+#include <sys/un.h>
 
 #include <linux/sched.h>
+
+#include <linux/checkpoint_syscalls.h>
 #include <linux/checkpoint.h>
 #include <linux/checkpoint_hdr.h>
 
diff --git a/rstr.c b/rstr.c
index 9199cfe..8cd7a49 100644
--- a/rstr.c
+++ b/rstr.c
@@ -18,6 +18,7 @@
 #include <asm/unistd.h>
 #include <sys/syscall.h>
 
+#include <linux/checkpoint_syscalls.h>
 #include <linux/checkpoint.h>
 
 int main(int argc, char *argv[])
diff --git a/scripts/extract-headers.sh b/scripts/extract-headers.sh
new file mode 100755
index 0000000..9e6ea4a
--- /dev/null
+++ b/scripts/extract-headers.sh
@@ -0,0 +1,194 @@
+#!/bin/bash
+#
+# Copyright (C) 2009 IBM Corp.
+# Author: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
+#
+# This file is subject to the terms and conditions of the GNU General Public
+# License.  See the file COPYING in the main directory of the Linux
+# distribution for more details.
+#
+
+#
+# Sanitize checkpoint/restart kernel headers for userspace.
+#
+
+function usage()
+{
+	echo "Usage: $0 [-h|--help] -s|--kernel-src=DIR [-o|--output=DIR]"
+}
+
+OUTPUT_INCLUDES="include"
+OPTIONS=`getopt -o s:o:h --long kernel-src:,output:,help -- "$@"`
+eval set -- "${OPTIONS}"
+while true ; do
+	case "$1" in
+	-s|--kernel-src)
+		KERNELSRC="$2"
+		shift 2 ;;
+	-o|--output)
+		OUTPUT_INCLUDES="$2"
+		shift 2 ;;
+	-h|--help)
+		usage
+		exit 0 ;;
+	--)
+		shift
+		break ;;
+	*)
+		echo "Unknown option: $1"
+		shift
+		echo "Unparsed options: $@"
+		usage 1>&2
+		exit 2 ;;
+	esac
+done
+
+if [ -z "${KERNELSRC}" -o '!' -d "${KERNELSRC}" ]; then
+	usage 1>&2
+	exit 2
+fi
+
+#
+# Run the kernel header through cpp to strip out __KERNEL__ sections but try
+# to leave the rest untouched.
+#
+function do_cpp ()
+{
+	local CPP_FILE="$1"
+	local START_DEFINE="$2"
+	shift 2
+
+	#
+	# Hide #include directives then run cpp. Make cpp keep comments, not
+	# insert line numbers, avoid system/gcc/std defines, and only expand
+	# directives. Strip cpp output until we get to #define START_DEFINE,
+	# and collapse the excessive number of blank lines that cpp outputs
+	# in place of directives.
+	#
+	sed -e 's/#[[:space:]]*include[[:space:]]*\([<"][^">]*[">]\)/\/*#include \1*\//g' \
+		"${CPP_FILE}" | \
+	cpp -CC -P -U__KERNEL__ -undef -nostdinc \
+		-fdirectives-only -dDI "$@" | \
+	awk 'BEGIN { x = 0; }
+	     /#define '"${START_DEFINE}"'/  { x = 1; next; }
+		(x == 1) { print }' | cat -s | \
+	sed -e 's|/\*#include \([<"][^">]*[">]\)\*/|#include \1|g'
+	echo ''
+}
+
+# Map KARCH to something suitable for CPP e.g. __i386__
+function karch_to_cpparch ()
+{
+	local KARCH="$1"
+	local WORDBITS="$2"
+	shift 2;
+
+	case "${KARCH}" in
+	x86)	[ "${WORDBITS}" == "32" ] && echo -n "i386"
+		[ "${WORDBITS}" == "64" ] && echo -n "x86_64"
+		[ -z "${WORDBITS}" ]      && echo -n 'i386__ || __x86_64' # HACK
+		;;
+	s390*)	echo -n "s390x" ;;
+	*)	echo -n "${KARCH}" ;;
+	esac
+	return 0
+}
+
+set -e
+
+mkdir -p "${OUTPUT_INCLUDES}/linux"
+mkdir -p "${OUTPUT_INCLUDES}/asm"
+
+cat - > "${OUTPUT_INCLUDES}/linux/checkpoint.h" <<-EOFOFEE
+/*
+ * Generated by $(basename "$0").
+ */
+#ifndef _LINUX_CHECKPOINT_H_
+#define _LINUX_CHECKPOINT_H_
+EOFOFEE
+
+do_cpp "${KERNELSRC}/include/linux/checkpoint.h" "_LINUX_CHECKPOINT_H_" >> "${OUTPUT_INCLUDES}/linux/checkpoint.h"
+echo '#endif /* _LINUX_CHECKPOINT_H_ */' >> "${OUTPUT_INCLUDES}/linux/checkpoint.h"
+
+cat - > "${OUTPUT_INCLUDES}/linux/checkpoint_hdr.h" <<-EOFOO
+/*
+ * Generated by $(basename "$0").
+ */
+#ifndef _CHECKPOINT_CKPT_HDR_H_
+#define _CHECKPOINT_CKPT_HDR_H_
+EOFOO
+
+do_cpp "${KERNELSRC}/include/linux/checkpoint_hdr.h" "_CHECKPOINT_CKPT_HDR_H_" >> "${OUTPUT_INCLUDES}/linux/checkpoint_hdr.h"
+echo '#endif /* _CHECKPOINT_CKPT_HDR_H_ */' >> "${OUTPUT_INCLUDES}/linux/checkpoint_hdr.h"
+
+(
+#
+# We use ARCH_COND to break up architecture-specific sections of the header.
+#
+ARCH_COND='#if'
+REGEX='[[:space:]]*#[[:space:]]*define[[:space:]]+__NR_(checkpoint|restart)[[:space:]]+[[:digit:]]+'
+
+cat - <<-EOFOE
+#ifndef _CHECKPOINT_SYSCALLS_H_
+#define _CHECKPOINT_SYSCALLS_H_
+/*
+ * Generated by $(basename "$0").
+ */
+
+EOFOE
+
+find "${KERNELSRC}/arch" -name 'unistd*.h' -print | sort | \
+while read UNISTDH ; do
+	[ -n "${UNISTDH}" ] || continue
+	grep -q -E "${REGEX}" "${UNISTDH}" || continue
+
+	KARCH=$(echo "${UNISTDH}" | sed -e 's|.*/arch/\([^/]\+\)/.*|\1|')
+	WORDBITS=$(basename "${UNISTDH}" | sed -e 's/unistd_*\([[:digit:]]\+\)\.h/\1/')
+	CPPARCH="$(karch_to_cpparch "${KARCH}" "${WORDBITS}")"
+	echo -e "${ARCH_COND} __${CPPARCH}__\\n"
+	grep -E "${REGEX}" "${UNISTDH}" | \
+	sed -e 's/^[[:space:]]*#[[:space:]]*define[[:space:]]\+__NR_\([^[:space:]]\+\)[[:space:]]\+\([^[:space:]]\+\).*$/#\tifndef __NR_\1\n#\t\tdefine __NR_\1 \2\n#\tendif\n/'
+	ARCH_COND='#elif'
+done
+
+cat - <<-EOFOFOE
+#else
+#error "Architecture does not have definitions for __NR_(checkpoint|restart)"
+#endif
+#endif /* _CHECKPOINT_SYSCALLS_H_ */
+EOFOFOE
+
+) > "${OUTPUT_INCLUDES}/linux/checkpoint_syscalls.h"
+
+(
+ARCH_COND='#if'
+
+cat - <<-EOFOEOF
+/*
+ * Generated by $(basename "$0").
+ */
+#ifndef __ASM_CHECKPOINT_HDR_H_
+#define __ASM_CHECKPOINT_HDR_H_
+#include <sys/user.h>
+
+EOFOEOF
+
+find "${KERNELSRC}/arch" -name 'checkpoint_hdr.h' -print | sort | \
+while read ARCH_CHECKPOINT_HDR_H ; do
+	[ -n "${ARCH_CHECKPOINT_HDR_H}" ] || continue
+
+	KARCH=$(echo "${ARCH_CHECKPOINT_HDR_H}" | sed -e 's|.*/arch/\([^/]\+\)/.*|\1|')
+	CPPARCH="$(karch_to_cpparch "${KARCH}" "")"
+	echo -e "${ARCH_COND} __${CPPARCH}__\\n"
+	do_cpp "${KERNELSRC}/arch/${KARCH}/include/asm/checkpoint_hdr.h" '__ASM.*_CKPT_HDR_H' -D_CHECKPOINT_CKPT_HDR_H_
+	ARCH_COND='#elif'
+done
+
+cat - <<-FOEOEOF
+#else
+#error "Architecture does not have definitions needed for checkpoint images."
+#endif
+#endif /* __ASM_CHECKPOINT_HDR_H_ */
+FOEOEOF
+
+) > "${OUTPUT_INCLUDES}/asm/checkpoint_hdr.h"
diff --git a/self.c b/self.c
index 546fa04..a2f2fb9 100644
--- a/self.c
+++ b/self.c
@@ -16,6 +16,7 @@
 #include <math.h>
 #include <sys/syscall.h>
 
+#include <linux/checkpoint_syscalls.h>
 #include <linux/checkpoint.h>
 
 #define OUTFILE  "/tmp/cr-self.out"
-- 
1.5.6.3


* Re: [RFC][PATCH] user-cr: Extract kernel headers
       [not found] ` <20090817152403.GA11415-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
@ 2009-08-17 16:33   ` Matt Helsley
       [not found]     ` <20090817163356.GB11415-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
  2009-08-17 20:55   ` Oren Laadan
  1 sibling, 1 reply; 4+ messages in thread
From: Matt Helsley @ 2009-08-17 16:33 UTC
  To: Matt Helsley; +Cc: Containers

On Mon, Aug 17, 2009 at 08:24:03AM -0700, Matt Helsley wrote:
> Using kernel headers directly from userspace is strongly discouraged.
> This patch attempts to sanitize kernel headers for userspace by
> extracting non-__KERNEL__ portions of the various checkpoint headers
> and placing them in a similar organization of userspace headers.
> 
> The script is run from the top level of the user-cr source tree like:
> 
> 	./scripts/extract-headers.sh -s <path-to-kern-source> -o ./include
> 
> 
> The patch includes a copy of the auto-generated headers and adjusts
> the user-cr programs to use them.
> 
> Signed-off-by: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> 
> TODO: Builds on i386. Probably needs more testing, especially on
> 	other non-i386, non-32-bit platforms.

Argh. There is still one build problem that the script doesn't resolve. The
kernel headers contain:

#include <linux/socket.h>
#include <linux/un.h>

I think these need to be changed to use sys/ instead of linux/, but I
can't see a good way to do this without either hardcoding it into the
script or replacing _all_ "linux/" includes with "sys/" (I haven't checked
whether that would even work, much less whether it's a good solution).
It would be nice to know if anyone has preferences or knows kernel/user
header conventions I don't.
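
The hardcoding option could look roughly like the filter below. This is
only a sketch: map_user_includes is a made-up name, the two mappings are
just the includes listed above, and none of it is in the script as posted.

```sh
# Hypothetical helper, not part of extract-headers.sh as posted.
# Rewrites only the kernel includes listed here to their libc counterparts;
# every other line passes through unchanged.
map_user_includes() {
	sed -e 's|^\([[:space:]]*#[[:space:]]*include[[:space:]]*\)<linux/socket\.h>|\1<sys/socket.h>|' \
	    -e 's|^\([[:space:]]*#[[:space:]]*include[[:space:]]*\)<linux/un\.h>|\1<sys/un.h>|'
}

# Example: filter a few include lines through the helper.
printf '%s\n' \
	'#include <linux/types.h>' \
	'#include <linux/socket.h>' \
	'#include <linux/un.h>' | map_user_includes
```

A filter like this could presumably be appended to the end of the do_cpp
pipeline. The obvious cost is that the list has to be extended by hand
whenever the kernel headers pull in another linux/ header that userspace
should get from sys/ instead.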


> 	Look at merging checkpoint_syscalls.h with checkpoint.h
> 	Or at least find a better, shorter name for checkpoint_syscalls.h
> 
> NOTES: The script is much larger (2.5x) than for cr_tests because cr_tests
> 	only required the syscall numbers and a few flags for the syscalls.
> 
> 	The headers have a similar organization to the kernel headers
> 	because struct ckpt_hdr must be defined before the arch hdrs and 
> 	yet CKPT_ARCH_NSIG must be defined before the generic signal hdrs.
> 	Plus it's easier to avoid rewriting the paths within the include
> 	directories...
> 
> 	checkpoint_syscalls.h is a multi-arch file with all the syscall
> 	numbers normally found in the arch's unistd.h. I chose to use a
> 	different name to avoid clashes with /usr/include headers.
> ---
>  Makefile                            |   31 +--
>  ckpt.c                              |    1 +
>  ckptinfo.c                          |    2 +
>  ckptinfo.py                         |    2 +
>  include/asm/checkpoint_hdr.h        |  205 ++++++++++++
>  include/linux/checkpoint.h          |   26 ++
>  include/linux/checkpoint_hdr.h      |  580 +++++++++++++++++++++++++++++++++++
>  include/linux/checkpoint_syscalls.h |   30 ++
>  mktree.c                            |    4 +
>  rstr.c                              |    1 +
>  scripts/extract-headers.sh          |  194 ++++++++++++
>  self.c                              |    1 +
>  12 files changed, 1056 insertions(+), 21 deletions(-)
>  create mode 100644 include/asm/checkpoint_hdr.h
>  create mode 100644 include/linux/checkpoint.h
>  create mode 100644 include/linux/checkpoint_hdr.h
>  create mode 100644 include/linux/checkpoint_syscalls.h
>  create mode 100755 scripts/extract-headers.sh
> 
> diff --git a/Makefile b/Makefile
> index afff3f5..509a2a4 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1,24 +1,5 @@
> -
> -KERNELSRC ?= ../linux
> -KERNELBUILD ?= ../linux
> -
> -# default with 'make headers_install'
> -KERNELHDR ?= $(KERNELSRC)/usr/include
> -
> -ifneq "$(realpath $(KERNELHDR)/linux/checkpoint.h)" ""
> -# if .../usr/include contains our headers
> -CKPT_INCLUDE = -I$(KERNELHDR)
> -CKPT_HEADERS = $(KERNELHDR)/linux/checkpoint_hdr.h \
> -	       $(KERNELHDR)/asm/checkpoint_hdr.h
> -else
> -# else, usr the kernel source itself
> -# but first, find linux architecure
> -KERN_ARCH = $(shell readlink $(KERNELBUILD)/include/asm | sed 's/^asm-//')
> -CKPT_INCLUDE = -I$(KERNELSRC)/include \
> -	       -I$(KERNELSRC)/arch/$(KERN_ARCH)/include
> -CKPT_HEADERS = $(KERNELSRC)/include/linux/checkpoint_hdr.h \
> -	       $(KERNELSRC)/arch/$(KERN_ARCH)/include/asm/checkpoint_hdr.h
> -endif
> +CKPT_INCLUDE = -I./include
> +CKPT_HEADERS = $(shell find ./include -name '*.h')
> 
>  # compile with debug ?
>  DEBUG = -DCHECKPOINT_DEBUG
> @@ -39,6 +20,8 @@ OTHER = ckptinfo_types.c
> 
>  LDLIBS = -lm
> 
> +.PHONY: all distclean clean headers install
> +
>  all: $(PROGS)
>  	@make -C test
> 
> @@ -56,10 +39,16 @@ ckptinfo_types.c: $(CKPT_HEADERS) ckptinfo.py
> 
>  %.o:	%.c
> 
> +headers:
> +	./scripts/extract-headers.sh -s ../linux-2.6.git
> +
>  install:
>  	@echo /usr/bin/install -m 755 mktree ckpt rstr ckptinfo $(INSTALL_DIR)
>  	@/usr/bin/install -m 755 mktree ckpt rstr ckptinfo $(INSTALL_DIR)
> 
> +distclean: clean
> +	@rm -f $(CKPT_HEADERS)
> +
>  clean:
>  	@rm -f $(PROGS) $(OTHER) *~ *.o
>  	@make -C test clean
> diff --git a/ckpt.c b/ckpt.c
> index b3f8dea..fb853f9 100644
> --- a/ckpt.c
> +++ b/ckpt.c
> @@ -16,6 +16,7 @@
>  #include <unistd.h>
>  #include <sys/syscall.h>
> 
> +#include <linux/checkpoint_syscalls.h>
>  #include <linux/checkpoint.h>
> 
>  static char usage_str[] =
> diff --git a/ckptinfo.c b/ckptinfo.c
> index 962a62f..2b8aaef 100644
> --- a/ckptinfo.c
> +++ b/ckptinfo.c
> @@ -17,6 +17,8 @@
>  #include <sys/types.h>
>  #include <sys/stat.h>
>  #include <fcntl.h>
> +#include <sys/socket.h>
> +#include <sys/un.h>
> 
>  #include <linux/checkpoint_hdr.h>
>  #include <asm/checkpoint_hdr.h>
> diff --git a/ckptinfo.py b/ckptinfo.py
> index ae7e5da..e0959f1 100755
> --- a/ckptinfo.py
> +++ b/ckptinfo.py
> @@ -78,6 +78,8 @@ print """
>   * This file is auto-generated by ckptinfo.py
>   */
> 
> +#include <sys/socket.h>
> +#include <sys/un.h>
>  #include <linux/checkpoint_hdr.h>
>  """
> 
> diff --git a/include/asm/checkpoint_hdr.h b/include/asm/checkpoint_hdr.h
> new file mode 100644
> index 0000000..57f3635
> --- /dev/null
> +++ b/include/asm/checkpoint_hdr.h
> @@ -0,0 +1,205 @@
> +/*
> + * Generated by extract-headers.sh.
> + */
> +#ifndef __ASM_CHECKPOINT_HDR_H_
> +#define __ASM_CHECKPOINT_HDR_H_
> +#include <sys/user.h>
> +
> +#if __s390x__
> +
> +/*
> + *  Checkpoint/restart - architecture specific headers s/390
> + *
> + *  Copyright IBM Corp. 2009
> + *
> + *  This file is subject to the terms and conditions of the GNU General Public
> + *  License.  See the file COPYING in the main directory of the Linux
> + *  distribution for more details.
> + */
> +
> +#include <linux/types.h>
> +#include <asm/ptrace.h>
> +
> +/*
> + * Notes
> + * NUM_GPRS defined in <asm/ptrace.h> to be 16
> + * NUM_FPRS defined in <asm/ptrace.h> to be 16
> + * NUM_APRS defined in <asm/ptrace.h> to be 16
> + * NUM_CR_WORDS defined in <asm/ptrace.h> to be 3
> + */
> +struct ckpt_hdr_cpu {
> +	struct ckpt_hdr h;
> +	__u64 args[1];
> +	__u64 gprs[NUM_GPRS];
> +	__u64 orig_gpr2;
> +	__u16 svcnr;
> +	__u16 ilc;
> +	__u32 acrs[NUM_ACRS];
> +	__u64 ieee_instruction_pointer;
> +
> +	/* psw_t */
> +	__u64 psw_t_mask;
> +	__u64 psw_t_addr;
> +
> +	/* s390_fp_regs_t */
> +	__u32 fpc;
> +	union {
> +		float f;
> +		double d;
> +		__u64 ui;
> +		struct {
> +			__u32 fp_hi;
> +			__u32 fp_lo;
> +		} fp;
> +	} fprs[NUM_FPRS];
> +
> +	/* per_struct */
> +	__u64 per_control_regs[NUM_CR_WORDS];
> +	__u64 starting_addr;
> +	__u64 ending_addr;
> +	__u64 address;
> +	__u16 perc_atmid;
> +	__u8 access_id;
> +	__u8 single_step;
> +	__u8 instruction_fetch;
> +};
> +
> +struct ckpt_hdr_mm_context {
> +	struct ckpt_hdr h;
> +	unsigned long vdso_base;
> +	int noexec;
> +	int has_pgste;
> +	int alloc_pgste;
> +	unsigned long asce_bits;
> +	unsigned long asce_limit;
> +};
> +
> +#define CKPT_ARCH_NSIG 64
> +
> +struct ckpt_hdr_header_arch {
> +	struct ckpt_hdr h;
> +};
> +
> +
> +#elif __i386__ || __x86_64__
> +
> +/*
> + *  Checkpoint/restart - architecture specific headers x86
> + *
> + *  Copyright (C) 2008-2009 Oren Laadan
> + *
> + *  This file is subject to the terms and conditions of the GNU General Public
> + *  License.  See the file COPYING in the main directory of the Linux
> + *  distribution for more details.
> + */
> +
> +#include <linux/types.h>
> +
> +/*
> + * To maintain compatibility between 32-bit and 64-bit architecture flavors,
> + * keep data 64-bit aligned: use padding for structure members, and use
> + * __attribute__((aligned (8))) for the entire structure.
> + *
> + * Quoting Arnd Bergmann:
> + *   "This structure has an odd multiple of 32-bit members, which means
> + *   that if you put it into a larger structure that also contains 64-bit
> + *   members, the larger structure may get different alignment on x86-32
> + *   and x86-64, which you might want to avoid. I can't tell if this is
> + *   an actual problem here. ... In this case, I'm pretty sure that
> + *   sizeof(ckpt_hdr_task) on x86-32 is different from x86-64, since it
> + *   will be 32-bit aligned on x86-32."
> + */
> +
> +/* i387 structure seen from kernel/userspace */
> +
> +/* arch dependent header types */
> +enum {
> +	CKPT_HDR_CPU_FPU = 201,
> +	CKPT_HDR_MM_CONTEXT_LDT,
> +};
> +
> +#define CKPT_ARCH_NSIG 64
> +
> +struct ckpt_hdr_header_arch {
> +	struct ckpt_hdr h;
> +	/* FIXME: add HAVE_HWFP */
> +	__u16 has_fxsr;
> +	__u16 has_xsave;
> +	__u16 xstate_size;
> +	__u16 _pading;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_thread {
> +	struct ckpt_hdr h;
> +	__u32 thread_info_flags;
> +	__u16 gdt_entry_tls_entries;
> +	__u16 sizeof_tls_array;
> +} __attribute__((aligned(8)));
> +
> +/* designed to work for both x86_32 and x86_64 */
> +struct ckpt_hdr_cpu {
> +	struct ckpt_hdr h;
> +	/* see struct pt_regs (x86_64) */
> +	__u64 r15;
> +	__u64 r14;
> +	__u64 r13;
> +	__u64 r12;
> +	__u64 bp;
> +	__u64 bx;
> +	__u64 r11;
> +	__u64 r10;
> +	__u64 r9;
> +	__u64 r8;
> +	__u64 ax;
> +	__u64 cx;
> +	__u64 dx;
> +	__u64 si;
> +	__u64 di;
> +	__u64 orig_ax;
> +	__u64 ip;
> +	__u64 sp;
> +
> +	__u64 flags;
> +
> +	/* segment registers */
> +	__u64 fs;
> +	__u64 gs;
> +
> +	__u16 fsindex;
> +	__u16 gsindex;
> +	__u16 cs;
> +	__u16 ss;
> +	__u16 ds;
> +	__u16 es;
> +
> +	__u32 used_math;
> +
> +	/* debug registers */
> +	__u64 debugreg0;
> +	__u64 debugreg1;
> +	__u64 debugreg2;
> +	__u64 debugreg3;
> +	__u64 debugreg6;
> +	__u64 debugreg7;
> +
> +	/* thread_xstate contents follow (if used_math) */
> +} __attribute__((aligned(8)));
> +
> +#define CKPT_X86_SEG_NULL 0
> +#define CKPT_X86_SEG_USER32_CS 1
> +#define CKPT_X86_SEG_USER32_DS 2
> +#define CKPT_X86_SEG_TLS 0x4000 /* 0100 0000 0000 00xx */
> +#define CKPT_X86_SEG_LDT 0x8000 /* 100x xxxx xxxx xxxx */
> +
> +struct ckpt_hdr_mm_context {
> +	struct ckpt_hdr h;
> +	__u64 vdso;
> +	__u32 ldt_entry_size;
> +	__u32 nldt;
> +} __attribute__((aligned(8)));
> +
> +
> +#else
> +#error "Architecture does not have definitions needed for checkpoint images."
> +#endif
> +#endif /* __ASM_CHECKPOINT_HDR_H_ */
> diff --git a/include/linux/checkpoint.h b/include/linux/checkpoint.h
> new file mode 100644
> index 0000000..db6c6a4
> --- /dev/null
> +++ b/include/linux/checkpoint.h
> @@ -0,0 +1,26 @@
> +/*
> + * Generated by extract-headers.sh.
> + */
> +#ifndef _LINUX_CHECKPOINT_H_
> +#define _LINUX_CHECKPOINT_H_
> +/*
> + *  Generic checkpoint-restart
> + *
> + *  Copyright (C) 2008-2009 Oren Laadan
> + *
> + *  This file is subject to the terms and conditions of the GNU General Public
> + *  License.  See the file COPYING in the main directory of the Linux
> + *  distribution for more details.
> + */
> +
> +#define CHECKPOINT_VERSION 1
> +
> +/* checkpoint user flags */
> +#define CHECKPOINT_SUBTREE 0x1
> +
> +/* restart user flags */
> +#define RESTART_TASKSELF 0x1
> +#define RESTART_FROZEN 0x2
> +
> +
> +#endif /* _LINUX_CHECKPOINT_H_ */
> diff --git a/include/linux/checkpoint_hdr.h b/include/linux/checkpoint_hdr.h
> new file mode 100644
> index 0000000..a6f4a57
> --- /dev/null
> +++ b/include/linux/checkpoint_hdr.h
> @@ -0,0 +1,580 @@
> +/*
> + * Generated by extract-headers.sh.
> + */
> +#ifndef _CHECKPOINT_CKPT_HDR_H_
> +#define _CHECKPOINT_CKPT_HDR_H_
> +/*
> + *  Generic container checkpoint-restart
> + *
> + *  Copyright (C) 2008-2009 Oren Laadan
> + *
> + *  This file is subject to the terms and conditions of the GNU General Public
> + *  License.  See the file COPYING in the main directory of the Linux
> + *  distribution for more details.
> + */
> +
> +#include <linux/types.h>
> +#include <linux/utsname.h>
> +
> +/*
> + * To maintain compatibility between 32-bit and 64-bit architecture flavors,
> + * keep data 64-bit aligned: use padding for structure members, and use
> + * __attribute__((aligned (8))) for the entire structure.
> + *
> + * Quoting Arnd Bergmann:
> + *   "This structure has an odd multiple of 32-bit members, which means
> + *   that if you put it into a larger structure that also contains 64-bit
> + *   members, the larger structure may get different alignment on x86-32
> + *   and x86-64, which you might want to avoid. I can't tell if this is
> + *   an actual problem here. ... In this case, I'm pretty sure that
> + *   sizeof(ckpt_hdr_task) on x86-32 is different from x86-64, since it
> + *   will be 32-bit aligned on x86-32."
> + */
> +
> +/*
> + * header format: 'struct ckpt_hdr' must prefix all other headers. Therefore
> + * when a header is passed around, the information about it (type, size)
> + * is readily available.
> + */
> +struct ckpt_hdr {
> +	__u32 type;
> +	__u32 len;
> +} __attribute__((aligned(8)));
> +
> +#include <asm/checkpoint_hdr.h>
> +
> +/* header types */
> +enum {
> +	CKPT_HDR_HEADER = 1,
> +	CKPT_HDR_HEADER_ARCH,
> +	CKPT_HDR_BUFFER,
> +	CKPT_HDR_STRING,
> +	CKPT_HDR_OBJREF,
> +
> +	CKPT_HDR_TREE = 101,
> +	CKPT_HDR_TASK,
> +	CKPT_HDR_TASK_NS,
> +	CKPT_HDR_TASK_OBJS,
> +	CKPT_HDR_RESTART_BLOCK,
> +	CKPT_HDR_THREAD,
> +	CKPT_HDR_CPU,
> +	CKPT_HDR_NS,
> +	CKPT_HDR_UTS_NS,
> +	CKPT_HDR_IPC_NS,
> +	CKPT_HDR_CAPABILITIES,
> +	CKPT_HDR_USER_NS,
> +	CKPT_HDR_CRED,
> +	CKPT_HDR_USER,
> +	CKPT_HDR_GROUPINFO,
> +	CKPT_HDR_TASK_CREDS,
> +
> +	/* 201-299: reserved for arch-dependent */
> +
> +	CKPT_HDR_FILE_TABLE = 301,
> +	CKPT_HDR_FILE_DESC,
> +	CKPT_HDR_FILE_NAME,
> +	CKPT_HDR_FILE,
> +	CKPT_HDR_PIPE_BUF,
> +
> +	CKPT_HDR_MM = 401,
> +	CKPT_HDR_VMA,
> +	CKPT_HDR_PGARR,
> +	CKPT_HDR_MM_CONTEXT,
> +
> +	CKPT_HDR_IPC = 501,
> +	CKPT_HDR_IPC_SHM,
> +	CKPT_HDR_IPC_MSG,
> +	CKPT_HDR_IPC_MSG_MSG,
> +	CKPT_HDR_IPC_SEM,
> +
> +	CKPT_HDR_SIGHAND = 601,
> +
> +	CKPT_HDR_FD_SOCKET = 701,
> +	CKPT_HDR_SOCKET,
> +	CKPT_HDR_SOCKET_QUEUE,
> +	CKPT_HDR_SOCKET_BUFFER,
> +	CKPT_HDR_SOCKET_UNIX,
> +
> +	CKPT_HDR_TAIL = 9001,
> +
> +	CKPT_HDR_ERROR = 9999,
> +};
> +
> +/* architecture */
> +enum {
> +	/* do not change order (will break ABI) */
> +	CKPT_ARCH_X86_32 = 1,
> +	CKPT_ARCH_S390X,
> +};
> +
> +/* shared objects (objref) */
> +struct ckpt_hdr_objref {
> +	struct ckpt_hdr h;
> +	__u32 objtype;
> +	__s32 objref;
> +} __attribute__((aligned(8)));
> +
> +/* shared objects types */
> +enum obj_type {
> +	CKPT_OBJ_IGNORE = 0,
> +	CKPT_OBJ_INODE,
> +	CKPT_OBJ_FILE_TABLE,
> +	CKPT_OBJ_FILE,
> +	CKPT_OBJ_MM,
> +	CKPT_OBJ_SIGHAND,
> +	CKPT_OBJ_NS,
> +	CKPT_OBJ_UTS_NS,
> +	CKPT_OBJ_IPC_NS,
> +	CKPT_OBJ_USER_NS,
> +	CKPT_OBJ_CRED,
> +	CKPT_OBJ_USER,
> +	CKPT_OBJ_GROUPINFO,
> +	CKPT_OBJ_SOCK,
> +	CKPT_OBJ_MAX
> +};
> +
> +/* kernel constants */
> +struct ckpt_hdr_const {
> +	/* task */
> +	__u16 task_comm_len;
> +	/* mm */
> +	__u16 mm_saved_auxv_len;
> +	/* signal */
> +	__u16 signal_nsig;
> +	/* uts */
> +	__u16 uts_sysname_len;
> +	__u16 uts_nodename_len;
> +	__u16 uts_release_len;
> +	__u16 uts_version_len;
> +	__u16 uts_machine_len;
> +	__u16 uts_domainname_len;
> +} __attribute__((aligned(8)));
> +
> +/* checkpoint image header */
> +struct ckpt_hdr_header {
> +	struct ckpt_hdr h;
> +	__u64 magic;
> +
> +	__u16 arch_id;
> +
> +	__u16 major;
> +	__u16 minor;
> +	__u16 patch;
> +	__u16 rev;
> +
> +	struct ckpt_hdr_const constants;
> +
> +	__u64 time;	/* when checkpoint taken */
> +	__u64 uflags;	/* uflags from checkpoint */
> +
> +	/*
> +	 * the header is followed by three strings:
> +	 *   char release[const.uts_release_len];
> +	 *   char version[const.uts_version_len];
> +	 *   char machine[const.uts_machine_len];
> +	 */
> +} __attribute__((aligned(8)));
> +
> +/* checkpoint image trailer */
> +struct ckpt_hdr_tail {
> +	struct ckpt_hdr h;
> +	__u64 magic;
> +} __attribute__((aligned(8)));
> +
> +/* task tree */
> +struct ckpt_hdr_tree {
> +	struct ckpt_hdr h;
> +	__s32 nr_tasks;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_pids {
> +	__s32 vpid;
> +	__s32 vppid;
> +	__s32 vtgid;
> +	__s32 vpgid;
> +	__s32 vsid;
> +} __attribute__((aligned(8)));
> +
> +/* task data */
> +struct ckpt_hdr_task {
> +	struct ckpt_hdr h;
> +	__u32 state;
> +	__u32 exit_state;
> +	__u32 exit_code;
> +	__u32 exit_signal;
> +	__u32 pdeath_signal;
> +
> +	__u64 set_child_tid;
> +	__u64 clear_child_tid;
> +
> +	__u32 compat_robust_futex_head_len;
> +	__u32 compat_robust_futex_list; /* a compat __user ptr */
> +	__u32 robust_futex_head_len;
> +	__u64 robust_futex_list; /* a __user ptr */
> +
> +} __attribute__((aligned(8)));
> +
> +/* Posix capabilities */
> +struct ckpt_capabilities {
> +	__u32 cap_i_0, cap_i_1; /* inheritable set */
> +	__u32 cap_p_0, cap_p_1; /* permitted set */
> +	__u32 cap_e_0, cap_e_1; /* effective set */
> +	__u32 cap_b_0, cap_b_1; /* bounding set */
> +	__u32 securebits;
> +	__u32 padding;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_task_creds {
> +	struct ckpt_hdr h;
> +	__s32 cred_ref;
> +	__s32 ecred_ref;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_cred {
> +	struct ckpt_hdr h;
> +	__u32 uid, suid, euid, fsuid;
> +	__u32 gid, sgid, egid, fsgid;
> +	__s32 user_ref;
> +	__s32 groupinfo_ref;
> +	struct ckpt_capabilities cap_s;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_groupinfo {
> +	struct ckpt_hdr h;
> +	__u32 ngroups;
> +	/*
> +	 * This is followed by ngroups __u32s
> +	 */
> +	__u32 groups[0];
> +} __attribute__((aligned(8)));
> +
> +/*
> + * todo - keyrings and LSM
> + * These may be better done with userspace help though
> + */
> +struct ckpt_hdr_user_struct {
> +	struct ckpt_hdr h;
> +	__u32 uid;
> +	__s32 userns_ref;
> +} __attribute__((aligned(8)));
> +
> +/*
> + * The user-struct mostly tracks system resource usage.
> + * Most of its contents therefore will simply be set
> + * correctly as restart opens resources
> + */
> +struct ckpt_hdr_user_ns {
> +	struct ckpt_hdr h;
> +	__s32 creator_ref;
> +} __attribute__((aligned(8)));
> +
> +/* namespaces */
> +struct ckpt_hdr_task_ns {
> +	struct ckpt_hdr h;
> +	__s32 ns_objref;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_ns {
> +	struct ckpt_hdr h;
> +	__s32 uts_objref;
> +	__u32 ipc_objref;
> +} __attribute__((aligned(8)));
> +
> +/* task's shared resources */
> +struct ckpt_hdr_task_objs {
> +	struct ckpt_hdr h;
> +
> +	__s32 files_objref;
> +	__s32 mm_objref;
> +	__s32 sighand_objref;
> +} __attribute__((aligned(8)));
> +
> +/* restart blocks */
> +struct ckpt_hdr_restart_block {
> +	struct ckpt_hdr h;
> +	__u64 function_type;
> +	__u64 arg_0;
> +	__u64 arg_1;
> +	__u64 arg_2;
> +	__u64 arg_3;
> +	__u64 arg_4;
> +} __attribute__((aligned(8)));
> +
> +enum restart_block_type {
> +	CKPT_RESTART_BLOCK_NONE = 1,
> +	CKPT_RESTART_BLOCK_HRTIMER_NANOSLEEP,
> +	CKPT_RESTART_BLOCK_POSIX_CPU_NANOSLEEP,
> +	CKPT_RESTART_BLOCK_COMPAT_NANOSLEEP,
> +	CKPT_RESTART_BLOCK_COMPAT_CLOCK_NANOSLEEP,
> +	CKPT_RESTART_BLOCK_POLL,
> +	CKPT_RESTART_BLOCK_FUTEX
> +};
> +
> +/* file system */
> +struct ckpt_hdr_file_table {
> +	struct ckpt_hdr h;
> +	__s32 fdt_nfds;
> +} __attribute__((aligned(8)));
> +
> +/* file descriptors */
> +struct ckpt_hdr_file_desc {
> +	struct ckpt_hdr h;
> +	__s32 fd_objref;
> +	__s32 fd_descriptor;
> +	__u32 fd_close_on_exec;
> +} __attribute__((aligned(8)));
> +
> +enum file_type {
> +	CKPT_FILE_IGNORE = 0,
> +	CKPT_FILE_GENERIC,
> +	CKPT_FILE_PIPE,
> +	CKPT_FILE_FIFO,
> +	CKPT_FILE_SOCKET,
> +	CKPT_FILE_MAX
> +};
> +
> +/* file objects */
> +struct ckpt_hdr_file {
> +	struct ckpt_hdr h;
> +	__u32 f_type;
> +	__u32 f_mode;
> +	__u32 f_flags;
> +	__s32 f_credref;
> +	__u64 f_pos;
> +	__u64 f_version;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_file_generic {
> +	struct ckpt_hdr_file common;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_file_pipe {
> +	struct ckpt_hdr_file common;
> +	__s32 pipe_objref;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_file_socket {
> +	struct ckpt_hdr_file common;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_utsns {
> +	struct ckpt_hdr h;
> +	char sysname[__NEW_UTS_LEN + 1];
> +	char nodename[__NEW_UTS_LEN + 1];
> +	char release[__NEW_UTS_LEN + 1];
> +	char version[__NEW_UTS_LEN + 1];
> +	char machine[__NEW_UTS_LEN + 1];
> +	char domainname[__NEW_UTS_LEN + 1];
> +} __attribute__((aligned(8)));
> +
> +/* memory layout */
> +struct ckpt_hdr_mm {
> +	struct ckpt_hdr h;
> +	__u32 map_count;
> +	__s32 exe_objref;
> +
> +	__u64 def_flags;
> +	__u64 flags;
> +
> +	__u64 start_code, end_code, start_data, end_data;
> +	__u64 start_brk, brk, start_stack;
> +	__u64 arg_start, arg_end, env_start, env_end;
> +} __attribute__((aligned(8)));
> +
> +/* vma subtypes - index into restore_vma_dispatch[] */
> +enum vma_type {
> +	CKPT_VMA_IGNORE = 0,
> +	CKPT_VMA_VDSO,		/* special vdso vma */
> +	CKPT_VMA_ANON,		/* private anonymous */
> +	CKPT_VMA_FILE,		/* private mapped file */
> +	CKPT_VMA_SHM_ANON,	/* shared anonymous */
> +	CKPT_VMA_SHM_ANON_SKIP,	/* shared anonymous (skip contents) */
> +	CKPT_VMA_SHM_FILE,	/* shared mapped file, only msync */
> +	CKPT_VMA_SHM_IPC,	/* shared sysvipc */
> +	CKPT_VMA_SHM_IPC_SKIP,	/* shared sysvipc (skip contents) */
> +	CKPT_VMA_MAX,
> +};
> +
> +/* vma descriptor */
> +struct ckpt_hdr_vma {
> +	struct ckpt_hdr h;
> +	__u32 vma_type;
> +	__s32 vma_objref;	/* objref of backing file */
> +	__s32 ino_objref;	/* objref of shared segment */
> +	__u32 _padding;
> +	__u64 ino_size;		/* size of shared segment */
> +
> +	__u64 vm_start;
> +	__u64 vm_end;
> +	__u64 vm_page_prot;
> +	__u64 vm_flags;
> +	__u64 vm_pgoff;
> +} __attribute__((aligned(8)));
> +
> +/* page array */
> +struct ckpt_hdr_pgarr {
> +	struct ckpt_hdr h;
> +	__u64 nr_pages;		/* number of pages saved */
> +} __attribute__((aligned(8)));
> +
> +/* signals */
> +struct ckpt_hdr_sigset {
> +	__u8 sigset[CKPT_ARCH_NSIG / 8];
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_sigaction {
> +	__u64 _sa_handler;
> +	__u64 sa_flags;
> +	__u64 sa_restorer;
> +	struct ckpt_hdr_sigset sa_mask;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_sighand {
> +	struct ckpt_hdr h;
> +	struct ckpt_hdr_sigaction action[0];
> +} __attribute__((aligned(8)));
> +
> +/* ipc commons */
> +struct ckpt_hdr_ipcns {
> +	struct ckpt_hdr h;
> +	__u64 shm_ctlmax;
> +	__u64 shm_ctlall;
> +	__s32 shm_ctlmni;
> +
> +	__s32 msg_ctlmax;
> +	__s32 msg_ctlmnb;
> +	__s32 msg_ctlmni;
> +
> +	__s32 sem_ctl_msl;
> +	__s32 sem_ctl_mns;
> +	__s32 sem_ctl_opm;
> +	__s32 sem_ctl_mni;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_ipc {
> +	struct ckpt_hdr h;
> +	__u32 ipc_type;
> +	__u32 ipc_count;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_ipc_perms {
> +	struct ckpt_hdr h;
> +	__s32 id;
> +	__u32 key;
> +	__u32 uid;
> +	__u32 gid;
> +	__u32 cuid;
> +	__u32 cgid;
> +	__u32 mode;
> +	__u32 _padding;
> +	__u64 seq;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_ipc_shm {
> +	struct ckpt_hdr h;
> +	struct ckpt_hdr_ipc_perms perms;
> +	__u64 shm_segsz;
> +	__u64 shm_atim;
> +	__u64 shm_dtim;
> +	__u64 shm_ctim;
> +	__s32 shm_cprid;
> +	__s32 shm_lprid;
> +	__u32 mlock_uid;
> +	__u32 flags;
> +	__u32 objref;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_ipc_msg {
> +	struct ckpt_hdr h;
> +	struct ckpt_hdr_ipc_perms perms;
> +	__u64 q_stime;
> +	__u64 q_rtime;
> +	__u64 q_ctime;
> +	__u64 q_cbytes;
> +	__u64 q_qnum;
> +	__u64 q_qbytes;
> +	__s32 q_lspid;
> +	__s32 q_lrpid;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_ipc_msg_msg {
> +	struct ckpt_hdr h;
> +	__s32 m_type;
> +	__u32 m_ts;
> +} __attribute__((aligned(8)));
> +
> +struct ckpt_hdr_ipc_sem {
> +	struct ckpt_hdr h;
> +	struct ckpt_hdr_ipc_perms perms;
> +	__u64 sem_otime;
> +	__u64 sem_ctime;
> +	__u32 sem_nsems;
> +} __attribute__((aligned(8)));
> +
> +#define CKPT_UNIX_LINKED 1
> +struct ckpt_hdr_socket_unix {
> +	struct ckpt_hdr h;
> +	__s32 this;
> +	__s32 peer;
> +	__u32 flags;
> +	__u32 laddr_len;
> +	__u32 raddr_len;
> +	struct sockaddr_un laddr;
> +	struct sockaddr_un raddr;
> +} __attribute__ ((aligned(8)));
> +
> +struct ckpt_hdr_socket {
> +	struct ckpt_hdr h;
> +
> +	struct { /* struct socket */
> +		__u64 flags;
> +		__u8 state;
> +	} socket __attribute__ ((aligned(8)));
> +
> +	struct { /* struct sock_common */
> +		__u32 bound_dev_if;
> +		__u32 reuse;
> +		__u16 family;
> +		__u8 state;
> +	} sock_common __attribute__ ((aligned(8)));
> +
> +	struct { /* struct sock */
> +		__s64 rcvlowat;
> +		__u64 flags;
> +
> +		__u32 err;
> +		__u32 err_soft;
> +		__u32 priority;
> +		__s32 rcvbuf;
> +		__s32 sndbuf;
> +		__u16 type;
> +		__s16 backlog;
> +
> +		__u8 protocol;
> +		__u8 state;
> +		__u8 shutdown;
> +		__u8 userlocks;
> +		__u8 no_check;
> +
> +		struct linger linger;
> +		struct timeval rcvtimeo;
> +		struct timeval sndtimeo;
> +
> +	} sock __attribute__ ((aligned(8)));
> +
> +} __attribute__ ((aligned(8)));
> +
> +struct ckpt_hdr_socket_queue {
> +	struct ckpt_hdr h;
> +	__u32 skb_count;
> +	__u32 total_bytes;
> +} __attribute__ ((aligned(8)));
> +
> +#define CKPT_TST_OVERFLOW_16(a,b) ((sizeof(a) > sizeof(b)) && ((a) > SHORT_MAX))
> +
> +#define CKPT_TST_OVERFLOW_32(a,b) ((sizeof(a) > sizeof(b)) && ((a) > INT_MAX))
> +
> +#define CKPT_TST_OVERFLOW_64(a,b) ((sizeof(a) > sizeof(b)) && ((a) > LONG_MAX))
> +
> +
> +#endif /* _CHECKPOINT_CKPT_HDR_H_ */
> diff --git a/include/linux/checkpoint_syscalls.h b/include/linux/checkpoint_syscalls.h
> new file mode 100644
> index 0000000..2374bbc
> --- /dev/null
> +++ b/include/linux/checkpoint_syscalls.h
> @@ -0,0 +1,30 @@
> +#ifndef _CHECKPOINT_SYSCALLS_H_
> +#define _CHECKPOINT_SYSCALLS_H_
> +/*
> + * Generated by extract-headers.sh.
> + */
> +
> +#if __s390x__
> +
> +#	ifndef __NR_checkpoint
> +#		define __NR_checkpoint 332
> +#	endif
> +
> +#	ifndef __NR_restart
> +#		define __NR_restart 333
> +#	endif
> +
> +#elif __i386__
> +
> +#	ifndef __NR_checkpoint
> +#		define __NR_checkpoint 338
> +#	endif
> +
> +#	ifndef __NR_restart
> +#		define __NR_restart 339
> +#	endif
> +
> +#else
> +#error "Architecture does not have definitions for __NR_(checkpoint|restart)"
> +#endif
> +#endif /* _CHECKPOINT_SYSCALLS_H_ */
> diff --git a/mktree.c b/mktree.c
> index e42407f..1f9e3d3 100644
> --- a/mktree.c
> +++ b/mktree.c
> @@ -29,8 +29,12 @@
>  #include <asm/unistd.h>
>  #include <sys/syscall.h>
>  #include <sys/prctl.h>
> +#include <sys/socket.h>
> +#include <sys/un.h>
> 
>  #include <linux/sched.h>
> +
> +#include <linux/checkpoint_syscalls.h>
>  #include <linux/checkpoint.h>
>  #include <linux/checkpoint_hdr.h>
> 
> diff --git a/rstr.c b/rstr.c
> index 9199cfe..8cd7a49 100644
> --- a/rstr.c
> +++ b/rstr.c
> @@ -18,6 +18,7 @@
>  #include <asm/unistd.h>
>  #include <sys/syscall.h>
> 
> +#include <linux/checkpoint_syscalls.h>
>  #include <linux/checkpoint.h>
> 
>  int main(int argc, char *argv[])
> diff --git a/scripts/extract-headers.sh b/scripts/extract-headers.sh
> new file mode 100755
> index 0000000..9e6ea4a
> --- /dev/null
> +++ b/scripts/extract-headers.sh
> @@ -0,0 +1,194 @@
> +#!/bin/bash
> +#
> +# Copyright (C) 2009 IBM Corp.
> +# Author: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> +#
> +# This file is subject to the terms and conditions of the GNU General Public
> +# License.  See the file COPYING in the main directory of the Linux
> +# distribution for more details.
> +#
> +
> +#
> +# Sanitize checkpoint/restart kernel headers for userspace.
> +#
> +
> +function usage()
> +{
> +	echo "Usage: $0 [-h|--help] -s|--kernel-src=DIR [-o|--output=DIR]"
> +}
> +
> +OUTPUT_INCLUDES="include"
> +OPTIONS=`getopt -o s:o:h --long kernel-src:,output:,help -- "$@"`
> +eval set -- "${OPTIONS}"
> +while true ; do
> +	case "$1" in
> +	-s|--kernel-src)
> +		KERNELSRC="$2"
> +		shift 2 ;;
> +	-o|--output)
> +		OUTPUT_INCLUDES="$2"
> +		shift 2 ;;
> +	-h|--help)
> +		usage
> +		exit 0 ;;
> +	--)
> +		shift
> +		break ;;
> +	*)
> +		echo "Unknown option: $1"
> +		shift
> +		echo "Unparsed options: $@"
> +		usage 1>&2
> +		exit 2 ;;
> +	esac
> +done
> +
> +if [ -z "${KERNELSRC}" -o '!' -d "${KERNELSRC}" ]; then
> +	usage 1>&2
> +	exit 2
> +fi
> +
> +#
> +# Run the kernel header through cpp to strip out __KERNEL__ sections but try
> +# to leave the rest untouched.
> +#
> +function do_cpp ()
> +{
> +	local CPP_FILE="$1"
> +	local START_DEFINE="$2"
> +	shift 2
> +
> +	#
> +	# Hide #include directives then run cpp. Make cpp keep comments, not
> +	# insert line numbers, avoid system/gcc/std defines, and only expand
> +	# directives. Strip cpp output until we get to #define START_DEFINE,
> +	# and collapse the excessive number of blank lines that cpp outputs
> +	# in place of directives.
> +	#
> +	sed -e 's/#[[:space:]]*include[[:space:]]*\([<"][^">]*[">]\)/\/*#include \1*\//g' \
> +		"${CPP_FILE}" | \
> +	cpp -CC -P -U__KERNEL__ -undef -nostdinc \
> +		-fdirectives-only -dDI "$@" | \
> +	awk 'BEGIN { x = 0; }
> +	     /#define '"${START_DEFINE}"'/  { x = 1; next; }
> +		(x == 1) { print }' | cat -s | \
> +	sed -e 's|/\*#include \([<"][^">]*[">]\)\*/|#include \1|g'
> +	echo ''
> +}
> +
> +# Map KARCH to something suitable for CPP e.g. __i386__
> +function karch_to_cpparch ()
> +{
> +	local KARCH="$1"
> +	local WORDBITS="$2"
> +	shift 2;
> +
> +	case "${KARCH}" in
> +	x86)	[ "${WORDBITS}" == "32" ] && echo -n "i386"
> +		[ "${WORDBITS}" == "64" ] && echo -n "x86_64"
> +		[ -z "${WORDBITS}" ]      && echo -n 'i386__ || __x86_64' # HACK
> +		;;
> +	s390*)	echo -n "s390x" ;;
> +	*)	echo -n "${KARCH}" ;;
> +	esac
> +	return 0
> +}
> +
> +set -e
> +
> +mkdir -p "${OUTPUT_INCLUDES}/linux"
> +mkdir -p "${OUTPUT_INCLUDES}/asm"
> +
> +cat - > "${OUTPUT_INCLUDES}/linux/checkpoint.h" <<-EOFOFEE
> +/*
> + * Generated by $(basename "$0").
> + */
> +#ifndef _LINUX_CHECKPOINT_H_
> +#define _LINUX_CHECKPOINT_H_
> +EOFOFEE
> +
> +do_cpp "${KERNELSRC}/include/linux/checkpoint.h" "_LINUX_CHECKPOINT_H_" >> "${OUTPUT_INCLUDES}/linux/checkpoint.h"
> +echo '#endif /* _LINUX_CHECKPOINT_H_ */' >> "${OUTPUT_INCLUDES}/linux/checkpoint.h"
> +
> +cat - > "${OUTPUT_INCLUDES}/linux/checkpoint_hdr.h" <<-EOFOO
> +/*
> + * Generated by $(basename "$0").
> + */
> +#ifndef _CHECKPOINT_CKPT_HDR_H_
> +#define _CHECKPOINT_CKPT_HDR_H_
> +EOFOO
> +
> +do_cpp "${KERNELSRC}/include/linux/checkpoint_hdr.h" "_CHECKPOINT_CKPT_HDR_H_" >> "${OUTPUT_INCLUDES}/linux/checkpoint_hdr.h"
> +echo '#endif /* _CHECKPOINT_CKPT_HDR_H_ */' >> "${OUTPUT_INCLUDES}/linux/checkpoint_hdr.h"
> +
> +(
> +#
> +# We use ARCH_COND to break up architecture-specific sections of the header.
> +#
> +ARCH_COND='#if'
> +REGEX='[[:space:]]*#[[:space:]]*define[[:space:]]+__NR_(checkpoint|restart)[[:space:]]+[[:digit:]]+'
> +
> +cat - <<-EOFOE
> +#ifndef _CHECKPOINT_SYSCALLS_H_
> +#define _CHECKPOINT_SYSCALLS_H_
> +/*
> + * Generated by $(basename "$0").
> + */
> +
> +EOFOE
> +
> +find "${KERNELSRC}/arch" -name 'unistd*.h' -print | sort | \
> +while read UNISTDH ; do
> +	[ -n "${UNISTDH}" ] || continue
> +	grep -q -E "${REGEX}" "${UNISTDH}" || continue
> +
> +	KARCH=$(echo "${UNISTDH}" | sed -e 's|.*/arch/\([^/]\+\)/.*|\1|')
> +	WORDBITS=$(basename "${UNISTDH}" | sed -e 's/unistd_*\([[:digit:]]\+\)\.h/\1/')
> +	CPPARCH="$(karch_to_cpparch "${KARCH}" "${WORDBITS}")"
> +	echo -e "${ARCH_COND} __${CPPARCH}__\\n"
> +	grep -E "${REGEX}" "${UNISTDH}" | \
> +	sed -e 's/^[[:space:]]*#[[:space:]]*define[[:space:]]\+__NR_\([^[:space:]]\+\)[[:space:]]\+\([^[:space:]]\+\).*$/#\tifndef __NR_\1\n#\t\tdefine __NR_\1 \2\n#\tendif\n/'
> +	ARCH_COND='#elif'
> +done
> +
> +cat - <<-EOFOFOE
> +#else
> +#error "Architecture does not have definitions for __NR_(checkpoint|restart)"
> +#endif
> +#endif /* _CHECKPOINT_SYSCALLS_H_ */
> +EOFOFOE
> +
> +) > "${OUTPUT_INCLUDES}/linux/checkpoint_syscalls.h"
> +
> +(
> +ARCH_COND='#if'
> +
> +cat - <<-EOFOEOF
> +/*
> + * Generated by $(basename "$0").
> + */
> +#ifndef __ASM_CHECKPOINT_HDR_H_
> +#define __ASM_CHECKPOINT_HDR_H_
> +#include <sys/user.h>
> +
> +EOFOEOF
> +
> +find "${KERNELSRC}/arch" -name 'checkpoint_hdr.h' -print | sort | \
> +while read ARCH_CHECKPOINT_HDR_H ; do
> +	[ -n "${ARCH_CHECKPOINT_HDR_H}" ] || continue
> +
> +	KARCH=$(echo "${ARCH_CHECKPOINT_HDR_H}" | sed -e 's|.*/arch/\([^/]\+\)/.*|\1|')
> +	CPPARCH="$(karch_to_cpparch "${KARCH}" "")"
> +	echo -e "${ARCH_COND} __${CPPARCH}__\\n"
> +	do_cpp "${KERNELSRC}/arch/${KARCH}/include/asm/checkpoint_hdr.h" '__ASM.*_CKPT_HDR_H' -D_CHECKPOINT_CKPT_HDR_H_
> +	ARCH_COND='#elif'
> +done
> +
> +cat - <<-FOEOEOF
> +#else
> +#error "Architecture does not have definitions needed for checkpoint images."
> +#endif
> +#endif /* __ASM_CHECKPOINT_HDR_H_ */
> +FOEOEOF
> +
> +) > "${OUTPUT_INCLUDES}/asm/checkpoint_hdr.h"
> diff --git a/self.c b/self.c
> index 546fa04..a2f2fb9 100644
> --- a/self.c
> +++ b/self.c
> @@ -16,6 +16,7 @@
>  #include <math.h>
>  #include <sys/syscall.h>
> 
> +#include <linux/checkpoint_syscalls.h>
>  #include <linux/checkpoint.h>
> 
>  #define OUTFILE  "/tmp/cr-self.out"
> -- 
> 1.5.6.3
> 


* Re: [RFC][PATCH] user-cr: Extract kernel headers
       [not found] ` <20090817152403.GA11415-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
  2009-08-17 16:33   ` Matt Helsley
@ 2009-08-17 20:55   ` Oren Laadan
  1 sibling, 0 replies; 4+ messages in thread
From: Oren Laadan @ 2009-08-17 20:55 UTC (permalink / raw)
  To: Matt Helsley; +Cc: Containers



Matt Helsley wrote:
> Using kernel headers directly from userspace is strongly discouraged.
> This patch attempts to sanitize kernel headers for userspace by
> extracting non-__KERNEL__ portions of the various checkpoint headers
> and placing them in a similar organization of userspace headers.
> 
> The script is run from the top level of the user-cr source tree like:
> 
> 	./scripts/extract-headers.sh -s <path-to-kern-source> -o ./include
> 
> 
> The patch includes a copy of the auto-generated headers and adjusts
> the user-cr programs to use them.
> 
> Signed-off-by: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> 
> TODO: Builds on i386. Probably needs more testing, especially on
> 	other non-i386, non-32-bit platforms.
> 
> 	Look at mergiing checkpoint_syscalls.h with checkpoint.h
> 	Or at least find a better, shorter name for checkpoint_syscalls.h

I suppose this will go away once the syscall numbers are accepted into
mainline and make it into the official (user-space) headers?

> 
> NOTES: The script is much larger (2.5x) than for cr_tests because cr_tests
> 	only required the syscall numbers and a few flags for the syscalls.
> 
> 	The headers have a similar organization to the kernel headers
> 	because struct ckpt_hdr must be defined before the arch hdrs and 
> 	yet CKPT_ARCH_NSIG must be defined before the generic signal hdrs.
> 	Plus it's easier to avoid rewriting the paths within the include
> 	directories...
> 
> 	checkpoint_syscalls.h is a multi-arch file with all the syscall
> 	numbers normally found in the arch's unistd.h. I chose to use a
> 	different name to avoid clashes with /usr/include headers.
> ---

[...]

> +CKPT_INCLUDE = -I./include
> +CKPT_HEADERS = $(shell find ./include -name '*.h')
>  
>  # compile with debug ?
>  DEBUG = -DCHECKPOINT_DEBUG
> @@ -39,6 +20,8 @@ OTHER = ckptinfo_types.c
>  
>  LDLIBS = -lm
>  
> +.PHONY: all distclean clean headers install
> +
>  all: $(PROGS)
>  	@make -C test
>  
> @@ -56,10 +39,16 @@ ckptinfo_types.c: $(CKPT_HEADERS) ckptinfo.py
>  
>  %.o:	%.c
>  
> +headers:
> +	./scripts/extract-headers.sh -s ../linux-2.6.git

It would be nice if this weren't hard-coded (perhaps an environment variable?)
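
A minimal sketch of the environment-variable idea, written in shell to match
the extraction script. The helper name resolve_kernelsrc and the use of
KERNELSRC as an environment variable are assumptions for illustration; the
fallback is the path currently hard-coded in the Makefile.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): pick the kernel tree from the
# environment, falling back to the path hard-coded in the 'headers' target.
resolve_kernelsrc() {
	# ${VAR:-default} expands to the default when VAR is unset or empty.
	printf '%s\n' "${KERNELSRC:-../linux-2.6.git}"
}
```

It could then be used along the lines of
./scripts/extract-headers.sh -s "$(resolve_kernelsrc)" -o ./include,
so that a user only has to set KERNELSRC when their tree lives elsewhere.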

> +
>  install:
>  	@echo /usr/bin/install -m 755 mktree ckpt rstr ckptinfo $(INSTALL_DIR)
>  	@/usr/bin/install -m 755 mktree ckpt rstr ckptinfo $(INSTALL_DIR)
>  
> +distclean: clean
> +	@rm -f $(CKPT_HEADERS)

Would 'make headers' be automagically called on the next 'make'
after the user does 'make distclean'?

If not, then $(CKPT_HEADERS) will be empty and this will break the
dependencies of ckptinfo_types.c.
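
One way to guard against that, sketched in shell under the assumption that
the generated headers live under ./include as in the patch. The helper name
have_headers is hypothetical; it only reports whether any header is present,
so a build step could regenerate them when none are found.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): succeed only if at least one
# *.h file exists under the include directory given as $1.
have_headers() {
	# find prints nothing for an empty or missing directory; errors are
	# discarded so a missing directory simply counts as "no headers".
	[ -n "$(find "$1" -name '*.h' -print 2>/dev/null | head -n 1)" ]
}
```

For example, have_headers ./include || make headers would rerun the
extraction after a 'make distclean' before anything depends on the headers.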

Oren.


* Re: [RFC][PATCH] user-cr: Extract kernel headers
       [not found]     ` <20090817163356.GB11415-52DBMbEzqgQ/wnmkkaCWp/UQ3DHhIser@public.gmane.org>
@ 2009-08-17 21:00       ` Oren Laadan
  0 siblings, 0 replies; 4+ messages in thread
From: Oren Laadan @ 2009-08-17 21:00 UTC (permalink / raw)
  To: Matt Helsley; +Cc: Containers



Matt Helsley wrote:
> On Mon, Aug 17, 2009 at 08:24:03AM -0700, Matt Helsley wrote:
>> Using kernel headers directly from userspace is strongly discouraged.
>> This patch attempts to sanitize kernel headers for userspace by
>> extracting non-__KERNEL__ portions of the various checkpoint headers
>> and placing them in a similar organization of userspace headers.
>>
>> The script is run from the top level of the user-cr source tree like:
>>
>> 	./scripts/extract-headers.sh -s <path-to-kern-source> -o ./include
>>
>>
>> The patch includes a copy of the auto-generated headers and adjusts
>> the user-cr programs to use them.
>>
>> Signed-off-by: Matt Helsley <matthltc-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
>>
>> TODO: Builds on i386. Probably needs more testing, especially on
>> 	other non-i386, non-32-bit platforms.
> 
> Argh. Still one build problem that the script doesn't resolve. From the
> kernel headers:
> 
> #include <linux/socket.h>
> #include <linux/un.h>
> 
> I think these need to be changed to use sys/ instead of linux/ but I
> can't see a good way to do this without hardcoding it into the script
> or replacing _all_ "linux/" includes with "sys/" (but I haven't checked
> if that will work much less if it's a good solution..). Would be nice
> to know if anyone has preferences or knows kernel/user header conventions
> I don't....

I don't either, but I'd guess that s;linux/;sys/; should work?
(You probably mean the ones in include/linux/checkpoint_hdr.h...)

Oren.
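
A minimal sketch of the narrower of the two options discussed above:
rewriting only the two includes named in the thread rather than every
"linux/" path. The function name rewrite_includes is hypothetical, and
whether <sys/socket.h> and <sys/un.h> provide everything the checkpoint
headers need has not been checked here.

```shell
#!/bin/sh
# Hypothetical filter (not part of the patch): read a header on stdin and
# rewrite <linux/socket.h> and <linux/un.h> to their <sys/...> counterparts.
# All other includes, including other linux/ ones, pass through unchanged.
rewrite_includes() {
	sed -e 's;^\([[:space:]]*#[[:space:]]*include[[:space:]]*<\)linux/socket\.h>;\1sys/socket.h>;' \
	    -e 's;^\([[:space:]]*#[[:space:]]*include[[:space:]]*<\)linux/un\.h>;\1sys/un.h>;'
}
```

This hard-codes the two names, which is the trade-off mentioned in the
thread; the blanket s;linux/;sys/; substitution would be shorter but would
also touch includes that may have no sys/ equivalent.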


