From: Al Viro <viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
To: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
Cc: Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Linus Torvalds <torvalds-3NddpPZAyC0@public.gmane.org>,
containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org,
linux-api-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>,
Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>,
Dave Hansen
<dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>,
Ingo Molnar <mingo-X9Un+BFzKDI@public.gmane.org>,
"H. Peter Anvin" <hpa-YMNOUZJC4hwAvxtiuMwx3w@public.gmane.org>
Subject: Re: [RFC v10][PATCH 05/13] Dump memory address space
Date: Fri, 28 Nov 2008 10:53:51 +0000
Message-ID: <20081128105351.GQ28946@ZenIV.linux.org.uk>
In-Reply-To: <1227747884-14150-6-git-send-email-orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
On Wed, Nov 26, 2008 at 08:04:36PM -0500, Oren Laadan wrote:
> For each VMA, there is a 'struct cr_vma'; if the VMA is file-mapped,
> it will be followed by the file name. Then comes the actual contents,
> in one or more chunks: each chunk begins with a header that specifies
> how many pages it holds, then the virtual addresses of all the dumped
> pages in that chunk, followed by the actual contents of all dumped
> pages. A header with zero number of pages marks the end of the contents.
> Then comes the next VMA and so on.
>
> Changelog[v10]:
> - Acquire dcache_lock around call to __d_path() in cr_fill_fname()
>
> Changelog[v9]:
> - Introduce cr_ctx_checkpoint() for checkpoint-specific ctx setup
> - Test if __d_path() changes mnt/dentry (when crossing filesystem
> namespace boundary). for now cr_fill_fname() fails the checkpoint.
>
> Changelog[v7]:
> - Fix argument given to kunmap_atomic() in memory dump/restore
>
> Changelog[v6]:
> - Balance all calls to cr_hbuf_get() with matching cr_hbuf_put()
> (even though it's not really needed)
>
> Changelog[v5]:
> - Improve memory dump code (following Dave Hansen's comments)
> - Change dump format (and code) to allow chunks of <vaddrs, pages>
> instead of one long list of each
> - Fix use of follow_page() to avoid faulting in non-present pages
>
> Changelog[v4]:
> - Use standard list_... for cr_pgarr
>
> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
> Acked-by: Serge Hallyn <serue-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
> Signed-off-by: Dave Hansen <dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
> ---
> arch/x86/include/asm/checkpoint_hdr.h | 5 +
> arch/x86/mm/checkpoint.c | 31 ++
> checkpoint/Makefile | 3 +-
> checkpoint/checkpoint.c | 81 ++++++
> checkpoint/checkpoint_arch.h | 2 +
> checkpoint/checkpoint_mem.h | 41 +++
> checkpoint/ckpt_mem.c | 500 +++++++++++++++++++++++++++++++++
> checkpoint/sys.c | 11 +
> include/linux/checkpoint.h | 12 +
> include/linux/checkpoint_hdr.h | 32 ++
> 10 files changed, 717 insertions(+), 1 deletions(-)
> create mode 100644 checkpoint/checkpoint_mem.h
> create mode 100644 checkpoint/ckpt_mem.c
>
> diff --git a/arch/x86/include/asm/checkpoint_hdr.h b/arch/x86/include/asm/checkpoint_hdr.h
> index 6325062..33f4c70 100644
> --- a/arch/x86/include/asm/checkpoint_hdr.h
> +++ b/arch/x86/include/asm/checkpoint_hdr.h
> @@ -82,4 +82,9 @@ struct cr_hdr_cpu {
> /* thread_xstate contents follow (if used_math) */
> } __attribute__((aligned(8)));
>
> +struct cr_hdr_mm_context {
> + __s16 ldt_entry_size;
> + __s16 nldt;
> +} __attribute__((aligned(8)));
> +
> #endif /* __ASM_X86_CKPT_HDR__H */
> diff --git a/arch/x86/mm/checkpoint.c b/arch/x86/mm/checkpoint.c
> index 8dd6d2d..757936e 100644
> --- a/arch/x86/mm/checkpoint.c
> +++ b/arch/x86/mm/checkpoint.c
> @@ -221,3 +221,34 @@ int cr_write_head_arch(struct cr_ctx *ctx)
>
> return ret;
> }
> +
> +/* dump the mm->context state */
> +int cr_write_mm_context(struct cr_ctx *ctx, struct mm_struct *mm, int parent)
> +{
> + struct cr_hdr h;
> + struct cr_hdr_mm_context *hh = cr_hbuf_get(ctx, sizeof(*hh));
> + int ret;
> +
> + h.type = CR_HDR_MM_CONTEXT;
> + h.len = sizeof(*hh);
> + h.parent = parent;
> +
> + mutex_lock(&mm->context.lock);
> +
> + hh->ldt_entry_size = LDT_ENTRY_SIZE;
> + hh->nldt = mm->context.size;
> +
> + cr_debug("nldt %d\n", hh->nldt);
> +
> + ret = cr_write_obj(ctx, &h, hh);
> + cr_hbuf_put(ctx, sizeof(*hh));
> + if (ret < 0)
> + goto out;
> +
> + ret = cr_kwrite(ctx, mm->context.ldt,
> + mm->context.size * LDT_ENTRY_SIZE);
> +
> + out:
> + mutex_unlock(&mm->context.lock);
> + return ret;
> +}
> diff --git a/checkpoint/Makefile b/checkpoint/Makefile
> index d2df68c..3a0df6d 100644
> --- a/checkpoint/Makefile
> +++ b/checkpoint/Makefile
> @@ -2,4 +2,5 @@
> # Makefile for linux checkpoint/restart.
> #
>
> -obj-$(CONFIG_CHECKPOINT_RESTART) += sys.o checkpoint.o restart.o
> +obj-$(CONFIG_CHECKPOINT_RESTART) += sys.o checkpoint.o restart.o \
> + ckpt_mem.o
> diff --git a/checkpoint/checkpoint.c b/checkpoint/checkpoint.c
> index 17cc8d2..6a8f810 100644
> --- a/checkpoint/checkpoint.c
> +++ b/checkpoint/checkpoint.c
> @@ -75,6 +75,66 @@ int cr_write_string(struct cr_ctx *ctx, char *str, int len)
> return cr_write_obj(ctx, &h, str);
> }
>
> +/**
> + * cr_fill_fname - return pathname of a given file
> + * @path: path name
> + * @root: relative root
> + * @buf: buffer for pathname
> + * @n: buffer length (in) and pathname length (out)
> + */
> +static char *
> +cr_fill_fname(struct path *path, struct path *root, char *buf, int *n)
> +{
> + struct path tmp = *root;
> + char *fname;
> +
> + BUG_ON(!buf);
> + spin_lock(&dcache_lock);
> + fname = __d_path(path, &tmp, buf, *n);
> + spin_unlock(&dcache_lock);
> + if (!IS_ERR(fname))
> + *n = (buf + (*n) - fname);
> + /*
> + * FIXME: if __d_path() changed these, it must have stepped out of
> + * init's namespace. Since currently we require a unified namespace
> + * within the container: simply fail.
> + */
> + if (tmp.mnt != root->mnt || tmp.dentry != root->dentry)
> + fname = ERR_PTR(-EBADF);
> +
> + return fname;
> +}
> +
> +/**
> + * cr_write_fname - write a file name
> + * @ctx: checkpoint context
> + * @path: path name
> + * @root: relative root
> + */
> +int cr_write_fname(struct cr_ctx *ctx, struct path *path, struct path *root)
> +{
> + struct cr_hdr h;
> + char *buf, *fname;
> + int ret, flen;
> +
> + flen = PATH_MAX;
> + buf = kmalloc(flen, GFP_KERNEL);
> + if (!buf)
> + return -ENOMEM;
> +
> + fname = cr_fill_fname(path, root, buf, &flen);
> + if (!IS_ERR(fname)) {
> + h.type = CR_HDR_FNAME;
> + h.len = flen;
> + h.parent = 0;
> + ret = cr_write_obj(ctx, &h, fname);
> + } else
> + ret = PTR_ERR(fname);
> +
> + kfree(buf);
> + return ret;
> +}
> +
> /* write the checkpoint header */
> static int cr_write_head(struct cr_ctx *ctx)
> {
> @@ -168,6 +228,10 @@ static int cr_write_task(struct cr_ctx *ctx, struct task_struct *t)
> cr_debug("task_struct: ret %d\n", ret);
> if (ret < 0)
> goto out;
> + ret = cr_write_mm(ctx, t);
> + cr_debug("memory: ret %d\n", ret);
> + if (ret < 0)
> + goto out;
> ret = cr_write_thread(ctx, t);
> cr_debug("thread: ret %d\n", ret);
> if (ret < 0)
> @@ -178,10 +242,27 @@ static int cr_write_task(struct cr_ctx *ctx, struct task_struct *t)
> return ret;
> }
>
> +static int cr_ctx_checkpoint(struct cr_ctx *ctx, pid_t pid)
> +{
> + ctx->root_pid = pid;
> +
> + /*
> + * assume checkpointer is in container's root vfs
> + * FIXME: this works for now, but will change with real containers
> + */
> + ctx->vfsroot = &current->fs->root;
> + path_get(ctx->vfsroot);
This is going to break as soon as you get another thread doing e.g. chroot(2)
while you are in there. And it's a really, _really_ bad idea to take a
pointer to shared object, increment refcount on the current *contents* of
said object and assume that dropping refcount on the later contents of the
same will balance out.
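
To make the imbalance concrete, here is a toy userspace model in C. It is only a sketch: struct obj, struct path_sim, struct fs_sim and every function below are hypothetical stand-ins, not kernel API, and the "other thread" is simulated by an ordinary call between taking and dropping the reference. The first scenario keeps a pointer into the shared structure, as the quoted hunk does; the second copies the path by value and drops the reference on that same copy.

```c
#include <assert.h>

/* Toy userspace model; every name below is hypothetical, not kernel API. */
struct obj { int refcount; };
struct path_sim { struct obj *mnt, *dentry; };	/* stands in for struct path */
struct fs_sim { struct path_sim root; };	/* stands in for the shared fs_struct */

static void path_sim_get(const struct path_sim *p)
{
	p->mnt->refcount++;
	p->dentry->refcount++;
}

static void path_sim_put(const struct path_sim *p)
{
	assert(p->mnt->refcount > 0 && p->dentry->refcount > 0);
	p->mnt->refcount--;
	p->dentry->refcount--;
}

/* What a chroot-like call from another thread does to the shared root. */
static void fs_sim_set_root(struct fs_sim *fs, struct path_sim new_root)
{
	struct path_sim old = fs->root;

	path_sim_get(&new_root);	/* fs now references the new root */
	fs->root = new_root;
	path_sim_put(&old);		/* fs drops its reference on the old root */
}

/* Pattern from the quoted hunk: remember a pointer into the shared object. */
void run_broken(int *old_ref, int *new_ref)
{
	struct obj mnt = { 1 }, old_d = { 1 }, new_d = { 0 };
	struct fs_sim fs = { { &mnt, &old_d } };
	struct path_sim new_root = { &mnt, &new_d };
	struct path_sim *vfsroot = &fs.root;

	path_sim_get(vfsroot);		/* pins the current contents (old_d) */
	fs_sim_set_root(&fs, new_root);	/* contents change underneath us */
	path_sim_put(vfsroot);		/* drops a reference on new_d instead */

	*old_ref = old_d.refcount;
	*new_ref = new_d.refcount;
}

/* Alternative: snapshot the path by value and put that same snapshot. */
void run_fixed(int *old_ref, int *new_ref)
{
	struct obj mnt = { 1 }, old_d = { 1 }, new_d = { 0 };
	struct fs_sim fs = { { &mnt, &old_d } };
	struct path_sim new_root = { &mnt, &new_d };
	struct path_sim vfsroot = fs.root;	/* private copy */

	path_sim_get(&vfsroot);
	fs_sim_set_root(&fs, new_root);
	path_sim_put(&vfsroot);		/* balances the get above */

	*old_ref = old_d.refcount;
	*new_ref = new_d.refcount;
}
```

In this model run_broken() ends with the old dentry still holding a count that nobody owns and the new one at zero even though fs.root points at it, while run_fixed() ends with both counts matching their remaining owners. The model is single-threaded; in the kernel the copy itself would also have to be made under whatever lock protects fs->root, which the sketch does not attempt to show.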