From: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
To: KOSAKI Motohiro
	<kosaki.motohiro-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
Cc: Linux Containers
	<containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org>
Subject: Re: [RFC][PATCH 2/2] CR: handle a single task with private memory maps
Date: Wed, 30 Jul 2008 14:22:26 -0400
Message-ID: <4890B162.9050405@cs.columbia.edu>
In-Reply-To: <20080730132257.9DF2.KOSAKI.MOTOHIRO-+CUm20s59erQFUHtdCDX3A@public.gmane.org>



KOSAKI Motohiro wrote:
> Hi
> 
>> Expand the template sys_checkpoint and sys_restart to be able to dump
>> and restore a single task. The task's address space may consist of only
>> private, simple vma's - anonymous or file-mapped.
>>
>> This big patch adds a mechanism to transfer data between kernel or user
>> space to and from the file given by the caller (sys.c), alloc/setup/free
>> of the checkpoint/restart context (sys.c), output wrappers and basic
>> checkpoint handling (checkpoint.c), memory dump (ckpt_mem.c), input
>> wrappers and basic restart handling (restart.c), and finally the memory
>> restore (rstr_mem.c).
>>
>> Signed-off-by: Oren Laadan <orenl-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
> 
> please write a documentation of describe memory dump file format,
> and split save and restore to two patches.

The save and restore functionality is already split into different source
files, so I can easily split the patch along the same lines.

Dump file format: as agreed during the OLS, the format will be nested (as
in "depth-first" as opposed to "breadth-first"). The rationale is to be
able to stream the entire checkpoint image without file seeks. The suggested
layout looks like this:

1. Image header: information about kernel version, CR version, kernel
configuration, CPU capabilities etc.

2. Container global section: state that is global to the container, e.g.
SysV IPC, network setup.

3. Task tree/forest state: number of tasks and their relationships.

4. State of each task (one by one): including task_struct state, thread
state, cpu registers, followed by memory, files, signals etc.

5. Image trailer: marking the end of the image and providing checksum and
the like.

Since this patch is only a proof-of-concept, it has a very simple #1,
no #2 or #3, limited #4 and very simple #5.

This patch still doesn't handle shared objects, but they will be handled
as follows: the first time a shared object is accessed (to dump it) it is
given a unique identifier and dumped in full. The next time(s) the object
is found, only the identifier is saved instead.
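
A minimal userspace sketch of that "dump once, reference afterwards" scheme
might look as follows. It is not code from the patch: the names cr_objtab,
cr_obj_lookup and CR_OBJ_MAX, the linear search, and the choice to start
identifiers at 1 are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CR_OBJ_MAX 64	/* arbitrary capacity, chosen for this sketch */

/* Hypothetical table of shared objects already seen during a checkpoint. */
struct cr_objtab {
	const void *ptr[CR_OBJ_MAX];
	size_t count;
};

/*
 * Look up 'obj'. If it was seen before, return its identifier and set
 * *first to 0 (the caller would save only the identifier). Otherwise
 * assign the next identifier and set *first to 1 (the caller would dump
 * the object in full). Identifiers start at 1; 0 means the table is full.
 */
uint32_t cr_obj_lookup(struct cr_objtab *t, const void *obj, int *first)
{
	assert(t && obj && first);

	for (size_t i = 0; i < t->count; i++) {
		if (t->ptr[i] == obj) {
			*first = 0;
			return (uint32_t)(i + 1);
		}
	}
	if (t->count == CR_OBJ_MAX)
		return 0;

	t->ptr[t->count++] = obj;
	*first = 1;
	return (uint32_t)t->count;
}
```

A real implementation would likely use a hash table keyed by the object
pointer; the linear scan here only keeps the sketch short.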

To be a bit more specific about the format: it will be composed of "records",
each consisting of a pre-header that identifies its contents, followed by a
payload. (The idea here is to enable parallel checkpointing in the future,
in which multiple threads interleave data from multiple processes into
a single stream.)

The pre-header is:

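struct cr_hdr {
	__s16 type;	/* type of the payload that follows */
	__s16 len;	/* length of the payload */
	__u32 id;	/* object instance the record belongs to */
};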

'type' identifies the type of the following payload and 'len' gives its length.
The 'id' identifies the object instance to which the record belongs (it is
currently unused). The meaning of the 'id' field may vary depending on the
type. For example, for type CR_HDR_MM, the 'id' will identify the task to which
this MM belongs. The payload varies depending on its type; for instance, the
data describing a task_struct is given by a 'struct cr_hdr_task' (type
CR_HDR_TASK), and so on.
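
As a rough illustration of the record framing, here is a userspace sketch that
appends a record to a buffer and parses it back. It is not the patch's code:
the stdint types stand in for the kernel's __s16/__u32, the helpers
cr_put_record and cr_get_record are hypothetical, and CR_HDR_TASK_EXAMPLE is a
made-up type value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the pre-header (int16_t/uint32_t replace __s16/__u32). */
struct cr_hdr {
	int16_t type;
	int16_t len;
	uint32_t id;
};

/* Made-up type value, for illustration only. */
enum { CR_HDR_TASK_EXAMPLE = 2 };

/*
 * Append one record (pre-header, then payload) at 'buf'.
 * Returns the number of bytes written, or 0 if the record does not fit.
 */
size_t cr_put_record(unsigned char *buf, size_t cap, int16_t type,
		     uint32_t id, const void *payload, int16_t len)
{
	struct cr_hdr h = { .type = type, .len = len, .id = id };

	assert(buf && len >= 0);
	if (cap < sizeof(h) + (size_t)len)
		return 0;

	memcpy(buf, &h, sizeof(h));
	if (len > 0)
		memcpy(buf + sizeof(h), payload, (size_t)len);
	return sizeof(h) + (size_t)len;
}

/*
 * Parse the record at 'buf'. Fills *h and returns a pointer to the
 * payload, or NULL if fewer than a full record's bytes are available.
 */
const unsigned char *cr_get_record(const unsigned char *buf, size_t avail,
				   struct cr_hdr *h)
{
	assert(buf && h);
	if (avail < sizeof(*h))
		return NULL;

	memcpy(h, buf, sizeof(*h));
	if (h->len < 0 || avail - sizeof(*h) < (size_t)h->len)
		return NULL;
	return buf + sizeof(*h);
}
```

Because every record carries its own length, a reader can walk the stream
front to back without seeking, skipping any record type it does not recognize.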

The format of the memory dump is slightly different: for each vma, there is
a 'struct cr_vma'; if the vma is file-mapped, it is followed by the file
name. The cr_vma->npages field tells how many pages were dumped for this vma.
The actual data comes next: first the addresses of all dumped pages (npages
entries), then the contents of all dumped pages (npages pages). After that
comes the next vma, and so on.
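
The per-vma layout can be sketched in userspace roughly as below. This is an
illustration under assumptions, not the patch's code: struct cr_vma_sketch is
a hypothetical stand-in (only the npages field is named above), the page size
is assumed to be 4096 bytes, addresses are assumed to be 64-bit, and the file
name of a file-mapped vma is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Hypothetical stand-in for the vma descriptor; only npages is from the text. */
struct cr_vma_sketch {
	uint64_t start;
	uint64_t end;
	uint32_t npages;
};

/* Bytes needed for one vma: descriptor, npages addresses, npages pages. */
size_t cr_vma_dump_size(uint32_t npages)
{
	return sizeof(struct cr_vma_sketch) +
	       (size_t)npages * (sizeof(uint64_t) + SKETCH_PAGE_SIZE);
}

/*
 * Serialize one vma into 'out': the descriptor first, then all page
 * addresses, then all page contents. Returns bytes written, or 0 if
 * 'out' is too small.
 */
size_t cr_vma_dump(unsigned char *out, size_t cap,
		   const struct cr_vma_sketch *vma,
		   const uint64_t *addrs,
		   const unsigned char *const *pages)
{
	unsigned char *p = out;
	size_t need;

	assert(out && vma);
	need = cr_vma_dump_size(vma->npages);
	if (cap < need)
		return 0;

	memcpy(p, vma, sizeof(*vma));
	p += sizeof(*vma);

	/* First pass: the addresses of all dumped pages. */
	for (uint32_t i = 0; i < vma->npages; i++) {
		memcpy(p, &addrs[i], sizeof(addrs[i]));
		p += sizeof(addrs[i]);
	}
	/* Second pass: the contents of those pages, in the same order. */
	for (uint32_t i = 0; i < vma->npages; i++) {
		memcpy(p, pages[i], SKETCH_PAGE_SIZE);
		p += SKETCH_PAGE_SIZE;
	}
	return need;
}
```

Keeping the addresses ahead of the contents means a reader knows where every
page belongs before it reads any page data, which suits a streamed image.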

For a single simple task, the resulting checkpoint image would look like this
(assuming 2 vma's, one file-mapped with 2 dumped pages and the other anonymous
with 3 dumped pages):

cr_hdr + cr_hdr_head
cr_hdr + cr_hdr_task
	cr_hdr + cr_hdr_mm
		cr_hdr + cr_hdr_vma + cr_hdr + string
			addr1, addr2
			page1, page2
		cr_hdr + cr_hdr_vma
			addr3, addr4, addr5
			page3, page4, page5
		cr_hdr + cr_mm_context
	cr_hdr + cr_hdr_thread
	cr_hdr + cr_hdr_cpu
cr_hdr + cr_hdr_tail

Will add this documentation to the next version of the patch.

Oren.
