From mboxrd@z Thu Jan 1 00:00:00 1970
From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
Subject: Re: Checkpoint/restart (was Re: [PATCH 0/4] - v2 - Object creation with a specified id)
Date: Thu, 10 Jul 2008 11:55:34 -0700
References: <20080418054459.891481000@bull.net>
	<20080422193612.GA15835@martell.zuzino.mipt.ru>
	<1208890580.17117.14.camel@nimitz.home.sr71.net>
	<20080422210130.GA15937@martell.zuzino.mipt.ru>
	<1208904967.17117.51.camel@nimitz.home.sr71.net>
	<480ED9D5.1010906@parallels.com>
	<480FE037.2010302@cs.columbia.edu>
	<1215709949.9398.15.camel@nimitz>
	<20080710173246.GA1857@us.ibm.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <20080710173246.GA1857-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
	(Serge E. Hallyn's message of "Thu, 10 Jul 2008 12:32:46 -0500")
Sender: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
Errors-To: containers-bounces-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
To: "Serge E. Hallyn"
Cc: Kirill Korotaev,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Dave Hansen, Alexey Dobriyan, Andrew Morton,
	nick-dnCCA748QAperShJXdIFYw@public.gmane.org,
	Nadia.Derbey-6ktuUTfB/bM@public.gmane.org
List-Id: containers.vger.kernel.org

"Serge E. Hallyn" writes:

>> So, the checkpoint-as-a-corefile idea sounds good to me, but it
>> definitely leaves a lot of questions about exactly how we'll need to do
>> the restore.
>
> Talking with Dave over irc, I kind of liked the idea of creating a new
> fs/binfmt_cr.c that executes a checkpoint-as-a-coredump file.
>
> One thing I do not like about the checkpoint-as-coredump is that it begs
> us to dump all memory out into the file.  Our plan/hope was to save
> ourselves from writing out most memory by:
>
>     1. associating a separate swapfile with each container
>     2. doing a swapfile snapshot at each checkpoint
>     3. dumping the pte entries (/proc/self/)
>
> If we do checkpoint-as-a-coredump, then we need userspace to coordinate
> a kernel-generated coredump with a user-generated (?) swapfile snapshot.
> But I guess we figure that out later.

Well it is a matter of which VMAs you dump.  For things that are file
backed you need to dump them.

I don't know that even a binfmt for per process level checkpoints is
sufficient but I do know having something of that granularity looks much
easier.  Otherwise it takes a bazillian little syscalls to do things no
one else is interested in doing.

Eric
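
The "which VMAs you dump" distinction can be looked at from userspace.
What follows is only a rough sketch, not anything proposed in this
thread: it reads the per-VMA lines of /proc/self/maps and splits them
into file-backed and anonymous by checking the inode field.  The names
vma_info, parse_maps_line, vma_is_file_backed and summarize_self are
made up for illustration, and the inode test is an assumption about how
a checkpointer might classify mappings.  It does not decide which class
has to be written out.

```c
#include <stdio.h>

/* Hypothetical record for one line of /proc/<pid>/maps. */
struct vma_info {
    unsigned long start, end;   /* address range of the mapping */
    char perms[5];              /* e.g. "r-xp" */
    unsigned long inode;        /* 0 when there is no backing file */
};

/* Parse one maps line of the form
 *   start-end perms offset major:minor inode [path]
 * Returns 0 on success, -1 if the line does not match. */
int parse_maps_line(const char *line, struct vma_info *v)
{
    unsigned long offset;
    unsigned int major, minor;

    if (sscanf(line, "%lx-%lx %4s %lx %x:%x %lu",
               &v->start, &v->end, v->perms, &offset,
               &major, &minor, &v->inode) != 7)
        return -1;
    return 0;
}

/* Assumption: a non-zero inode means the VMA has a backing file. */
int vma_is_file_backed(const struct vma_info *v)
{
    return v->inode != 0;
}

/* Walk /proc/self/maps and total the bytes in each class.
 * Returns 0 on success, -1 if the file cannot be opened. */
int summarize_self(unsigned long *file_bytes, unsigned long *anon_bytes)
{
    char line[512];
    FILE *f = fopen("/proc/self/maps", "r");

    if (!f)
        return -1;
    *file_bytes = *anon_bytes = 0;
    while (fgets(line, sizeof line, f)) {
        struct vma_info v;

        if (parse_maps_line(line, &v) != 0)
            continue;           /* skip anything unparseable */
        if (vma_is_file_backed(&v))
            *file_bytes += v.end - v.start;
        else
            *anon_bytes += v.end - v.start;
    }
    fclose(f);
    return 0;
}
```

Calling summarize_self() from a small main() and printing the two
totals gives a feel for how much of a process sits in each class.  The
inode check is crude: regions such as [heap] and [stack] show inode 0,
while shared memory backed by a tmpfs-style filesystem can show a
non-zero inode even though no ordinary file holds its contents, so a
real checkpointer would need more than this test.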