From: Dave Hansen <dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: containers <containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>
Subject: Re: [BIG RFC] Filesystem-based checkpoint
Date: Thu, 30 Oct 2008 17:09:17 -0700
Message-ID: <1225411757.12673.383.camel@nimitz>
In-Reply-To: <m163n9y7yb.fsf-B27657KtZYmhTnVgQlOflh2eb7JE58TQ@public.gmane.org>

On Thu, 2008-10-30 at 16:33 -0700, Eric W. Biederman wrote:
> Dave Hansen <dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
> > I hate the syscall.  It's a very un-Linux-y way of doing things.  There,
> > I said it.  Here's an alternative.  It still uses the syscall to
> > initiate things, but it uses debugfs to transport the data instead.
> > This is just a concept demonstration.  It doesn't actually work, and I
> > wouldn't be using debugfs in practice.
> 
> A syscall is a very linux-y way to do it.

Darn, I thought I'd be able to sneak that one by.

> If you called it a core dump instead of a checkpoint you have exactly the same set
> of issues.

I completely agree with you that there's a lot of common ground here
between coredumps and checkpoints.  But I'm not aware of any
applications, let's say Oracle, that use coredumps in the process of
normal execution.  Checkpoints must be more scalable and lower overhead
than coredumps are.

> Why we are doing vfs_write instead of file->f_op->write I don't understand.

That's an excellent question.  I assume you're asking because at least
the elf core dump code uses it, right?

> > System calls in Linux are fast.  Doing lots of them is not a problem.
> > If it becomes one, we can always export a condensed version of this
> > format next to the expanded one, kinda like ftrace does.  Atomicity with
> > this approach is also not a problem.  The system call in this approach
> > doesn't return until the checkpoint is completely written out.
> 
> Extra copies for something (memory) you want to transfer quickly
> and efficiently is a problem.

That's definitely true.  But, as I said, this approach isn't bound to
copying everything.  We have the flexibility to choose what we do.

> Reading the memory of another process is a problem, to the point
> that the /proc/<pid>/mem interface has been removed from the kernel.

Yes, this is certainly true.  All of the ptrace-related security issues
surely tell us something.  But, I'm not sure of your point here.  Are
you saying that using sys_checkpoint() to dump a process's pages is
inherently safer than an approach that uses a filesystem in order to do
the same?

> > Want to do a fast checkpoint?  Fine, copy all data, use a lot of memory,
> > store it in-kernel.  Dump that out when the filesystem is accessed.
> > Destroy it when userspace asks.
> 
> > So, why not?
> 
> Besides the part of creating a bunch of questionable interfaces
> that we need to support forever.
> 
> Ultimately the question is how do you do checkpoint restore and I just
> don't see that happening with a filesystem interface.  Way way way too many
> dangerous syscalls that are only needed for one thing.

I completely understand what you're saying here.  But, could you
distinguish how this differs from the current way that sys_checkpoint()
does it?  Surely, the checkpoint format is an ABI.  It is a complex ABI
with many, many constituent structures.  The filesystem approach is an
ABI with many, many ways of reading simple data.  Seems like just
slicing up the problem differently to me.

-- Dave
