From: ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org (Eric W. Biederman)
To: Dave Hansen <dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
Cc: containers <containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>
Subject: Re: [BIG RFC] Filesystem-based checkpoint
Date: Thu, 30 Oct 2008 16:33:16 -0700
Message-ID: <m163n9y7yb.fsf@frodo.ebiederm.org>
In-Reply-To: <1225219047.12673.182.camel@nimitz> (Dave Hansen's message of "Tue, 28 Oct 2008 11:37:27 -0700")
Dave Hansen <dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
> I hate the syscall. It's a very un-Linux-y way of doing things. There,
> I said it. Here's an alternative. It still uses the syscall to
> initiate things, but it uses debugfs to transport the data instead.
> This is just a concept demonstration. It doesn't actually work, and I
> wouldn't be using debugfs in practice.
A syscall is a very linux-y way to do it.
If you called it a core dump instead of a checkpoint you would have exactly
the same set of issues.
Why we are doing vfs_write instead of file->f_op->write I don't understand.
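For reference, here is a rough user-space model of what separates the two
calls. It is only a sketch, not kernel code: every model_* name and
MODEL_FMODE_WRITE are made up for illustration. The assumption is that
vfs_write() checks that the file was opened for writing and that a write
method exists, then dispatches to that method and does some bookkeeping,
while calling file->f_op->write directly skips all of that. The real function
also validates the user buffer and the file range, which the model leaves out.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical flag standing in for the kernel's FMODE_WRITE. */
#define MODEL_FMODE_WRITE 0x2

struct model_file;

/* Stand-in for the write member of struct file_operations. */
struct model_file_ops {
    ssize_t (*write)(struct model_file *f, const char *buf, size_t len);
};

struct model_file {
    unsigned int mode;                  /* open mode bits */
    const struct model_file_ops *f_op;  /* per-file method table */
    size_t bytes_written;               /* bytes the method has accepted */
    size_t accounted;                   /* bookkeeping done only by the wrapper */
};

/* A trivial write method that accepts everything it is given. */
static ssize_t model_sink_write(struct model_file *f, const char *buf, size_t len)
{
    (void)buf;
    f->bytes_written += len;
    return (ssize_t)len;
}

/* Wrapper in the spirit of vfs_write(): validate, dispatch, account. */
static ssize_t model_vfs_write(struct model_file *f, const char *buf, size_t len)
{
    if (!(f->mode & MODEL_FMODE_WRITE))
        return -EBADF;                  /* file not opened for writing */
    if (!f->f_op || !f->f_op->write)
        return -EINVAL;                 /* no write method to call */

    ssize_t ret = f->f_op->write(f, buf, len);
    if (ret > 0)
        f->accounted += (size_t)ret;    /* accounting the direct call would skip */
    return ret;
}
```

In this model a file with mode 0 gets -EBADF from model_vfs_write(), while
calling f->f_op->write on the same file succeeds and leaves accounted
untouched.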
> System calls in Linux are fast. Doing lots of them is not a problem.
> If it becomes one, we can always export a condensed version of this
> format next to the expanded one, kinda like ftrace does. Atomicity with
> this approach is also not a problem. The system call in this approach
> doesn't return until the checkpoint is completely written out.
Extra copies for something (memory) you want to transfer quickly
and efficiently are a problem.
Reading the memory of another process is a problem, to the point
that the /proc/<pid>/mem interface has been removed from the kernel.
> This lets userspace pick and choose what parts of the checkpoint it
> cares about. It enables us to do all the I/O from userspace: no
> in-kernel sys_read/write(). I think this interface is much more
> flexible than a plain syscall.
Then get with Roland McGrath and build the next generation user-space
debugging interface.
> Want to do a fast checkpoint? Fine, copy all data, use a lot of memory,
> store it in-kernel. Dump that out when the filesystem is accessed.
> Destroy it when userspace asks.
> So, why not?
Besides the part about creating a bunch of questionable interfaces
that we would need to support forever?
Ultimately the question is how you do checkpoint/restore, and I just
don't see that happening with a filesystem interface. Way, way, way too many
dangerous syscalls that are only needed for one thing.
Checkpoint/Restore is an atomic operation, and filesystems suck at building
high-level atomic primitives.
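To make the atomicity point concrete: about the only atomic building block a
filesystem hands to userspace is rename(2) on a single file. A minimal sketch
of the usual write-to-temp-then-rename pattern follows. The helper name
atomic_replace() and the ".tmp" suffix are assumptions for illustration only.
Note that this covers exactly one file. Nothing comparable exists for state
spread across a whole directory tree of files, which is what a
filesystem-based checkpoint would need.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace `path` with `len` bytes from `data` so that
 * a reader opening `path` sees either the old contents or the new ones,
 * never a partially written file. Returns 0 on success, -1 on error.
 */
static int atomic_replace(const char *path, const void *data, size_t len)
{
    char tmp[4096];
    int n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof(tmp))
        return -1;                      /* temporary name does not fit */

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Write everything, handling short writes. */
    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w < 0) {
            close(fd);
            unlink(tmp);
            return -1;
        }
        p += w;
        left -= (size_t)w;
    }

    /* Push the data to stable storage before making it visible. */
    if (fsync(fd) < 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) < 0) {
        unlink(tmp);
        return -1;
    }

    /* The single atomic step: swap the new file in under the final name. */
    if (rename(tmp, path) < 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}
```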
Eric