From: Dave Hansen <dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: "Eric W. Biederman" <ebiederm-aS9lmoZGLiVWk0Htik3J/w@public.gmane.org>
Cc: containers <containers-qjLDD68F18O7TbgM5vRIOg@public.gmane.org>
Subject: Re: [BIG RFC] Filesystem-based checkpoint
Date: Fri, 31 Oct 2008 07:21:42 -0700
Message-ID: <1225462902.12673.398.camel@nimitz>
In-Reply-To: <m1k5bpwj8j.fsf-B27657KtZYmhTnVgQlOflh2eb7JE58TQ@public.gmane.org>

On Thu, 2008-10-30 at 20:12 -0700, Eric W. Biederman wrote:
> Dave Hansen <dave-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> writes:
> >> > System calls in Linux are fast.  Doing lots of them is not a problem.
> >> > If it becomes one, we can always export a condensed version of this
> >> > format next to the expanded one, kinda like ftrace does.  Atomicity with
> >> > this approach is also not a problem.  The system call in this approach
> >> > doesn't return until the checkpoint is completely written out.
> >> 
> >> Extra copies for something (memory) you want to transfer quickly
> >> and efficiently is a problem.
> >
> > That's definitely true.  But, as I said, this approach isn't bound to
> > copying everything.  We have the flexibility to choose what we do.
> 
> With a file descriptor I can push the data onto a network socket and
> the receiving process is on another computer.  0 copies, 0 trips
> to user space.  I'm not certain how you would achieve that with filesystem
> approach.

sys_checkpoint() does:
	1. copy from task_struct (or whatever kernel struct) into buffer
	2. run vfs_write() with that buffer and the user fd
	3. fd target reads from that buffer

The fs approach would:
	1. user calls read()
	2. fs fills data directly into the *userspace* buffer
	3. user does sendfile, etc...

See?  sys_checkpoint() *does* a copy.  It just does it into a kernel
buffer.  That's why we need to call vfs_write().

> I'm saying inspecting another process is a very racy operation so something
> we need to be especially careful with. 

No disagreement from me on that one.  

> >> Ultimately the question is how do you do checkpoint restore and I just
> >> don't see that happening with a filesystem interface.  Way way way too many
> >> dangerous syscalls that are only needed for one thing.
> >
> > I completely understand what you're saying here.  But, could you
> > distinguish how this differs from the current way that sys_checkpoint()
> > does it?  Surely, the checkpoint format is an ABI.  It is a complex ABI
> > with many, many constituent structures.  This is an ABI with many, many,
> > ways of reading simple data.  Seems like just slicing up the problem
> > differently to me.
> 
> I was thinking about restore.  Creating objects with a certain id can
> easily be a security risk if you are not creating the namespace those
> objects live in at the same time.  There is currently the downside
> that we can't create namespaces as unprivileged users ( The
> implementation of suid is so annoying). But the general concept still
> applies, and if we ever get the uid namespace correct we will be able
> to create namespaces as unprivileged users.

Eric, you were saying that my interface had way too many "dangerous
syscalls".  How does this relate to user namespaces and creating objects
with particular ids?  Surely if the true problem with my suggested
approach has to do with creating empty namespaces, the same problem
exists with the sys_checkpoint() approach.

-- Dave
