All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Hansen <dave@linux.vnet.ibm.com>
To: Oren Laadan <orenl@cs.columbia.edu>
Cc: arnd@arndb.de, jeremy@goop.org, linux-kernel@vger.kernel.org,
	containers@lists.linux-foundation.org
Subject: Re: [RFC v2][PATCH 4/9] Memory management - dump state
Date: Wed, 27 Aug 2008 08:41:36 -0700	[thread overview]
Message-ID: <1219851696.8680.67.camel@nimitz> (raw)
In-Reply-To: <48B49C61.1040003@cs.columbia.edu>

On Tue, 2008-08-26 at 20:14 -0400, Oren Laadan wrote:
> >>>> +
> >>>> +	while (addr < end) {
> >>>> +		struct page *page;
> >>>> +
> >>>> +		/* simplified version of get_user_pages(): already have vma,
> >>>> +		* only need FOLL_TOUCH, and (for now) ignore fault stats */
> >>>> +
> >>>> +		cond_resched();
> >>>> +		while (!(page = follow_page(vma, addr, FOLL_TOUCH))) {
> >>>> +			ret = handle_mm_fault(vma->vm_mm, vma, addr, 0);
> >>>> +			if (ret & VM_FAULT_ERROR) {
> >>>> +				if (ret & VM_FAULT_OOM)
> >>>> +					ret = -ENOMEM;
> >>>> +				else if (ret & VM_FAULT_SIGBUS)
> >>>> +					ret = -EFAULT;
> >>>> +				else
> >>>> +					BUG();
> >>>> +				break;
> >>>> +			}
> >>>> +			cond_resched();
> >>>> +		}
> >>> At best this needs to get folded back into its own function.  This is
> >> This is almost identical to the original - see the preceding comment.
> > 
> > Exactly.  The code is copy-and-pasted.  If there's a bug in the
> > original, who will change this one?  Better to simply consolidate the
> > code into one copy.
> > 
> >>> pretty hard to read as-is.  Also, are there truly no in-kernel functions
> >>> that can be used for this?
> >> Can you suggest one ?
> > 
> > This is where the mentality has to shift.  Instead of thinking "there is
> > no exact in-kernel match for what I want here" we need to consider that
> > we can modify the in-kernel ones.  They can be modified to suit both
> > existing and the new needs that we have.
> 
> I agree.
> 
> However, my main goal now is not to make this patch perfect, but to provide
> a viable proof-of-concept that demonstrates how we want to do things. Unless
> you feel we are near ready to send these for inclusion soon (...), I intend
> to prioritize design and functionality.

So, you currently have buy-in on this basic approach from the OpenVZ
guys, and probably Ingo.  If you get too far along in the "design and
functionality" and a community member comes up with a really good
objection to some basic part of the "design and functionality" you're
going to have a lot of code to rewrite.

I'm trying to get minimized the quantity of "design and functionality"
down to the bare minimum set where we can get this merged.  So, yes, I
do think these should be sent for inclusion soon.

I believe that we're on course to create another large out-of-tree patch
set that does in-kernel checkpoint/restart.  We have no shortage of
those.  It's been done many times before.

This one will *HOPEFULLY* be different from all the rest when it gets
merged.  

> >>>> +	for (pgarr = ctx->pgarr; pgarr; pgarr = pgarr->next) {
> >>>> +		struct page **pages = pgarr->pages;
> >>>> +		int nr = pgarr->nused;
> >>>> +		void *ptr;
> >>>> +
> >>>> +		while (nr--) {
> >>>> +			ptr = kmap(*pages);
> >>>> +			ret = cr_kwrite(ctx, ptr, PAGE_SIZE);
> >>>> +			kunmap(*pages);
> >>> Why not use kmap_atomic() here?
> >> It is my understanding that the code must not sleep between kmap_atomic()
> >> and kunmap_atomic().
> > 
> > Yes, but you're going to absolutely kill performance on systems which
> > have expensive global TLB flushes.  Frequent kmaps()s should be avoided
> > at all costs.
> > 
> > The best way around this would be to change the interface to cr_kwrite()
> > so that it didn't have to use *mapped* kernel memory.  Maybe an
> > interface that takes 'struct page'.  Or, one where we kmap_atomic() the
> > buffer, kunmap_atomic(), copy to a temporary buffer, then cr_kwrite().
> > 
> >>>> +static int cr_write_vma(struct cr_ctx *ctx, struct vm_area_struct *vma)
> >>>> +{
> >>>> +	struct cr_hdr h;
> >>>> +	struct cr_hdr_vma *hh = ctx->hbuf;
> >>>> +	char *fname = NULL;
> >>>> +	int flen = 0, how, nr, ret;
> >>>> +
> >>>> +	h.type = CR_HDR_VMA;
> >>>> +	h.len = sizeof(*hh);
> >>>> +	h.ptag = 0;
> >>>> +
> >>>> +	hh->vm_start = vma->vm_start;
> >>>> +	hh->vm_end = vma->vm_end;
> >>>> +	hh->vm_page_prot = vma->vm_page_prot.pgprot;
> >>>> +	hh->vm_flags = vma->vm_flags;
> >>>> +	hh->vm_pgoff = vma->vm_pgoff;
> >>>> +
> >>>> +	if (vma->vm_flags & (VM_SHARED | VM_IO | VM_HUGETLB | VM_NONLINEAR)) {
> >>>> +		pr_warning("CR: unknown VMA %#lx\n", vma->vm_flags);
> >>>> +		return -ETXTBSY;
> >>>> +	}
> >>> Hmmm.  Interesting error code for VM_HUGETLB.  :)
> >> :)  well, the usual EINVAL didn't seem suitable. Any better suggestions ?
> > 
> > -ENOSUP might be clearest for now.  "Your system call tried to do
> > something unsupported."
> 
> Are you suggesting adding a new error code ?

Dang.  What's the thing we return from system calls that are
unsupported?  Did I just dream that up?

-- Dave


  reply	other threads:[~2008-08-27 15:41 UTC|newest]

Thread overview: 107+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-21  2:58 [RFC v2][PATCH 1/9] kernel based checkpoint-restart Oren Laadan
2008-08-21  3:03 ` [RFC v2][PATCH 1/9] Create trivial sys_checkpoint/sys_restart syscalls Oren Laadan
     [not found]   ` <Pine.LNX.4.64.0808202302510.17436-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2008-08-21  5:17     ` Oren Laadan
2008-08-22 19:32     ` Dave Hansen
2008-08-22 19:32       ` Dave Hansen
2008-08-22 20:11       ` Dave Hansen
2008-08-22 20:11       ` Dave Hansen
2008-08-22 21:20       ` Oren Laadan
2008-08-22 21:20       ` Oren Laadan
2008-08-21  5:17   ` Oren Laadan
     [not found] ` <Pine.LNX.4.64.0808202251500.17436-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2008-08-21  3:03   ` Oren Laadan
2008-08-21  3:04   ` [RFC v2][PATCH 2/9] General infrastructure for checkpoint restart Oren Laadan
2008-08-21  3:04     ` Oren Laadan
     [not found]     ` <Pine.LNX.4.64.0808202303460.17436-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2008-08-21  9:35       ` Louis Rilling
2008-08-22 20:01       ` Dave Hansen
2008-08-21  9:35     ` Louis Rilling
2008-08-24  5:58       ` Oren Laadan
     [not found]       ` <20080821093529.GG581-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2008-08-24  5:58         ` Oren Laadan
2008-08-22 20:01     ` Dave Hansen
2008-08-25  2:47       ` Oren Laadan
2008-08-26 16:42         ` Dave Hansen
2008-08-27  0:38           ` Oren Laadan
2008-08-27  0:38           ` Oren Laadan
     [not found]         ` <48B21D3C.8090601-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-08-26 16:42           ` Dave Hansen
2008-08-25  2:47       ` Oren Laadan
2008-08-21  3:04   ` [RFC v2][PATCH 3/9] x86 support for checkpoint/restart Oren Laadan
2008-08-21  3:05   ` [RFC v2][PATCH 4/9] Memory management - dump state Oren Laadan
2008-08-21  3:05   ` [RFC v2][PATCH 5/9] Memory managemnet - restore state Oren Laadan
2008-08-21  3:06   ` [RFC v2][PATCH 6/9] Checkpoint/restart: initial documentation Oren Laadan
2008-08-21  3:06   ` [RFC v2][PATCH 7/9] Infrastructure for shared objects Oren Laadan
2008-08-21  3:06     ` Oren Laadan
     [not found]     ` <Pine.LNX.4.64.0808202306170.17436-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2008-08-21 10:40       ` Louis Rilling
2008-08-21 10:40     ` Louis Rilling
     [not found]       ` <20080821104016.GJ581-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2008-08-26 17:01         ` Dave Hansen
2008-08-26 17:01       ` Dave Hansen
2008-08-27  8:26         ` Louis Rilling
2008-08-27  8:26         ` Louis Rilling
2008-08-21  3:07   ` [RFC v2][PATCH 8/9] File descriprtors - dump state Oren Laadan
2008-08-21  3:07     ` Oren Laadan
2008-08-21 11:06     ` Louis Rilling
     [not found]       ` <20080821110614.GK581-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2008-08-25  3:28         ` Oren Laadan
2008-08-25  3:28       ` Oren Laadan
2008-08-25 10:30         ` Louis Rilling
     [not found]         ` <48B226CE.2060700-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-08-25 10:30           ` Louis Rilling
     [not found]     ` <Pine.LNX.4.64.0808202306530.17436-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2008-08-21 11:06       ` Louis Rilling
2008-08-21  3:07   ` [RFC v2][PATCH 9/9] File descriprtors (restore) Oren Laadan
2008-08-21  3:07     ` Oren Laadan
2008-08-21  5:26     ` Oren Laadan
     [not found]     ` <Pine.LNX.4.64.0808202307270.17436-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2008-08-21  5:26       ` Oren Laadan
2008-08-21  5:15   ` [RFC v2][PATCH 1/9] kernel based checkpoint-restart Oren Laadan
2008-08-21  3:04 ` [RFC v2][PATCH 3/9] x86 support for checkpoint/restart Oren Laadan
2008-08-22 20:17   ` Dave Hansen
     [not found]   ` <Pine.LNX.4.64.0808202304180.17436-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2008-08-22 20:17     ` Dave Hansen
2008-08-21  3:05 ` [RFC v2][PATCH 4/9] Memory management - dump state Oren Laadan
2008-08-21  7:30   ` Ingo Molnar
2008-08-21  8:01     ` Justin P. Mattock
2008-08-21 10:28     ` Balbir Singh
     [not found]       ` <48AD433E.7010905-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2008-08-21 11:59         ` Ingo Molnar
2008-08-21 11:59       ` Ingo Molnar
     [not found]     ` <20080821073020.GA28386-X9Un+BFzKDI@public.gmane.org>
2008-08-21  8:01       ` Justin P. Mattock
2008-08-21 10:28       ` Balbir Singh
     [not found]   ` <Pine.LNX.4.64.0808202304490.17436-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2008-08-21  7:30     ` Ingo Molnar
2008-08-21  9:53     ` Louis Rilling
2008-08-22 20:37     ` Dave Hansen
2008-08-21  9:53   ` Louis Rilling
     [not found]     ` <20080821095316.GH581-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2008-08-22 21:21       ` Oren Laadan
2008-08-22 21:21     ` Oren Laadan
2008-08-22 20:37   ` Dave Hansen
2008-08-24  5:40     ` Oren Laadan
2008-08-24  5:40     ` Oren Laadan
     [not found]       ` <48B0F449.2000006-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-08-26 16:33         ` Dave Hansen
2008-08-26 16:33       ` Dave Hansen
2008-08-27  0:14         ` Oren Laadan
2008-08-27  0:14         ` Oren Laadan
2008-08-27 15:41           ` Dave Hansen [this message]
2008-08-27 15:57             ` Louis Rilling
     [not found]               ` <20080827155725.GD14473-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2008-08-27 16:12                 ` Dave Hansen
2008-08-27 16:12               ` Dave Hansen
2008-08-27 16:19                 ` Jeremy Fitzhardinge
2008-08-27 16:19                 ` Jeremy Fitzhardinge
2008-08-27 15:57             ` Louis Rilling
2008-08-27 20:34             ` Serge E. Hallyn
2008-08-27 20:34             ` Serge E. Hallyn
2008-08-27 20:38               ` Dave Hansen
2008-08-27 20:48                 ` Serge E. Hallyn
2008-08-27 20:48                 ` Serge E. Hallyn
2008-08-27 20:56                   ` Dave Hansen
2008-08-31  7:16                     ` Oren Laadan
2008-08-31  7:16                     ` Oren Laadan
2008-08-31 17:34                       ` Cedric Le Goater
     [not found]                         ` <48BAD637.2050505-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2008-09-03 11:43                           ` [Devel] " Andrey Mirkin
2008-09-03 11:43                         ` Andrey Mirkin
     [not found]                           ` <200809031543.24001.amirkin-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2008-09-03 12:15                             ` Cedric Le Goater
2008-09-03 12:15                           ` Cedric Le Goater
2008-09-03 13:29                             ` Andrey Mirkin
     [not found]                             ` <48BE7FDF.3080202-NmTC/0ZBporQT0dZR+AlfA@public.gmane.org>
2008-09-03 13:29                               ` Andrey Mirkin
     [not found]                       ` <48BA454F.7050308-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-08-31 17:34                         ` Cedric Le Goater
2008-09-02 15:32                         ` Serge E. Hallyn
2008-09-02 15:32                       ` Serge E. Hallyn
     [not found]                   ` <20080827204853.GA4189-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-08-27 20:56                     ` Dave Hansen
     [not found]               ` <20080827203427.GA1158-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2008-08-27 20:38                 ` Dave Hansen
     [not found]           ` <48B49C61.1040003-eQaUEPhvms7ENvBUuze7eA@public.gmane.org>
2008-08-27 15:41             ` Dave Hansen
2008-08-21  3:05 ` [RFC v2][PATCH 5/9] Memory managemnet - restore state Oren Laadan
     [not found]   ` <Pine.LNX.4.64.0808202305210.17436-CXF6herHY6ykSYb+qCZC/1i27PF6R63G9nwVQlTi/Pw@public.gmane.org>
2008-08-21 10:07     ` Louis Rilling
2008-08-21 10:07   ` Louis Rilling
2008-08-21  3:06 ` [RFC v2][PATCH 6/9] Checkpoint/restart: initial documentation Oren Laadan
2008-08-21  5:15 ` [RFC v2][PATCH 1/9] kernel based checkpoint-restart Oren Laadan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1219851696.8680.67.camel@nimitz \
    --to=dave@linux.vnet.ibm.com \
    --cc=arnd@arndb.de \
    --cc=containers@lists.linux-foundation.org \
    --cc=jeremy@goop.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=orenl@cs.columbia.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.