Date: Thu, 29 Dec 2011 21:39:22 +0900
From: Isaku Yamahata
Message-ID: <20111229123922.GG19274@valinux.co.jp>
References: <4EFC4DF0.2040708@redhat.com>
In-Reply-To: <4EFC4DF0.2040708@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 0/2][RFC] postcopy migration: Linux char device for postcopy
To: Avi Kivity
Cc: t.hirofuchi@aist.go.jp, qemu-devel@nongnu.org, kvm@vger.kernel.org, satoshi.itoh@aist.go.jp

On Thu, Dec 29, 2011 at 01:24:32PM +0200, Avi Kivity wrote:
> On 12/29/2011 03:26 AM, Isaku Yamahata wrote:
> > This is Linux kernel driver for qemu/kvm postcopy live migration.
> > This is used by qemu/kvm postcopy live migration patch.
> >
> > TODO:
> > - Consider FUSE/CUSE option
> >   So far several mmap patches for FUSE/CUSE are floating around. (their
> >   purpose isn't different from our purpose, though). They haven't merged
> >   into the upstream yet.
> >   The driver specific part in qemu patches is modularized. So I expect it
> >   wouldn't be difficult to switch kernel driver to CUSE based driver.
>
> It would be good to get more input about this, please involve lkml and
> the FUSE/CUSE people.

Okay.

> > ioctl commands:
> >
> > UMEM_DEV_CREATE_UMEM: create umem device for qemu
> > UMEM_DEV_LIST: list created umem devices
> > UMEM_DEV_REATTACH: re-attach the created umem device
> >   UMEM_DEV_LIST and UMEM_DEV_REATTACH are used when
> >   the process that services page fault disappears or gets stuck.
> >   Then, administrator can list the umem devices and unblock
> >   the process which is waiting for page.
>
> Ah, I asked about this in my patch comments.  I think this is done
> better by using SCM_RIGHTS to pass fds along, or asking qemu to launch a
> new process.

Can you please elaborate? I don't think the approaches you are suggesting
solve the issue. Let me clarify the problem.

  process A (typically incoming qemu)
     |
     | mmap("/dev/umem") and access those pages triggering page faults
     | (the file descriptor might be closed after mmap() before page faults)
     |
     V
   /dev/umem
     ^
     |
     |
  daemon X resolving page faults triggered by process A
  (typically this daemon forked from incoming qemu:process A)

If daemon X disappears unexpectedly, nobody is left to resolve the page
faults of process A. At that point process A is blocked in a page fault,
and there is no file descriptor available that corresponds to the VMA.
There is then no way to kill process A short of a system reboot.

> Introducing a global namespace has a lot of complications attached.
>
> >
> > UMEM_GET_PAGE_REQUEST: retrieve page fault of qemu process
> > UMEM_MARK_PAGE_CACHED: mark the specified pages pulled from the source
> >                        for daemon
> >
> > UMEM_MAKE_VMA_ANONYMOUS: make the specified vma in the qemu process
> >                          anonymous.
> >                          This is _NOT_ implemented yet.
> >                          I'm not sure whether this can be implemented
> >                          or not.
>
> How do we find out?  This is fairly important, stuff like transparent
> hugepages and ksm only works on anonymous memory.

I agree that this is important.
At KVM-forum 2011, Andrea said THP and KSM work with non-anonymous VMAs.
(Or at least he'll look into that. My memory is vague, though.
Please correct me if I'm wrong.)

-- 
yamahata
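
For reference, the SCM_RIGHTS mechanism mentioned in the quoted text passes
an open file descriptor from one process to another over an AF_UNIX socket.
Below is a minimal sketch in C, assuming a connected AF_UNIX socket between
the two processes. The helper names send_fd() and recv_fd() are hypothetical
and are not part of the umem patches. Note that this only helps while some
process still holds a descriptor to pass; it does not by itself cover the
case described above, where no descriptor for the VMA is left.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send one open fd over a connected AF_UNIX socket.
 * Returns 0 on success, -1 on failure. */
int send_fd(int sock, int fd)
{
    char dummy = 'x';   /* one byte of ordinary data accompanies the fd */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;                 /* ensures proper alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    /* Fill in the ancillary message that carries the descriptor. */
    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one fd sent by send_fd().
 * Returns the new descriptor (valid in the receiver), or -1 on failure. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    /* Accept only a single SCM_RIGHTS message holding exactly one fd. */
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL ||
        cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the scenario discussed here, the process holding the umem descriptor
would call send_fd() and a replacement daemon would call recv_fd(). How that
maps onto the actual qemu and daemon processes is left open in this thread.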