Date: Thu, 29 Dec 2011 22:49:20 +0900
From: Isaku Yamahata
Message-ID: <20111229134920.GH19274@valinux.co.jp>
References: <4EFC4DF0.2040708@redhat.com> <20111229123922.GG19274@valinux.co.jp> <4EFC634E.10406@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4EFC634E.10406@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 0/2][RFC] postcopy migration: Linux char device for postcopy
To: Avi Kivity
Cc: Andrea Arcangeli, t.hirofuchi@aist.go.jp, qemu-devel@nongnu.org, kvm@vger.kernel.org, satoshi.itoh@aist.go.jp

On Thu, Dec 29, 2011 at 02:55:42PM +0200, Avi Kivity wrote:
> On 12/29/2011 02:39 PM, Isaku Yamahata wrote:
> > > > ioctl commands:
> > > >
> > > > UMEM_DEV_CRATE_UMEM: create umem device for qemu
> > > > UMEM_DEV_LIST: list created umem devices
> > > > UMEM_DEV_REATTACH: re-attach the created umem device
> > > > UMEM_DEV_LIST and UMEM_DEV_REATTACH are used when
> > > > the process that services page faults disappears or gets stuck.
> > > > Then, the administrator can list the umem devices and unblock
> > > > the process which is waiting for a page.
> > >
> > > Ah, I asked about this in my patch comments. I think this is done
> > > better by using SCM_RIGHTS to pass fds along, or asking qemu to launch a
> > > new process.
> >
> > Can you please elaborate? I think the ways you are suggesting don't solve
> > the issue. Let me clarify the problem.
> >
> > process A (typically incoming qemu)
> >    |
> >    | mmap("/dev/umem") and access those pages triggering page faults
> >    | (the file descriptor might be closed after mmap() before page faults)
> >    |
> >    V
> > /dev/umem
> >    ^
> >    |
> >    |
> > daemon X resolving page faults triggered by process A
> > (typically this daemon is forked from incoming qemu: process A)
> >
> > If daemon X disappears accidentally, there is no one that resolves
> > page faults of process A. At this moment process A is blocked due to a page
> > fault. There is no file descriptor available corresponding to the VMA.
> > Here there is no way to kill process A other than a system reboot.
>
> qemu can have an extra thread that wait4()s the daemon, and relaunch
> it. This extra thread would not be blocked by the page fault. It can
> keep the fd so it isn't lost.
>
> The unkillability of process A is a security issue; it could be done on
> purpose. Is it possible to change umem to sleep with
> TASK_INTERRUPTIBLE, so it can be killed?

The issue is how to resolve the page fault, not whether to sleep with
TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE. I can think of several options.
- When daemon X is dead, all page faults are served by zero pages.
- When daemon X is dead, all page faults are resolved as VM_FAULT_SIGBUS
- list/reattach: complications. You don't like it
- other?

> > > Introducing a global namespace has a lot of complications attached.
> > >
> > > >
> > > > UMEM_GET_PAGE_REQUEST: retrieve page fault of qemu process
> > > > UMEM_MARK_PAGE_CACHED: mark the specified pages pulled from the source
> > > >                        for daemon
> > > >
> > > > UMEM_MAKE_VMA_ANONYMOUS: make the specified vma in the qemu process
> > > >                          anonymous.
> > > >                          This is _NOT_ implemented yet. I'm not sure
> > > >                          whether this can be implemented or not.
> > >
> > > How do we find out?
> > > This is fairly important, stuff like transparent
> > > hugepages and ksm only work on anonymous memory.
> >
> > I agree that this is important.
> > At KVM-forum 2011, Andrea said THP and KSM work with non-anonymous VMA.
> > (Or at least he'll look into that stuff. My memory is vague, though.
> > Please correct me if I'm wrong)
>
> += Andrea (who can also provide feedback on umem in general)
>
> --
> error compiling committee.c: too many arguments to function
>

--
yamahata
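
A minimal sketch of the wait4()-and-relaunch idea Avi suggests above, under stated assumptions. The names supervise_daemon, launch_daemon, daemon_fn and demo_daemon are hypothetical and come from neither qemu nor the umem patch. The fd here is an ordinary descriptor standing in for the umem fd; nothing opens /dev/umem. In qemu the loop would run in a dedicated thread that never touches the faulting memory, which this sketch leaves out.

```c
#define _DEFAULT_SOURCE   /* exposes wait4() on glibc */
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical entry point of the fault-resolving daemon ("daemon X").
 * It receives the fd it must service and returns an exit status. */
typedef int (*daemon_fn)(int fd);

/* Fork a child that runs the daemon; the child inherits fd from the parent.
 * Returns the child's pid, or -1 if fork() failed. */
static pid_t launch_daemon(daemon_fn daemon_main, int fd)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(daemon_main(fd));
    return pid;
}

/* Run the daemon and relaunch it whenever it dies abnormally.
 * Returns the number of relaunches once the daemon exits cleanly, or -1
 * on error or when more than max_restarts relaunches would be needed. */
int supervise_daemon(daemon_fn daemon_main, int fd, int max_restarts)
{
    int restarts = 0;
    pid_t pid;

    assert(max_restarts >= 0);
    pid = launch_daemon(daemon_main, fd);
    while (pid > 0) {
        int status;
        struct rusage usage;

        if (wait4(pid, &status, 0, &usage) < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: wait again */
            return -1;
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return restarts;        /* daemon finished its work */
        if (restarts == max_restarts)
            return -1;              /* give up; faults stay unresolved */
        restarts++;
        pid = launch_daemon(daemon_main, fd);  /* parent still holds fd */
    }
    return -1;                      /* fork() failed */
}

/* Demo stand-in for daemon X: consumes one byte from fd and "crashes" if
 * one was available, exits cleanly at end-of-file. Not a real umem server. */
int demo_daemon(int fd)
{
    char byte;
    return read(fd, &byte, 1) == 1 ? 1 : 0;
}
```

The point of the design is that the supervising side never closes the fd, so a relaunched daemon can inherit it. It does not by itself answer what should happen to faults that arrive while no daemon is running, which is the question the options listed in the reply address.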