Date: Thu, 21 Mar 2013 08:11:59 +0200
From: "Michael S. Tsirkin"
Message-ID: <20130321061159.GA28328@redhat.com>
In-Reply-To: <5148A52E.6020208@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [RFC PATCH RDMA support v4: 03/10] more verbose documentation of the RDMA transport
To: "Michael R. Hines"
Cc: aliguori@us.ibm.com, qemu-devel@nongnu.org, owasserm@redhat.com, abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com, Paolo Bonzini

On Tue, Mar 19, 2013 at 01:49:34PM -0400, Michael R. Hines wrote:
> I also did a test using RDMA + cgroup, and the kernel killed my QEMU :)
>
> So, infiniband is not smart enough to know how to avoid pinning a
> zero page, I guess.
>
> - Michael
>
> On 03/19/2013 01:14 PM, Paolo Bonzini wrote:
> >On 19/03/2013 18:09, Michael R. Hines wrote:
> >>Allowing QEMU to swap due to a cgroup limit during migration is a viable
> >>overcommit option?
> >>
> >>I'm trying to keep an open mind, but that would kill the migration
> >>time.....
> >Would it swap? Doesn't the kernel back all zero pages with a single
> >copy-on-write page? If that still accounts towards cgroup limits, it
> >would be a bug.
> >
> >Old kernels do not have a shared zero hugepage, and that includes some
> >distro kernels. Perhaps that's the problem.
> >
> >Paolo
>

It really shouldn't break COW if you don't request LOCAL_WRITE.
I think it's a kernel bug, and apparently it has been in the code since
the first version: the get_user_pages parameters are swapped.

I'll send a patch. If it's applied, you should also
change your code from

+                                IBV_ACCESS_LOCAL_WRITE |
+                                IBV_ACCESS_REMOTE_WRITE |
+                                IBV_ACCESS_REMOTE_READ);

to

+                                IBV_ACCESS_REMOTE_READ);

on the send side.
Then, each time we detect that a page has changed, we must make sure to
unregister and re-register it. Or, if you want to be very smart,
check whether the PFN changed and re-register only if it did.

This will make overcommit work.

-- 
MST
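
To make the "check the PFN and re-register only if it changed" idea concrete, here is a rough sketch in C. Everything in it is hypothetical: struct reg_ops, struct tracked_page, refresh_registration() and the fake_* backend are names made up for illustration, not QEMU or libibverbs API. In a real sender the reg_readonly and dereg callbacks would presumably wrap ibv_reg_mr() (with only IBV_ACCESS_REMOTE_READ) and ibv_dereg_mr(); how the PFN is actually obtained is left open here, so lookup_pfn is an assumption too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callbacks. A real backend would wrap ibv_reg_mr() and
 * ibv_dereg_mr(); the PFN lookup mechanism is an assumption. */
struct reg_ops {
    uint64_t (*lookup_pfn)(const void *addr);
    void *(*reg_readonly)(void *addr, size_t len);
    int (*dereg)(void *handle);
};

/* Hypothetical per-page bookkeeping on the send side. */
struct tracked_page {
    void *addr;
    size_t len;
    uint64_t pfn;   /* PFN recorded when the page was registered */
    void *handle;   /* NULL while the page is not registered */
};

/* Returns 1 if the page was (re)registered, 0 if the existing
 * registration is still valid, -1 on failure. */
int refresh_registration(struct tracked_page *p, const struct reg_ops *ops)
{
    assert(p != NULL && ops != NULL);

    /* Same physical page as at registration time: nothing to do. */
    if (p->handle != NULL && ops->lookup_pfn(p->addr) == p->pfn)
        return 0;

    /* The page moved (e.g. COW was broken): drop the stale pinning. */
    if (p->handle != NULL) {
        if (ops->dereg(p->handle) != 0)
            return -1;
        p->handle = NULL;
    }

    void *h = ops->reg_readonly(p->addr, p->len);
    if (h == NULL)
        return -1;

    /* Record the PFN after pinning, since registering may fault the page in. */
    p->pfn = ops->lookup_pfn(p->addr);
    p->handle = h;
    return 1;
}

/* Fake backend so the sketch can be exercised without RDMA hardware. */
uint64_t fake_pfn = 100;
int fake_reg_calls = 0;
int fake_dereg_calls = 0;
int fake_token;

uint64_t fake_lookup_pfn(const void *addr) { (void)addr; return fake_pfn; }

void *fake_reg(void *addr, size_t len)
{
    (void)addr; (void)len;
    fake_reg_calls++;
    return &fake_token;
}

int fake_dereg(void *handle) { (void)handle; fake_dereg_calls++; return 0; }

const struct reg_ops fake_ops = { fake_lookup_pfn, fake_reg, fake_dereg };
```

The simpler variant (always unregister and re-register a page once it is known to have changed) is the same function with the early "PFN unchanged" return removed; the PFN check only saves the cost of re-pinning pages whose physical backing did not move.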