From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:40654)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ehFJI-0005Ny-Tk for qemu-devel@nongnu.org;
	Thu, 01 Feb 2018 08:53:54 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1ehFJF-0001Rd-Sx for qemu-devel@nongnu.org;
	Thu, 01 Feb 2018 08:53:53 -0500
Received: from mx1.redhat.com ([209.132.183.28]:55952)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from ) id 1ehFJF-0001Ph-LW
	for qemu-devel@nongnu.org; Thu, 01 Feb 2018 08:53:49 -0500
Date: Thu, 1 Feb 2018 11:53:40 -0200
From: Eduardo Habkost
Message-ID: <20180201135340.GU26425@localhost.localdomain>
References: <20180117095421.124787-1-marcel@redhat.com>
	<20180117095421.124787-2-marcel@redhat.com>
	<20180131204059.GG21702@localhost.localdomain>
	<20180131230607-mutt-send-email-mst@kernel.org>
	<20180131233422.GP26425@localhost.localdomain>
	<20180201040608-mutt-send-email-mst@kernel.org>
	<8dbc7c99-84f6-0023-526b-359fdf2b5162@redhat.com>
	<20180201121009.GR26425@localhost.localdomain>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Subject: Re: [Qemu-devel] [PATCH V8 1/4] mem: add share parameter to memory-backend-ram
To: Marcel Apfelbaum
Cc: "Michael S. Tsirkin" , qemu-devel@nongnu.org, cohuck@redhat.com,
	f4bug@amsat.org, yuval.shaia@oracle.com, borntraeger@de.ibm.com,
	pbonzini@redhat.com, imammedo@redhat.com

On Thu, Feb 01, 2018 at 02:29:25PM +0200, Marcel Apfelbaum wrote:
> On 01/02/2018 14:10, Eduardo Habkost wrote:
> > On Thu, Feb 01, 2018 at 07:36:50AM +0200, Marcel Apfelbaum wrote:
> >> On 01/02/2018 4:22, Michael S. Tsirkin wrote:
> >>> On Wed, Jan 31, 2018 at 09:34:22PM -0200, Eduardo Habkost wrote:
> > [...]
> >>>> BTW, what's the root cause for requiring HVAs in the buffer?
> >>>
> >>> It's a side effect of the kernel/userspace API which always wants
> >>> a single HVA/len pair to map memory for the application.
> >>>
> >>>
> >>
> >> Hi Eduardo and Michael,
> >>
> >>>> Can
> >>>> this be fixed?
> >>>
> >>> I think yes. It'd need to be a kernel patch for the RDMA subsystem
> >>> mapping an s/g list with actual memory. The HVA/len pair would then just
> >>> be used to refer to the region, without creating the two mappings.
> >>>
> >>> Something like splitting the register mr into
> >>>
> >>> mr = create mr (va/len) - allocate a handle and record the va/len
> >>>
> >>> addmemory(mr, offset, hva, len) - pin memory
> >>>
> >>> register mr - pass it to HW
> >>>
> >>> As a nice side effect we won't burn so much virtual address space.
> >>>
> >>
> >> We would still need a contiguous virtual address space range (for post-send)
> >> which we don't have since guest contiguous virtual address space
> >> will always end up as non-contiguous host virtual address space.
> >>
> >> I am not sure the RDMA HW can handle a large VA with holes.
> >
> > I'm confused. Why would the hardware see and care about virtual
> > addresses?
>
> The post-send operations bypass the kernel, and the process
> puts GVA addresses in the work request.
>
> > How exactly does the hardware translate VAs to
> > PAs?
>
> The HW maintains a page-directory-like structure, different from the MMU:
> VA -> phys pages
>
> > What if the process page tables change?
> >
>
> Since the page tables the HW uses are their own, we just need the phys
> page to be pinned.

So there's no hardware-imposed requirement that the hardware VAs
(mapped by the HW page directory) match the VAs in QEMU
address-space, right? If the RDMA API is updated to remove this
requirement, couldn't you just use the untranslated guest VAs
directly?

-- 
Eduardo
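
To make the question concrete, here is a toy Python model of the split
quoted above (create mr / addmemory / register mr) combined with a
device-private page directory. It is only a sketch under stated
assumptions: ToyMR, create_mr, add_memory, register_mr, device_translate
and the host_page_table dict are hypothetical names invented for this
illustration. None of them is the RDMA verbs API, a kernel interface, or
QEMU code. The point it shows is that the VA range recorded in the MR
(here, a guest VA) does not have to match the host VAs used when pinning.

```python
PAGE_SIZE = 4096


class ToyMR:
    """Hypothetical memory region: a device-visible VA range plus pinned frames."""

    def __init__(self, va, length):
        self.va = va
        self.length = length
        self.frames = {}          # page index inside the MR -> pinned physical frame
        self.registered = False


def create_mr(va, length):
    """Step 1 (hypothetical): allocate a handle and record va/len only."""
    if va % PAGE_SIZE or length % PAGE_SIZE or length <= 0:
        raise ValueError("va and length must be page aligned")
    return ToyMR(va, length)


def add_memory(mr, offset, hva, length, host_page_table):
    """Step 2 (hypothetical): pin host pages and attach them at an MR offset.

    host_page_table is a stand-in for the process page tables: it maps a
    host virtual page number to a physical frame number. It is consulted
    once, at pin time, and never again.
    """
    if mr.registered:
        raise RuntimeError("MR already passed to HW")
    if offset % PAGE_SIZE or hva % PAGE_SIZE or length % PAGE_SIZE:
        raise ValueError("offset, hva and length must be page aligned")
    if offset < 0 or offset + length > mr.length:
        raise ValueError("chunk falls outside the MR")
    for i in range(length // PAGE_SIZE):
        frame = host_page_table[hva // PAGE_SIZE + i]
        mr.frames[offset // PAGE_SIZE + i] = frame


def register_mr(mr):
    """Step 3 (hypothetical): refuse an MR with holes, then hand it to the HW."""
    if len(mr.frames) != mr.length // PAGE_SIZE:
        raise ValueError("MR has unbacked pages")
    mr.registered = True


def device_translate(mr, va):
    """Model of the HW lookup: MR-relative VA -> physical address."""
    if not mr.registered:
        raise RuntimeError("MR not registered")
    off = va - mr.va
    if not 0 <= off < mr.length:
        raise ValueError("VA outside the MR")
    return mr.frames[off // PAGE_SIZE] * PAGE_SIZE + off % PAGE_SIZE


# Two guest-contiguous pages backed by non-contiguous host pages.
host_page_table = {0x7F000: 11, 0x7F800: 42}
mr = create_mr(va=0x10000, length=2 * PAGE_SIZE)       # guest VA, not an HVA
add_memory(mr, 0, 0x7F000 * PAGE_SIZE, PAGE_SIZE, host_page_table)
add_memory(mr, PAGE_SIZE, 0x7F800 * PAGE_SIZE, PAGE_SIZE, host_page_table)
register_mr(mr)
```

In this model the work request would carry 0x10000-based addresses, and
the host VAs only matter while pinning. Whether real RDMA hardware and
the kernel API can be made to behave this way is exactly the open
question in this thread.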