From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jerome Glisse
Subject: Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma
Date: Wed, 30 Jan 2019 16:45:25 -0500
Message-ID: <20190130214525.GG5061@redhat.com>
References: <655a335c-ab91-d1fc-1ed3-b5f0d37c6226@deltatee.com>
 <20190130041841.GB30598@mellanox.com>
 <20190130185652.GB17080@mellanox.com>
 <20190130192234.GD5061@redhat.com>
 <20190130193759.GE17080@mellanox.com>
 <20190130201114.GB17915@mellanox.com>
 <20190130204332.GF5061@redhat.com>
 <20190130204954.GI17080@mellanox.com>
In-Reply-To: <20190130204954.GI17080@mellanox.com>
To: Jason Gunthorpe
Cc: Joerg Roedel, "Rafael J . Wysocki", Greg Kroah-Hartman,
 Felix Kuehling, "linux-kernel@vger.kernel.org",
 "dri-devel@lists.freedesktop.org", Christoph Hellwig,
 "linux-mm@kvack.org", "iommu@lists.linux-foundation.org",
 "linux-pci@vger.kernel.org", Bjorn Helgaas, Robin Murphy,
 Logan Gunthorpe, Christian Koenig, Marek Szyprowski
List-Id: iommu@lists.linux-foundation.org

On Wed, Jan 30, 2019 at 08:50:00PM +0000, Jason Gunthorpe wrote:
> On Wed, Jan 30, 2019 at 03:43:32PM -0500, Jerome Glisse wrote:
> > On Wed, Jan 30, 2019 at 08:11:19PM +0000, Jason Gunthorpe wrote:
> > > On Wed, Jan 30, 2019 at 01:00:02PM -0700, Logan Gunthorpe wrote:
> > >
> > > > We never changed SGLs. We still use them to pass p2pdma pages, only we
> > > > need to be a bit careful where we send the entire SGL. I see no reason
> > > > why we can't continue to be careful once their in userspace if there's
> > > > something in GUP to deny them.
> > > >
> > > > It would be nice to have heterogeneous SGLs and it is something we
> > > > should work toward but in practice they aren't really necessary at the
> > > > moment.
> > >
> > > RDMA generally cannot cope well with an API that requires homogeneous
> > > SGLs.. User space can construct complex MRs (particularly with the
> > > proposed SGL MR flow) and we must marshal that into a single SGL or
> > > the drivers fall apart.
> > >
> > > Jerome explained that GPU is worse, a single VMA may have a random mix
> > > of CPU or device pages..
> > >
> > > This is a pretty big blocker that would have to somehow be fixed.
> >
> > Note that HMM takes care of that RDMA ODP with my ODP to HMM patch,
> > so what you get for an ODP umem is just a list of dma address you
> > can program your device to. The aim is to avoid the driver to care
> > about that. The access policy when the UMEM object is created by
> > userspace through verbs API should however ascertain that for mmap
> > of device file it is only creating a UMEM that is fully covered by
> > one and only one vma. GPU device driver will have one vma per logical
> > GPU object.
> > I expect other kind of device do that same so that they
> > can match a vma to a unique object in their driver.
>
> A one VMA rule is not really workable.
>
> With ODP VMA boundaries can move around across the lifetime of the MR
> and we have no obvious way to fail anything if userpace puts a VMA
> boundary in the middle of an existing ODP MR address range.

This is true only for vmas that are not an mmap of a device file. This
is what I was trying to get across. An mmap of a file is never merged,
so it can only get split/butchered by munmap/mremap, but when that
happens you also need to reflect the virtual address space change to
the device, i.e. any access to a now invalid range must trigger an
error.

>
> I think the HMM mirror API really needs to deal with this for the
> driver somehow.

Yes, HMM does deal with this for you; you do not have to worry about
it. Sorry if that was not clear. I just wanted to stress that vmas that
are an mmap of a file do not behave like other vmas, hence when you
create the UMEM you can check for those if you feel the need.

Cheers,
Jérôme
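
To illustrate the check discussed above (a UMEM range that is fully
covered by one and only one vma, and what a munmap in the middle of it
does to that property), here is a minimal userspace sketch. It is a toy
model only, not kernel code and not part of the patch series: struct
toy_vma, toy_single_device_vma() and toy_punch_hole() are hypothetical
names invented for this example, and the model assumes the vmas in the
array never overlap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a vma; NOT the kernel's struct vm_area_struct. */
struct toy_vma {
	unsigned long start;	/* first address, inclusive */
	unsigned long end;	/* last address, exclusive */
	bool device_file;	/* models an mmap of a device file */
};

/*
 * Return the one device-file vma that fully covers [start, end).
 * Return NULL if the range is empty, starts in a hole, crosses a vma
 * boundary, or lies in a vma that is not a device-file mapping.
 */
static const struct toy_vma *
toy_single_device_vma(const struct toy_vma *vmas, size_t n,
		      unsigned long start, unsigned long end)
{
	if (start >= end)
		return NULL;

	for (size_t i = 0; i < n; i++) {
		const struct toy_vma *v = &vmas[i];

		if (start >= v->start && start < v->end) {
			/* Found the vma holding start; it must hold end too. */
			if (end <= v->end && v->device_file)
				return v;
			return NULL;
		}
	}
	return NULL;
}

/*
 * Model a munmap of [hole_start, hole_end) strictly inside vmas[idx]:
 * the vma is split into a head (kept at idx) and a tail (appended).
 * Returns the new vma count, or n unchanged if the hole is not strictly
 * inside the vma or the array has no room for the tail.
 */
static size_t toy_punch_hole(struct toy_vma *vmas, size_t n, size_t cap,
			     size_t idx, unsigned long hole_start,
			     unsigned long hole_end)
{
	if (idx >= n || n >= cap || hole_start >= hole_end)
		return n;

	struct toy_vma *v = &vmas[idx];

	if (hole_start <= v->start || hole_end >= v->end)
		return n;

	vmas[n] = *v;			/* tail inherits the flags */
	vmas[n].start = hole_end;
	v->end = hole_start;		/* head keeps the original start */
	return n + 1;
}
```

In this model a range that passed the check when the UMEM was created
stops passing it once a hole is punched through its vma, which is the
point at which the device mapping has to be invalidated as well.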