From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 15 Aug 2019 12:32:38 -0400
From: Jerome Glisse
To: Jason Gunthorpe
Cc: Daniel Vetter, Michal Hocko, Andrew Morton, LKML, Linux MM,
 DRI Development, Intel Graphics Development, Peter Zijlstra,
 Ingo Molnar, David Rientjes, Christian König, Masahiro Yamada,
 Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang,
 Kees Cook, Randy Dunlap
Subject: Re: [PATCH 2/5] kernel.h: Add non_block_start/end()
Message-ID: <20190815163238.GA30781@redhat.com>
References: <20190814202027.18735-1-daniel.vetter@ffwll.ch>
 <20190814202027.18735-3-daniel.vetter@ffwll.ch>
 <20190814134558.fe659b1a9a169c0150c3e57c@linux-foundation.org>
 <20190815084429.GE9477@dhcp22.suse.cz>
 <20190815130415.GD21596@ziepe.ca>
 <20190815143759.GG21596@ziepe.ca>
 <20190815151028.GJ21596@ziepe.ca>
In-Reply-To: <20190815151028.GJ21596@ziepe.ca>

On Thu, Aug 15, 2019 at 12:10:28PM -0300, Jason Gunthorpe wrote:
> On Thu, Aug 15, 2019 at 04:43:38PM +0200, Daniel Vetter wrote:
>
> > You have to wait for the gpu to finish current processing in
> > invalidate_range_start. Otherwise there's no point to any of this
> > really. So the wait_event/dma_fence_wait are unavoidable really.
>
> I don't envy your task :|
>
> But, what you describe sure sounds like a 'registration cache' model,
> not the 'shadow pte' model of coherency.
>
> The key difference is that a registration cache is allowed to become
> incoherent with the VMAs because it holds page pins. It is a
> programming bug in userspace to change VA mappings via mmap/munmap/etc
> while the device is working on that VA, but it does not harm system
> integrity because of the page pin.
>
> The cache ensures that each initiated operation sees a DMA setup that
> matches the current VA map when the operation is initiated and allows
> expensive device DMA setups to be re-used.
>
> A 'shadow pte' model (ie hmm) *really* needs device support to
> directly block DMA access - ie trigger 'device page fault'. ie the
> invalidate_start should inform the device to enter a fault mode and
> that is it.  If the device can't do that, then the driver probably
> shouldn't pursue this level of coherency. The driver would quickly get
> into the messy locking problems like dma_fence_wait from a notifier.

I think we disagree here on the hardware requirement. For GPUs we
will always need to be able to wait for some GPU fence from inside
the notifier callback; there is just no way around that for many of
today's GPUs (I do not see any indication of that changing).

Drivers should avoid lock complexity by using a wait queue, so that
the driver's notifier callback can wait without having to hold any
driver lock. However, at least one lock will be needed to update the
internal driver state for the range being invalidated. That lock is
just the device driver page table lock for the GPU page table
associated with the mm_struct. In all GPU drivers so far it is a
short-lived lock and nothing blocking is done while holding it (it is
really just about updating the page table directory, whether filling
it or clearing it).

>
> It is important to identify what model you are going for as defining a
> 'registration cache' coherence expectation allows the driver to skip
> blocking in invalidate_range_start. All it does is invalidate the
> cache so that future operations pick up the new VA mapping.
>
> Intel's HFI RDMA driver uses this model extensively, and I think it is
> well proven, within some limitations of course.
>
> At least, 'registration cache' is the only use model I know of where
> it is acceptable to skip invalidate_range_end.

Here GPUs are not in the registration cache model. I know it might
look like it because of GUP, but GUP was used only because hmm did not
exist at the time.

Cheers,
Jérôme
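
The locking pattern described in the reply above (wait for the GPU
fence on a wait queue without holding a driver lock, then take the
short-lived page table lock only to update the range) can be roughly
sketched as follows. This is a user-space analogy built on POSIX
threads, not kernel code: every name in it (fake_fence, fake_gpu_pt,
fake_invalidate_range_start, run_demo) is hypothetical, and the
wait-then-update ordering is an assumption of the sketch rather than
something the thread specifies.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Hypothetical stand-in for a GPU fence and the wait queue it wakes. */
struct fake_fence {
    pthread_mutex_t wq_lock;   /* protects 'signaled' only */
    pthread_cond_t  wq;
    bool            signaled;
};

/* Hypothetical mirror of the GPU page table for one address space. */
struct fake_gpu_pt {
    pthread_mutex_t lock;      /* short-lived: never held while waiting */
    bool            valid[NPAGES];
};

static void fence_signal(struct fake_fence *f)
{
    pthread_mutex_lock(&f->wq_lock);
    f->signaled = true;
    pthread_cond_broadcast(&f->wq);
    pthread_mutex_unlock(&f->wq_lock);
}

static void fence_wait(struct fake_fence *f)
{
    pthread_mutex_lock(&f->wq_lock);
    while (!f->signaled)       /* loop guards against spurious wakeups */
        pthread_cond_wait(&f->wq, &f->wq_lock);
    pthread_mutex_unlock(&f->wq_lock);
}

/* Analogue of an invalidate_range_start callback for pages [start, end). */
static void fake_invalidate_range_start(struct fake_gpu_pt *pt,
                                        struct fake_fence *f,
                                        size_t start, size_t end)
{
    /* Step 1: wait for in-flight GPU work; pt->lock is NOT held here. */
    fence_wait(f);

    /* Step 2: hold the page table lock only long enough to clear entries. */
    pthread_mutex_lock(&pt->lock);
    for (size_t i = start; i < end && i < NPAGES; i++)
        pt->valid[i] = false;
    pthread_mutex_unlock(&pt->lock);
}

/* Simulated GPU: finishes its work and signals the fence. */
static void *fake_gpu_thread(void *arg)
{
    fence_signal(arg);
    return NULL;
}

/* Runs the scenario and returns how many entries are still valid. */
static int run_demo(void)
{
    struct fake_fence f;
    struct fake_gpu_pt pt;
    pthread_t gpu;
    int still_valid = 0;

    pthread_mutex_init(&f.wq_lock, NULL);
    pthread_cond_init(&f.wq, NULL);
    f.signaled = false;
    pthread_mutex_init(&pt.lock, NULL);
    for (size_t i = 0; i < NPAGES; i++)
        pt.valid[i] = true;

    pthread_create(&gpu, NULL, fake_gpu_thread, &f);
    fake_invalidate_range_start(&pt, &f, 2, 6);
    pthread_join(gpu, NULL);

    for (size_t i = 0; i < NPAGES; i++)
        still_valid += pt.valid[i];

    pthread_mutex_destroy(&pt.lock);
    pthread_cond_destroy(&f.wq);
    pthread_mutex_destroy(&f.wq_lock);
    return still_valid;
}
```

Built with -pthread, run_demo() invalidates pages 2 through 5 of an
8-entry table and returns 4. The point of the sketch is only that the
blocking wait and the page table update use separate locks, so nothing
sleeps while the page table lock is held.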