From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jerome Glisse
To: Michal Hocko
Cc: Chris Wilson, LKML, Michal Hocko <mhocko@suse.com>,
 kvm@vger.kernel.org, Radim Krčmář <rkrcmar@redhat.com>, David Airlie,
 Sudeep Dutt, dri-devel@lists.freedesktop.org, linux-mm@kvack.org,
 Andrea Arcangeli, "David (ChunMing) Zhou", Dimitri Sivanich,
 linux-rdma@vger.kernel.org, amd-gfx@lists.freedesktop.org,
 Jason Gunthorpe, Doug Ledford, David Rientjes,
 xen-devel@lists.xenproject.org, intel-gfx@lists.freedesktop.org,
 Rodrigo Vivi, Boris Ostrovsky, Juergen Gross, Mike Marciniszyn,
 Dennis Dalessandro, Ashutosh Dixit, Alex Deucher, Paolo Bonzini
Subject: Re: [Intel-gfx] [RFC PATCH] mm, oom: distinguish blockable mode for mmu notifiers
Date: Fri, 22 Jun 2018 12:18:46 -0400
Message-ID: <20180622161845.GA3497@redhat.com>
References: <20180622150242.16558-1-mhocko@kernel.org> <152968180950.11773.3374981930722769733@mail.alporthouse.com> <20180622155716.GE10465@dhcp22.suse.cz>
In-Reply-To: <20180622155716.GE10465@dhcp22.suse.cz>

On Fri, Jun 22, 2018 at 05:57:16PM +0200, Michal Hocko wrote:
> On Fri 22-06-18 16:36:49, Chris Wilson wrote:
> > Quoting Michal Hocko (2018-06-22 16:02:42)
> > > Hi,
> > > this is an RFC and not tested at all. I am not very familiar with the
> > > mmu notifiers semantics very much so this is a crude attempt to achieve
> > > what I need basically. It might be completely wrong but I would like
> > > to discuss what would be a better way if that is the case.
> > > 
> > > get_maintainers gave me quite large list of people to CC so I had to trim
> > > it down.
> > > If you think I have forgot somebody, please let me know
> > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > index 854bd51b9478..5285df9331fa 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> > > @@ -112,10 +112,11 @@ static void del_object(struct i915_mmu_object *mo)
> > >         mo->attached = false;
> > >  }
> > >  
> > > -static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > > +static int i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > >                                                        struct mm_struct *mm,
> > >                                                        unsigned long start,
> > > -                                                      unsigned long end)
> > > +                                                      unsigned long end,
> > > +                                                      bool blockable)
> > >  {
> > >         struct i915_mmu_notifier *mn =
> > >                 container_of(_mn, struct i915_mmu_notifier, mn);
> > > @@ -124,7 +125,7 @@ static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
> > >         LIST_HEAD(cancelled);
> > >  
> > >         if (RB_EMPTY_ROOT(&mn->objects.rb_root))
> > > -               return;
> > > +               return 0;
> > 
> > The principle wait here is for the HW (even after fixing all the locks
> > to be not so coarse, we still have to wait for the HW to finish its
> > access).
> 
> Is this wait bound or it can take basically arbitrary amount of time?

An arbitrary amount of time, but in the desktop use case you can assume
that it should never go above 16ms for 60 frames-per-second rendering of
your desktop (in the GPU compute case this kind of assumption does not
hold). Is the process exit_state already updated by the time these mmu
notifier callbacks happen?

> 
> > The first pass would be then to not do anything here if
> > !blockable.
> 
> something like this? (incremental diff)

What I wanted to do with HMM and mmu notifier is split the invalidation
into two passes.
The first pass tells the drivers to stop/cancel pending jobs that depend
on the range and to invalidate internal driver state (like clearing the
buffer object pages array in the case of a GPU, but not the GPU page
table). The second callback would then do the actual wait for the GPU to
be done and update the GPU page table.

Now in this scheme, in case the task is already in some exit state and
all CPU threads are frozen/killed, we can probably find a way to do the
first pass mostly lockless. AFAICR neither AMD nor Intel allows sharing a
userptr bo, hence a uptr bo should only ever be accessed through ioctls
submitted by the process.

The second call can then be delayed and poll from time to time to see
if the GPU jobs are done.


Note that what you propose might still be useful, as in case there is
no buffer object for a range then OOM can make progress in freeing a
range of memory. It is very likely that a significant virtual address
range of a process and its backing memory can be reclaimed that way. This
assumes OOM reclaims vma by vma or at some form of granularity, like
reclaiming 1GB by 1GB. Or we could also update the blocking callback to
return the ranges that are blocking, so that OOM can reclaim around them.
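
To make the two-pass split concrete, here is a minimal user-space sketch
in C. It is only a model under stated assumptions: struct model_bo,
invalidate_pass1() and invalidate_pass2() are hypothetical names made up
for illustration, not kernel, HMM, or driver APIs, and the in-flight job
counter stands in for whatever fence or request tracking a real driver
would use.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one userptr buffer object tracked by a driver. */
struct model_bo {
	unsigned long start, end;  /* covers the range [start, end) */
	bool pages_valid;          /* driver-side pages array still populated */
	bool gpu_mapped;           /* GPU page table still maps the range */
	int gpu_jobs_in_flight;    /* jobs the hardware has not finished yet */
};

/* True if the object intersects the invalidated range [start, end). */
static bool overlaps(const struct model_bo *bo,
		     unsigned long start, unsigned long end)
{
	return bo->start < end && start < bo->end;
}

/*
 * Pass 1 (assumed semantics): never waits. It only drops driver-side
 * state so that no new job can be built on top of the range.
 */
static void invalidate_pass1(struct model_bo *bos, size_t n,
			     unsigned long start, unsigned long end)
{
	for (size_t i = 0; i < n; i++)
		if (overlaps(&bos[i], start, end))
			bos[i].pages_valid = false;
}

/*
 * Pass 2 (assumed semantics): tears down the GPU mapping of every
 * overlapping object whose jobs have completed. Returns 0 once all
 * overlapping objects are unmapped, or -EAGAIN if the hardware is still
 * busy, so the caller can retry later instead of blocking.
 */
static int invalidate_pass2(struct model_bo *bos, size_t n,
			    unsigned long start, unsigned long end)
{
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		if (!overlaps(&bos[i], start, end))
			continue;
		if (bos[i].gpu_jobs_in_flight > 0) {
			ret = -EAGAIN;  /* still in use by the hardware */
			continue;
		}
		bos[i].gpu_mapped = false;
	}
	return ret;
}
```

In this model a caller that must not block runs invalidate_pass1() right
away and then retries invalidate_pass2() from time to time until it
returns 0. Objects that do not overlap the range are never touched,
which is the case where reclaim can make progress immediately.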

Cheers,
Jérôme