From mboxrd@z Thu Jan 1 00:00:00 1970 From: Oded Gabbay Subject: Re: [PATCH v2 00/25] AMDKFD kernel driver Date: Mon, 21 Jul 2014 22:23:43 +0300 Message-ID: <53CD68BF.4020308@amd.com> References: <53C7D645.3070607@amd.com> <20140720174652.GE3068@gmail.com> <53CD0961.4070505@amd.com> <53CD17FD.3000908@vodafone.de> <53CD1FB6.1000602@amd.com> <20140721155437.GA4519@gmail.com> <53CD5122.5040804@amd.com> <20140721181433.GA5196@gmail.com> <53CD5DBC.7010301@amd.com> <20140721185940.GA5278@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Return-path: Received: from na01-by2-obe.outbound.protection.outlook.com (mail-by2lp0240.outbound.protection.outlook.com [207.46.163.240]) by gabe.freedesktop.org (Postfix) with ESMTP id 6BCA86E0E0 for ; Mon, 21 Jul 2014 12:23:55 -0700 (PDT) In-Reply-To: <20140721185940.GA5278@gmail.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Jerome Glisse Cc: Andrew Lewycky , linux-mm , =?UTF-8?B?TWljaGVsIETDpG56ZXI=?= , "linux-kernel@vger.kernel.org" , "dri-devel@lists.freedesktop.org" , Evgeny Pinchuk , Alexey Skidanov , Andrew Morton List-Id: dri-devel@lists.freedesktop.org T24gMjEvMDcvMTQgMjE6NTksIEplcm9tZSBHbGlzc2Ugd3JvdGU6Cj4gT24gTW9uLCBKdWwgMjEs IDIwMTQgYXQgMDk6MzY6NDRQTSArMDMwMCwgT2RlZCBHYWJiYXkgd3JvdGU6Cj4+IE9uIDIxLzA3 LzE0IDIxOjE0LCBKZXJvbWUgR2xpc3NlIHdyb3RlOgo+Pj4gT24gTW9uLCBKdWwgMjEsIDIwMTQg YXQgMDg6NDI6NThQTSArMDMwMCwgT2RlZCBHYWJiYXkgd3JvdGU6Cj4+Pj4gT24gMjEvMDcvMTQg MTg6NTQsIEplcm9tZSBHbGlzc2Ugd3JvdGU6Cj4+Pj4+IE9uIE1vbiwgSnVsIDIxLCAyMDE0IGF0 IDA1OjEyOjA2UE0gKzAzMDAsIE9kZWQgR2FiYmF5IHdyb3RlOgo+Pj4+Pj4gT24gMjEvMDcvMTQg MTY6MzksIENocmlzdGlhbiBLw7ZuaWcgd3JvdGU6Cj4+Pj4+Pj4gQW0gMjEuMDcuMjAxNCAxNDoz Niwgc2NocmllYiBPZGVkIEdhYmJheToKPj4+Pj4+Pj4gT24gMjAvMDcvMTQgMjA6NDYsIEplcm9t ZSBHbGlzc2Ugd3JvdGU6Cj4+Pj4+Pj4+PiBPbiBUaHUsIEp1bCAxNywgMjAxNCBhdCAwNDo1Nzoy 
NVBNICswMzAwLCBPZGVkIEdhYmJheSB3cm90ZToKPj4+Pj4+Pj4+PiBGb3Jnb3QgdG8gY2MgbWFp bGluZyBsaXN0IG9uIGNvdmVyIGxldHRlci4gU29ycnkuCj4+Pj4+Pj4+Pj4KPj4+Pj4+Pj4+PiBB cyBhIGNvbnRpbnVhdGlvbiB0byB0aGUgZXhpc3RpbmcgZGlzY3Vzc2lvbiwgaGVyZSBpcyBhIHYy IHBhdGNoIHNlcmllcwo+Pj4+Pj4+Pj4+IHJlc3RydWN0dXJlZCB3aXRoIGEgY2xlYW5lciBoaXN0 b3J5IGFuZCBubyB0b3RhbGx5LWRpZmZlcmVudC1lYXJseS12ZXJzaW9ucwo+Pj4+Pj4+Pj4+IG9m IHRoZSBjb2RlLgo+Pj4+Pj4+Pj4+Cj4+Pj4+Pj4+Pj4gSW5zdGVhZCBvZiA4MyBwYXRjaGVzLCB0 aGVyZSBhcmUgbm93IGEgdG90YWwgb2YgMjUgcGF0Y2hlcywgd2hlcmUgNSBvZiB0aGVtCj4+Pj4+ Pj4+Pj4gYXJlIG1vZGlmaWNhdGlvbnMgdG8gcmFkZW9uIGRyaXZlciBhbmQgMTggb2YgdGhlbSBp bmNsdWRlIG9ubHkgYW1ka2ZkIGNvZGUuCj4+Pj4+Pj4+Pj4gVGhlcmUgaXMgbm8gY29kZSBnb2lu ZyBhd2F5IG9yIGV2ZW4gbW9kaWZpZWQgYmV0d2VlbiBwYXRjaGVzLCBvbmx5IGFkZGVkLgo+Pj4+ Pj4+Pj4+Cj4+Pj4+Pj4+Pj4gVGhlIGRyaXZlciB3YXMgcmVuYW1lZCBmcm9tIHJhZGVvbl9rZmQg dG8gYW1ka2ZkIGFuZCBtb3ZlZCB0byByZXNpZGUgdW5kZXIKPj4+Pj4+Pj4+PiBkcm0vcmFkZW9u L2FtZGtmZC4gVGhpcyBtb3ZlIHdhcyBkb25lIHRvIGVtcGhhc2l6ZSB0aGUgZmFjdCB0aGF0IHRo aXMgZHJpdmVyCj4+Pj4+Pj4+Pj4gaXMgYW4gQU1ELW9ubHkgZHJpdmVyIGF0IHRoaXMgcG9pbnQu IEhhdmluZyBzYWlkIHRoYXQsIHdlIGRvIGZvcmVzZWUgYQo+Pj4+Pj4+Pj4+IGdlbmVyaWMgaHNh IGZyYW1ld29yayBiZWluZyBpbXBsZW1lbnRlZCBpbiB0aGUgZnV0dXJlIGFuZCBpbiB0aGF0IGNh c2UsIHdlCj4+Pj4+Pj4+Pj4gd2lsbCBhZGp1c3QgYW1ka2ZkIHRvIHdvcmsgd2l0aGluIHRoYXQg ZnJhbWV3b3JrLgo+Pj4+Pj4+Pj4+Cj4+Pj4+Pj4+Pj4gQXMgdGhlIGFtZGtmZCBkcml2ZXIgc2hv dWxkIHN1cHBvcnQgbXVsdGlwbGUgQU1EIGdmeCBkcml2ZXJzLCB3ZSB3YW50IHRvCj4+Pj4+Pj4+ Pj4ga2VlcCBpdCBhcyBhIHNlcGVyYXRlIGRyaXZlciBmcm9tIHJhZGVvbi4gVGhlcmVmb3JlLCB0 aGUgYW1ka2ZkIGNvZGUgaXMKPj4+Pj4+Pj4+PiBjb250YWluZWQgaW4gaXRzIG93biBmb2xkZXIu IFRoZSBhbWRrZmQgZm9sZGVyIHdhcyBwdXQgdW5kZXIgdGhlIHJhZGVvbgo+Pj4+Pj4+Pj4+IGZv bGRlciBiZWNhdXNlIHRoZSBvbmx5IEFNRCBnZnggZHJpdmVyIGluIHRoZSBMaW51eCBrZXJuZWwg YXQgdGhpcyBwb2ludAo+Pj4+Pj4+Pj4+IGlzIHRoZSByYWRlb24gZHJpdmVyLiBIYXZpbmcgc2Fp ZCB0aGF0LCB3ZSB3aWxsIHByb2JhYmx5IG5lZWQgdG8gbW92ZSBpdAo+Pj4+Pj4+Pj4+IChtYXli 
ZSB0byBiZSBkaXJlY3RseSB1bmRlciBkcm0pIGFmdGVyIHdlIGludGVncmF0ZSB3aXRoIGFkZGl0 aW9uYWwgQU1EIGdmeAo+Pj4+Pj4+Pj4+IGRyaXZlcnMuCj4+Pj4+Pj4+Pj4KPj4+Pj4+Pj4+PiBG b3IgcGVvcGxlIHdobyBsaWtlIHRvIHJldmlldyB1c2luZyBnaXQsIHRoZSB2MiBwYXRjaCBzZXQg aXMgbG9jYXRlZCBhdDoKPj4+Pj4+Pj4+PiBodHRwOi8vY2dpdC5mcmVlZGVza3RvcC5vcmcvfmdh YmJheW8vbGludXgvbG9nLz9oPWtmZC1uZXh0LTMuMTctdjIKPj4+Pj4+Pj4+Pgo+Pj4+Pj4+Pj4+ IFdyaXR0ZW4gYnkgT2RlZCBHYWJiYXloIDxvZGVkLmdhYmJheUBhbWQuY29tPgo+Pj4+Pj4+Pj4K Pj4+Pj4+Pj4+IFNvIHF1aWNrIGNvbW1lbnRzIGJlZm9yZSBpIGZpbmlzaCBnb2luZyBvdmVyIGFs bCBwYXRjaGVzLiBUaGVyZSBpcyBtYW55Cj4+Pj4+Pj4+PiB0aGluZ3MgdGhhdCBuZWVkIG1vcmUg ZG9jdW1lbnRhdGlvbiBlc3BhY2lhbHkgYXMgb2YgcmlnaHQgbm93IHRoZXJlIGlzCj4+Pj4+Pj4+ PiBubyB1c2Vyc3BhY2UgaSBjYW4gZ28gbG9vayBhdC4KPj4+Pj4+Pj4gU28gcXVpY2sgY29tbWVu dHMgb24gc29tZSBvZiB5b3VyIHF1ZXN0aW9ucyBidXQgZmlyc3Qgb2YgYWxsLCB0aGFua3MgZm9y IHRoZQo+Pj4+Pj4+PiB0aW1lIHlvdSBkZWRpY2F0ZWQgdG8gcmV2aWV3IHRoZSBjb2RlLgo+Pj4+ Pj4+Pj4KPj4+Pj4+Pj4+IFRoZXJlIGZldyBzaG93IHN0b3BwZXIsIGJpZ2dlc3Qgb25lIGlzIGdw dSBtZW1vcnkgcGlubmluZyB0aGlzIGlzIGEgYmlnCj4+Pj4+Pj4+PiBubywgdGhhdCB3b3VsZCBu ZWVkIHNlcmlvdXMgYXJndW1lbnRzIGZvciBhbnkgaG9wZSBvZiBjb252aW5jaW5nIG1lIG9uCj4+ Pj4+Pj4+PiB0aGF0IHNpZGUuCj4+Pj4+Pj4+IFdlIG9ubHkgZG8gZ3B1IG1lbW9yeSBwaW5uaW5n IGZvciBrZXJuZWwgb2JqZWN0cy4gVGhlcmUgYXJlIG5vIHVzZXJzcGFjZQo+Pj4+Pj4+PiBvYmpl Y3RzIHRoYXQgYXJlIHBpbm5lZCBvbiB0aGUgZ3B1IG1lbW9yeSBpbiBvdXIgZHJpdmVyLiBJZiB0 aGF0IGlzIHRoZSBjYXNlLAo+Pj4+Pj4+PiBpcyBpdCBzdGlsbCBhIHNob3cgc3RvcHBlciA/Cj4+ Pj4+Pj4+Cj4+Pj4+Pj4+IFRoZSBrZXJuZWwgb2JqZWN0cyBhcmU6Cj4+Pj4+Pj4+IC0gcGlwZWxp bmVzICg0IHBlciBkZXZpY2UpCj4+Pj4+Pj4+IC0gbXFkIHBlciBoaXEgKG9ubHkgMSBwZXIgZGV2 aWNlKQo+Pj4+Pj4+PiAtIG1xZCBwZXIgdXNlcnNwYWNlIHF1ZXVlLiBPbiBLViwgd2Ugc3VwcG9y dCB1cCB0byAxSyBxdWV1ZXMgcGVyIHByb2Nlc3MsIGZvcgo+Pj4+Pj4+PiBhIHRvdGFsIG9mIDUx MksgcXVldWVzLiBFYWNoIG1xZCBpcyAxNTEgYnl0ZXMsIGJ1dCB0aGUgYWxsb2NhdGlvbiBpcyBk b25lIGluCj4+Pj4+Pj4+IDI1NiBhbGlnbm1lbnQuIFNvIHRvdGFsICpwb3NzaWJsZSogbWVtb3J5 
IGlzIDEyOE1CCj4+Pj4+Pj4+IC0ga2VybmVsIHF1ZXVlIChvbmx5IDEgcGVyIGRldmljZSkKPj4+ Pj4+Pj4gLSBmZW5jZSBhZGRyZXNzIGZvciBrZXJuZWwgcXVldWUKPj4+Pj4+Pj4gLSBydW5saXN0 cyBmb3IgdGhlIENQICgxIG9yIDIgcGVyIGRldmljZSkKPj4+Pj4+Pgo+Pj4+Pj4+IFRoZSBtYWlu IHF1ZXN0aW9ucyBoZXJlIGFyZSBpZiBpdCdzIGF2b2lkIGFibGUgdG8gcGluIGRvd24gdGhlIG1l bW9yeSBhbmQgaWYgdGhlCj4+Pj4+Pj4gbWVtb3J5IGlzIHBpbm5lZCBkb3duIGF0IGRyaXZlciBs b2FkLCBieSByZXF1ZXN0IGZyb20gdXNlcnNwYWNlIG9yIGJ5IGFueXRoaW5nCj4+Pj4+Pj4gZWxz ZS4KPj4+Pj4+Pgo+Pj4+Pj4+IEFzIGZhciBhcyBJIGNhbiBzZWUgb25seSB0aGUgIm1xZCBwZXIg dXNlcnNwYWNlIHF1ZXVlIiBtaWdodCBiZSBhIGJpdAo+Pj4+Pj4+IHF1ZXN0aW9uYWJsZSwgZXZl cnl0aGluZyBlbHNlIHNvdW5kcyByZWFzb25hYmxlLgo+Pj4+Pj4+Cj4+Pj4+Pj4gQ2hyaXN0aWFu Lgo+Pj4+Pj4KPj4+Pj4+IE1vc3Qgb2YgdGhlIHBpbiBkb3ducyBhcmUgZG9uZSBvbiBkZXZpY2Ug aW5pdGlhbGl6YXRpb24uCj4+Pj4+PiBUaGUgIm1xZCBwZXIgdXNlcnNwYWNlIiBpcyBkb25lIHBl ciB1c2Vyc3BhY2UgcXVldWUgY3JlYXRpb24uIEhvd2V2ZXIsIGFzIEkKPj4+Pj4+IHNhaWQsIGl0 IGhhcyBhbiB1cHBlciBsaW1pdCBvZiAxMjhNQiBvbiBLViwgYW5kIGNvbnNpZGVyaW5nIHRoZSAy RyBsb2NhbAo+Pj4+Pj4gbWVtb3J5LCBJIHRoaW5rIGl0IGlzIE9LLgo+Pj4+Pj4gVGhlIHJ1bmxp c3RzIGFyZSBhbHNvIGRvbmUgb24gdXNlcnNwYWNlIHF1ZXVlIGNyZWF0aW9uL2RlbGV0aW9uLCBi dXQgd2Ugb25seQo+Pj4+Pj4gaGF2ZSAxIG9yIDIgcnVubGlzdHMgcGVyIGRldmljZSwgc28gaXQg aXMgbm90IHRoYXQgYmFkLgo+Pj4+Pgo+Pj4+PiAyRyBsb2NhbCBtZW1vcnkgPyBZb3UgY2FuIG5v dCBhc3N1bWUgYW55dGhpbmcgb24gdXNlcnNpZGUgY29uZmlndXJhdGlvbiBzb21lCj4+Pj4+IG9u ZSBtaWdodCBidWlsZCBhbiBoc2EgY29tcHV0ZXIgd2l0aCA1MTJNIGFuZCBzdGlsbCBleHBlY3Qg YSBmdW5jdGlvbmluZwo+Pj4+PiBkZXNrdG9wLgo+Pj4+IEZpcnN0IG9mIGFsbCwgSSdtIG9ubHkg Y29uc2lkZXJpbmcgS2F2ZXJpIGNvbXB1dGVyLCBub3QgImhzYSIgY29tcHV0ZXIuCj4+Pj4gU2Vj b25kLCBJIHdvdWxkIGltYWdpbmUgd2UgY2FuIGJ1aWxkIHNvbWUgcHJvdGVjdGlvbiBhcm91bmQg aXQsIGxpa2UKPj4+PiBjaGVja2luZyB0b3RhbCBsb2NhbCBtZW1vcnkgYW5kIGxpbWl0IG51bWJl ciBvZiBxdWV1ZXMgYmFzZWQgb24gc29tZQo+Pj4+IHBlcmNlbnRhZ2Ugb2YgdGhhdCB0b3RhbCBs b2NhbCBtZW1vcnkuIFNvLCBpZiBzb21lb25lIHdpbGwgaGF2ZSBvbmx5Cj4+Pj4gNTEyTSwgaGUg 
d2lsbCBiZSBhYmxlIHRvIG9wZW4gbGVzcyBxdWV1ZXMuCj4+Pj4KPj4+Pgo+Pj4+Pgo+Pj4+PiBJ IG5lZWQgdG8gZ28gbG9vayBpbnRvIHdoYXQgYWxsIHRoaXMgbXFkIGlzIGZvciwgd2hhdCBpdCBk b2VzIGFuZCB3aGF0IGl0IGlzCj4+Pj4+IGFib3V0LiBCdXQgcGlubmluZyBpcyByZWFsbHkgYmFk IGFuZCB0aGlzIGlzIGFuIGlzc3VlIHdpdGggdXNlcnNwYWNlIGNvbW1hbmQKPj4+Pj4gc2NoZWR1 bGluZyBhbiBpc3N1ZSB0aGF0IG9idmlvdXNseSBBTUQgZmFpbHMgdG8gdGFrZSBpbnRvIGFjY291 bnQgaW4gZGVzaWduCj4+Pj4+IHBoYXNlLgo+Pj4+IE1heWJlLCBidXQgdGhhdCBpcyB0aGUgSC9X IGRlc2lnbiBub24tdGhlLWxlc3MuIFdlIGNhbid0IHZlcnkgd2VsbAo+Pj4+IGNoYW5nZSB0aGUg SC9XLgo+Pj4KPj4+IFlvdSBjYW4gbm90IGNoYW5nZSB0aGUgaGFyZHdhcmUgYnV0IGl0IGlzIG5v dCBhbiBleGN1c2UgdG8gYWxsb3cgYmFkIGRlc2lnbiB0bwo+Pj4gc25lYWsgaW4gc29mdHdhcmUg dG8gd29yayBhcm91bmQgdGhhdC4gU28gaSB3b3VsZCByYXRoZXIgcGVuYWxpemUgYmFkIGhhcmR3 YXJlCj4+PiBkZXNpZ24gYW5kIGhhdmUgY29tbWFuZCBzdWJtaXNzaW9uIGluIHRoZSBrZXJuZWws IHVudGlsIEFNRCBmaXggaXRzIGhhcmR3YXJlIHRvCj4+PiBhbGxvdyBwcm9wZXIgc2NoZWR1bGlu ZyBieSB0aGUga2VybmVsIGFuZCBwcm9wZXIgY29udHJvbCBieSB0aGUga2VybmVsLiAKPj4gSSdt IHNvcnJ5IGJ1dCBJIGRvICpub3QqIHRoaW5rIHRoaXMgaXMgYSBiYWQgZGVzaWduLiBTL1cgc2No ZWR1bGluZyBpbgo+PiB0aGUga2VybmVsIGNhbiBub3QsIElNTywgc2NhbGUgd2VsbCB0byAxMDBL IHF1ZXVlcyBhbmQgMTBLIHByb2Nlc3Nlcy4KPiAKPiBJIGFtIG5vdCBhZHZvY2F0aW5nIGZvciBo YXZpbmcga2VybmVsIGRlY2lkZSBkb3duIHRvIHRoZSB2ZXJ5IGxhc3QgZGV0YWlscy4gSSBhbQo+ IGFkdm9jYXRpbmcgZm9yIGtlcm5lbCBiZWluZyBhYmxlIHRvIHByZWVtcHQgYXQgYW55IHRpbWUg YW5kIGJlIGFibGUgdG8gZGVjcmVhc2UKPiBvciBpbmNyZWFzZSB1c2VyIHF1ZXVlIHByaW9yaXR5 IHNvIG92ZXJhbGwga2VybmVsIGlzIGluIGNoYXJnZSBvZiByZXNvdXJjZXMKPiBtYW5hZ2VtZW50 IGFuZCBpdCBjYW4gaGFuZGxlIHJvZ3VlIGNsaWVudCBpbiBwcm9wZXIgZmFzaGlvbi4KPiAKPj4K Pj4+IEJlY2F1c2UgcmVhbGx5IHdoZXJlIHdlIHdhbnQgdG8gZ28gaXMgaGF2aW5nIEdQVSBjbG9z ZXIgdG8gYSBDUFUgaW4gdGVybSBvZiBzY2hlZHVsaW5nCj4+PiBjYXBhY2l0eSBhbmQgb25jZSB3 ZSBnZXQgdGhlcmUgd2Ugd2FudCB0aGUga2VybmVsIHRvIGFsd2F5cyBiZSBhYmxlIHRvIHRha2Ug b3Zlcgo+Pj4gYW5kIGRvIHdoYXRldmVyIGl0IHdhbnRzIGJlaGluZCBwcm9jZXNzIGJhY2suCj4+ 
IFdobyBkbyB5b3UgcmVmZXIgdG8gd2hlbiB5b3Ugc2F5ICJ3ZSIgPyBBRkFJSywgdGhlIGh3IHNj aGVkdWxpbmcKPj4gZGlyZWN0aW9uIGlzIHdoZXJlIEFNRCBpcyBub3cgYW5kIHdoZXJlIGl0IGlz IGhlYWRpbmcgaW4gdGhlIGZ1dHVyZS4KPj4gVGhhdCBkb2Vzbid0IHByZWNsdWRlIHRoZSBvcHRp b24gdG8gYWxsb3cgdGhlIGtlcm5lbCB0byB0YWtlIG92ZXIgYW5kIGRvCj4+IHdoYXQgaGUgd2Fu dHMuIEkgYWdyZWUgdGhhdCBpbiBLViB3ZSBoYXZlIGEgcHJvYmxlbSB3aGVyZSB3ZSBjYW4ndCBk byBhCj4+IG1pZC13YXZlIHByZWVtcHRpb24sIHNvIHRoZW9yZXRpY2FsbHksIGEgbG9uZyBydW5u aW5nIGNvbXB1dGUga2VybmVsIGNhbgo+PiBtYWtlIHRoaW5ncyBtZXNzeSwgYnV0IGluIENhcnJp em8sIHdlIHdpbGwgaGF2ZSB0aGlzIGFiaWxpdHkuIEhhdmluZwo+PiBzYWlkIHRoYXQsIGl0IHdp bGwgb25seSBiZSB0aHJvdWdoIHRoZSBDUCBIL1cgc2NoZWR1bGluZy4gU28gQU1EIGlzCj4+IF9u b3RfIGdvaW5nIHRvIGFiYW5kb24gSC9XIHNjaGVkdWxpbmcuIFlvdSBjYW4gZGlzbGlrZSBpdCwg YnV0IHRoaXMgaXMKPj4gdGhlIHNpdHVhdGlvbi4KPiAKPiBXZSB3YXMgZm9yIHRoZSBvdmVyYWxs IExpbnV4IGNvbW11bml0eSBidXQgbWF5YmUgaSBzaG91bGQgbm90IHByZXRlbmQgdG8gdGFsawo+ IGZvciBhbnlvbmUgaW50ZXJlc3RlZCBpbiBoYXZpbmcgYSBjb21tb24gc3RhbmRhcmQuCj4gCj4g TXkgcG9pbnQgaXMgdGhhdCBjdXJyZW50IGhhcmR3YXJlIGRvIG5vdCBoYXZlIGFwcHJvcmlhdGUg aGFyZHdhcmUgc3VwcG9ydCBmb3IKPiBwcmVlbXB0aW9uIGhlbmNlLCBjdXJyZW50IGhhcmR3YXJl IHNob3VsZCB1c2UgaW9jdGwgdG8gc2NoZWR1bGUgam9iIGFuZCBBTUQKPiBzaG91bGQgdGhpbmsg YSBiaXQgbW9yZSBvbiBjb21taXRpbmcgdG8gYSBkZXNpZ24gYW5kIGhhbmR3YXZpbmcgYW55IGhh cmR3YXJlCj4gc2hvcnQgY29taW5nIGFzIHNvbWV0aGluZyB0aGF0IGNhbiBiZSB3b3JrIGFyb3Vu ZCBpbiB0aGUgc29mdHdhcmUuIFRoZSBwaW5uaW5nCj4gdGhpbmcgaXMgYnJva2VuIGJ5IGRlc2ln biwgb25seSB3YXkgdG8gd29yayBhcm91bmQgaXQgaXMgdGhyb3VnaCBrZXJuZWwgY21kCj4gcXVl dWUgc2NoZWR1bGluZyB0aGF0J3MgYSBmYWN0LgoKPiAKPiBPbmNlIGhhcmR3YXJlIHN1cHBvcnQg cHJvcGVyIHByZWVtcHRpb24gYW5kIGFsbG93cyB0byBtb3ZlIGFyb3VuZC9ldmljdCBidWZmZXIK PiB1c2Ugb24gYmVoYWxmIG9mIHVzZXJzcGFjZSBjb21tYW5kIHF1ZXVlIHRoZW4gd2UgY2FuIGFs bG93IHVzZXJzcGFjZSBzY2hlZHVsaW5nCj4gYnV0IHVudGlsIHRoZW4gbXkgcGVyc29ubmFsIG9w aW5pb24gaXMgdGhhdCBpdCBzaG91bGQgbm90IGJlIGFsbG93ZWQgYW5kIHRoYXQKPiBwZW9wbGUg 
d2lsbCBoYXZlIHRvIHBheSB0aGUgaW9jdGwgcHJpY2Ugd2hpY2ggaSBwcm92ZWQgdG8gYmUgc21h bGwsIGJlY2F1c2UKPiByZWFsbHkgaWYgeW91IDEwMEsgcXVldWUgZWFjaCB3aXRoIG9uZSBqb2Is IGkgd291bGQgbm90IGV4cGVjdCB0aGF0IGFsbCB0aG9zZQo+IDEwMEsgam9iIHdpbGwgY29tcGxl dGUgaW4gbGVzcyB0aW1lIHRoYW4gaXQgdGFrZXMgdG8gZXhlY3V0ZSBhbiBpb2N0bCBpZSBieQo+ IGV2ZW4gaWYgeW91IGRvIG5vdCBoYXZlIHRoZSBpb2N0bCBkZWxheSB3aGF0IGV2ZXIgeW91IHNj aGVkdWxlIHdpbGwgaGF2ZSB0bwo+IHdhaXQgb24gcHJldmlvdXNseSBzdWJtaXRlZCBqb2JzLgoK QnV0IEplcm9tZSwgdGhlIGNvcmUgcHJvYmxlbSBzdGlsbCByZW1haW5zIGluIGVmZmVjdCwgZXZl biB3aXRoIHlvdXIKc3VnZ2VzdGlvbi4gSWYgYW4gYXBwbGljYXRpb24sIGVpdGhlciB2aWEgdXNl cnNwYWNlIHF1ZXVlIG9yIHZpYSBpb2N0bCwKc3VibWl0cyBhIGxvbmctcnVubmluZyBrZXJuZWws IHRoYW4gdGhlIENQVSBpbiBnZW5lcmFsIGNhbid0IHN0b3AgdGhlCkdQVSBmcm9tIHJ1bm5pbmcg aXQuIEFuZCBpZiB0aGF0IGtlcm5lbCBkb2VzIHdoaWxlKDEpOyB0aGFuIHRoYXQncyBpdCwKZ2Ft ZSdzIG92ZXIsIGFuZCBubyBtYXR0ZXIgaG93IHlvdSBzdWJtaXR0ZWQgdGhlIHdvcmsuIFNvIEkg ZG9uJ3QgcmVhbGx5CnNlZSB0aGUgYmlnIGFkdmFudGFnZSBpbiB5b3VyIHByb3Bvc2FsLiBPbmx5 IGluIENaIHdlIGNhbiBzdG9wIHRoaXMgd2F2ZQooYnkgQ1AgSC9XIHNjaGVkdWxpbmcgb25seSku IFdoYXQgYXJlIHlvdSBzYXlpbmcgaXMgYmFzaWNhbGx5IEkgd29uJ3QKYWxsb3cgcGVvcGxlIHRv IHVzZSBjb21wdXRlIG9uIExpbnV4IEtWIHN5c3RlbSBiZWNhdXNlIGl0IF9tYXlfIGdldCB0aGUK c3lzdGVtIHN0dWNrLgoKU28gZXZlbiBpZiBJIHJlYWxseSB3YW50ZWQgdG8sIGFuZCBJIG1heSBh Z3JlZSB3aXRoIHlvdSB0aGVvcmV0aWNhbGx5IG9uCnRoYXQsIEkgY2FuJ3QgZnVsZmlsbCB5b3Vy IGRlc2lyZSB0byBtYWtlIHRoZSAia2VybmVsIGJlaW5nIGFibGUgdG8KcHJlZW1wdCBhdCBhbnkg dGltZSBhbmQgYmUgYWJsZSB0byBkZWNyZWFzZSBvciBpbmNyZWFzZSB1c2VyIHF1ZXVlCnByaW9y aXR5IHNvIG92ZXJhbGwga2VybmVsIGlzIGluIGNoYXJnZSBvZiByZXNvdXJjZXMgbWFuYWdlbWVu dCBhbmQgaXQKY2FuIGhhbmRsZSByb2d1ZSBjbGllbnQgaW4gcHJvcGVyIGZhc2hpb24iLiBOb3Qg aW4gS1YsIGFuZCBJIGd1ZXNzIG5vdAppbiBDWiBhcyB3ZWxsLgoKCU9kZWQKCj4gCj4+Pgo+Pj4+ Pj4+Cj4+Pj4+Pj4+Pgo+Pj4+Pj4+Pj4gSXQgbWlnaHQgYmUgYmV0dGVyIHRvIGFkZCBhIGRyaXZl cnMvZ3B1L2RybS9hbWQgZGlyZWN0b3J5IGFuZCBhZGQgY29tbW9uCj4+Pj4+Pj4+PiBzdHVmZiB0 
aGVyZS4KPj4+Pj4+Pj4+Cj4+Pj4+Pj4+PiBHaXZlbiB0aGF0IHRoaXMgaXMgbm90IGludGVuZGVk IHRvIGJlIGZpbmFsIEhTQSBhcGkgQUZBSUNUIHRoZW4gaSB3b3VsZAo+Pj4+Pj4+Pj4gc2F5IHRo aXMgZmFyIGJldHRlciB0byBhdm9pZCB0aGUgd2hvbGUga2ZkIG1vZHVsZSBhbmQgYWRkIGlvY3Rs IHRvIHJhZGVvbi4KPj4+Pj4+Pj4+IFRoaXMgd291bGQgYXZvaWQgY3JhenkgY29tbXVuaWNhdGlv biBidHcgcmFkZW9uIGFuZCBrZmQuCj4+Pj4+Pj4+Pgo+Pj4+Pj4+Pj4gVGhlIHdob2xlIGFwZXJ0 dXJlIGJ1c2luZXNzIG5lZWRzIHNvbWUgc2VyaW91cyBleHBsYW5hdGlvbi4gRXNwZWNpYWx5IGFz Cj4+Pj4+Pj4+PiB5b3Ugd2FudCB0byB1c2UgdXNlcnNwYWNlIGFkZHJlc3MgdGhlcmUgaXMgbm90 aGluZyB0byBwcmV2ZW50IHVzZXJzcGFjZQo+Pj4+Pj4+Pj4gcHJvZ3JhbSBmcm9tIGFsbG9jYXRp bmcgdGhpbmdzIGF0IGFkZHJlc3MgeW91IHJlc2VydmUgZm9yIGxkcywgc2NyYXRjaCwKPj4+Pj4+ Pj4+IC4uLiBvbmx5IHNhbmUgd2F5IHdvdWxkIGJlIHRvIG1vdmUgdGhvc2UgbGRzLCBzY3JhdGNo IGluc2lkZSB0aGUgdmlydHVhbAo+Pj4+Pj4+Pj4gYWRkcmVzcyByZXNlcnZlZCBmb3Iga2VybmVs IChzZWUga2VybmVsIG1lbW9yeSBtYXApLgo+Pj4+Pj4+Pj4KPj4+Pj4+Pj4+IFRoZSB3aG9sZSBi dXNpbmVzcyBvZiBsb2NraW5nIHBlcmZvcm1hbmNlIGNvdW50ZXIgZm9yIGV4Y2x1c2l2ZSBwZXIg cHJvY2Vzcwo+Pj4+Pj4+Pj4gYWNjZXNzIGlzIGEgYmlnIE5PLiBXaGljaCBsZWFkcyBtZSB0byB0 aGUgcXVlc3Rpb25hYmxlIHVzZWZ1bGxuZXNzIG9mIHVzZXIKPj4+Pj4+Pj4+IHNwYWNlIGNvbW1h bmQgcmluZy4KPj4+Pj4+Pj4gVGhhdCdzIGxpa2Ugc2F5aW5nOiAiV2hpY2ggbGVhZHMgbWUgdG8g dGhlIHF1ZXN0aW9uYWJsZSB1c2VmdWxuZXNzIG9mIEhTQSIuIEkKPj4+Pj4+Pj4gZmluZCBpdCBh bmFsb2dvdXMgdG8gYSBzaXR1YXRpb24gd2hlcmUgYSBuZXR3b3JrIG1haW50YWluZXIgbmFja2lu ZyBhIGRyaXZlcgo+Pj4+Pj4+PiBmb3IgYSBuZXR3b3JrIGNhcmQsIHdoaWNoIGlzIHNsb3dlciB0 aGFuIGEgZGlmZmVyZW50IG5ldHdvcmsgY2FyZC4gRG9lc24ndAo+Pj4+Pj4+PiBzZWVtIHJlYXNv bmFibGUgdGhpcyBzaXR1YXRpb24gaXMgd291bGQgaGFwcGVuLiBIZSB3b3VsZCBzdGlsbCBwdXQg Ym90aCB0aGUKPj4+Pj4+Pj4gZHJpdmVycyBpbiB0aGUga2VybmVsIGJlY2F1c2UgcGVvcGxlIHdh bnQgdG8gdXNlIHRoZSBIL1cgYW5kIGl0cyBmZWF0dXJlcy4gU28sCj4+Pj4+Pj4+IEkgZG9uJ3Qg dGhpbmsgdGhpcyBpcyBhIHZhbGlkIHJlYXNvbiB0byBOQUNLIHRoZSBkcml2ZXIuCj4+Pj4+Cj4+ Pj4+IExldCBtZSByZXBocmFzZSwgZHJvcCB0aGUgdGhlIHBlcmZvcm1hbmNlIGNvdW50ZXIgaW9j 
dGwgYW5kIG1vZHVsbyBtZW1vcnkgcGlubmluZwo+Pj4+PiBpIHNlZSBubyBvYmplY3Rpb24uIElu IG90aGVyIHdvcmQsIGkgYW0gbm90IE5BQ0tJTkcgd2hvbGUgcGF0Y2hzZXQgaSBhbSBOQUNLSU5H Cj4+Pj4+IHRoZSBwZXJmb3JtYW5jZSBpb2N0bC4KPj4+Pj4KPj4+Pj4gQWdhaW4gdGhpcyBpcyBh bm90aGVyIGFyZ3VtZW50IGZvciByb3VuZCB0cmlwIHRvIHRoZSBrZXJuZWwuIEFzIGluc2lkZSBr ZXJuZWwgeW91Cj4+Pj4+IGNvdWxkIHByb3Blcmx5IGRvIGV4Y2x1c2l2ZSBncHUgY291bnRlciBh Y2Nlc3MgYWNjcm9zcyBzaW5nbGUgdXNlciBjbWQgYnVmZmVyCj4+Pj4+IGV4ZWN1dGlvbi4KPj4+ Pj4KPj4+Pj4+Pj4KPj4+Pj4+Pj4+IEkgb25seSBzZWUgaXNzdWVzIHdpdGggdGhhdC4gRmlyc3Qg YW5kIGZvcmVtb3N0IGkgd291bGQKPj4+Pj4+Pj4+IG5lZWQgdG8gc2VlIHNvbGlkIGZpZ3VyZXMg dGhhdCBrZXJuZWwgaW9jdGwgb3Igc3lzY2FsbCBoYXMgYSBoaWdoZXIgYW4KPj4+Pj4+Pj4+IG92 ZXJoZWFkIHRoYXQgaXMgbWVhc3VyYWJsZSBpbiBhbnkgbWVhbmluZyBmdWxsIHdheSBhZ2FpbnN0 IGEgc2ltcGxlCj4+Pj4+Pj4+PiBmdW5jdGlvbiBjYWxsLiBJIGtub3cgdGhlIHVzZXJzcGFjZSBj b21tYW5kIHJpbmcgaXMgYSBiaWcgbWFya2V0aW5nIGZlYXR1cmVzCj4+Pj4+Pj4+PiB0aGF0IHBs ZWFzZSBpZ25vcmFudCB1c2Vyc3BhY2UgcHJvZ3JhbW1lci4gQnV0IHJlYWxseSB0aGlzIG9ubHkg YnJpbmdzIGlzc3Vlcwo+Pj4+Pj4+Pj4gYW5kIGZvciBhYnNvbHV0ZWx5IG5vdCB1cHNpZGUgYWZh aWN0Lgo+Pj4+Pj4+PiBSZWFsbHkgPyBZb3UgdGhpbmsgdGhhdCBkb2luZyBhIGNvbnRleHQgc3dp dGNoIHRvIGtlcm5lbCBzcGFjZSwgd2l0aCBhbGwgaXRzCj4+Pj4+Pj4+IG92ZXJoZWFkLCBpcyBf bm90XyBtb3JlIGV4cGFuc2l2ZSB0aGFuIGp1c3QgY2FsbGluZyBhIGZ1bmN0aW9uIGluIHVzZXJz cGFjZQo+Pj4+Pj4+PiB3aGljaCBvbmx5IHB1dHMgYSBidWZmZXIgb24gYSByaW5nIGFuZCB3cml0 ZXMgYSBkb29yYmVsbCA/Cj4+Pj4+Cj4+Pj4+IEkgYW0gc2F5aW5nIHRoZSBvdmVyaGVhZCBpcyBu b3QgdGhhdCBiaWcgYW5kIGl0IHByb2JhYmx5IHdpbGwgbm90IG1hdHRlciBpbiBtb3N0Cj4+Pj4+ IHVzZWNhc2UuIEZvciBpbnN0YW5jZSBpIGRpZCB3cm90ZSB0aGUgbW9zdCB1c2VsZXNzIGtlcm5l bCBtb2R1bGUgdGhhdCBhZGQgdHdvCj4+Pj4+IG51bWJlciB0aHJvdWdoIGFuIGlvY3RsIChodHRw Oi8vcGVvcGxlLmZyZWVkZXNrdG9wLm9yZy9+Z2xpc3NlL2FkZGVyLnRhcikgYW5kCj4+Pj4+IGl0 IHRha2VzIH4wLjM1bWljcm9zZWNvbmRzIHdpdGggaW9jdGwgd2hpbGUgZnVuY3Rpb24gaXMgfjAu MDI1bWljcm9zZWNvbmRzIHNvCj4+Pj4+IGlvY3RsIGlzIDEzIHRpbWVzIHNsb3dlci4KPj4+Pj4K 
Pj4+Pj4gTm93IGlmIHRoZXJlIGlzIGVub3VnaCBkYXRhIHRoYXQgc2hvd3MgdGhhdCBhIHNpZ25p ZmljYW50IHBlcmNlbnRhZ2Ugb2Ygam9icwo+Pj4+PiBzdWJtaXRlZCB0byB0aGUgR1BVIHdpbGwg dGFrZSBsZXNzIHRoYXQgMC4zNW1pY3Jvc2Vjb25kIHRoZW4geWVzIHVzZXJzcGFjZQo+Pj4+PiBz Y2hlZHVsaW5nIGRvZXMgbWFrZSBzZW5zZS4gQnV0IHNvIGZhciBhbGwgd2UgaGF2ZSBpcyBoYW5k d2F2aW5nIHdpdGggbm8gZGF0YQo+Pj4+PiB0byBzdXBwb3J0IGFueSBmYWN0cy4KPj4+Pj4KPj4+ Pj4KPj4+Pj4gTm93IGlmIHdlIHdhbnQgdG8gc2NoZWR1bGUgZnJvbSB1c2Vyc3BhY2UgdGhhbiB5 b3Ugd2lsbCBuZWVkIHRvIGRvIHNvbWV0aGluZwo+Pj4+PiBhYm91dCB0aGUgcGlubmluZywgc29t ZXRoaW5nIHRoYXQgZ2l2ZXMgY29udHJvbCB0byBrZXJuZWwgc28gdGhhdCBrZXJuZWwgY2FuCj4+ Pj4+IHVucGluIHdoZW4gaXQgd2FudHMgYW5kIG1vdmUgb2JqZWN0IHdoZW4gaXQgd2FudHMgbm8g bWF0dGVyIHdoYXQgdXNlcnNwYWNlIGlzCj4+Pj4+IGRvaW5nLgo+Pj4+Pgo+Pj4+Pj4+Pj4KPiAK PiAtLQo+IFRvIHVuc3Vic2NyaWJlLCBzZW5kIGEgbWVzc2FnZSB3aXRoICd1bnN1YnNjcmliZSBs aW51eC1tbScgaW4KPiB0aGUgYm9keSB0byBtYWpvcmRvbW9Aa3ZhY2sub3JnLiAgRm9yIG1vcmUg aW5mbyBvbiBMaW51eCBNTSwKPiBzZWU6IGh0dHA6Ly93d3cubGludXgtbW0ub3JnLyAuCj4gRG9u J3QgZW1haWw6IDxhIGhyZWY9bWFpbHRvOiJkb250QGt2YWNrLm9yZyI+IGVtYWlsQGt2YWNrLm9y ZyA8L2E+Cj4gCgpfX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f XwpkcmktZGV2ZWwgbWFpbGluZyBsaXN0CmRyaS1kZXZlbEBsaXN0cy5mcmVlZGVza3RvcC5vcmcK aHR0cDovL2xpc3RzLmZyZWVkZXNrdG9wLm9yZy9tYWlsbWFuL2xpc3RpbmZvL2RyaS1kZXZlbAo= From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f180.google.com (mail-pd0-f180.google.com [209.85.192.180]) by kanga.kvack.org (Postfix) with ESMTP id A2F9E6B003C for ; Mon, 21 Jul 2014 15:23:56 -0400 (EDT) Received: by mail-pd0-f180.google.com with SMTP id y13so9615885pdi.25 for ; Mon, 21 Jul 2014 12:23:56 -0700 (PDT) Received: from na01-by2-obe.outbound.protection.outlook.com (mail-by2lp0242.outbound.protection.outlook.com. 
[207.46.163.242]) by mx.google.com with ESMTPS id lx8si15204888pab.115.2014.07.21.12.23.54 for (version=TLSv1 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 21 Jul 2014 12:23:55 -0700 (PDT) Message-ID: <53CD68BF.4020308@amd.com> Date: Mon, 21 Jul 2014 22:23:43 +0300 From: Oded Gabbay MIME-Version: 1.0 Subject: Re: [PATCH v2 00/25] AMDKFD kernel driver References: <53C7D645.3070607@amd.com> <20140720174652.GE3068@gmail.com> <53CD0961.4070505@amd.com> <53CD17FD.3000908@vodafone.de> <53CD1FB6.1000602@amd.com> <20140721155437.GA4519@gmail.com> <53CD5122.5040804@amd.com> <20140721181433.GA5196@gmail.com> <53CD5DBC.7010301@amd.com> <20140721185940.GA5278@gmail.com> In-Reply-To: <20140721185940.GA5278@gmail.com> Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: owner-linux-mm@kvack.org List-ID: To: Jerome Glisse Cc: Andrew Lewycky , =?UTF-8?B?TWljaGVsIETDpG56ZXI=?= , "linux-kernel@vger.kernel.org" , "dri-devel@lists.freedesktop.org" , linux-mm , Evgeny Pinchuk , Alexey Skidanov , Andrew Morton On 21/07/14 21:59, Jerome Glisse wrote: > On Mon, Jul 21, 2014 at 09:36:44PM +0300, Oded Gabbay wrote: >> On 21/07/14 21:14, Jerome Glisse wrote: >>> On Mon, Jul 21, 2014 at 08:42:58PM +0300, Oded Gabbay wrote: >>>> On 21/07/14 18:54, Jerome Glisse wrote: >>>>> On Mon, Jul 21, 2014 at 05:12:06PM +0300, Oded Gabbay wrote: >>>>>> On 21/07/14 16:39, Christian K=C3=B6nig wrote: >>>>>>> Am 21.07.2014 14:36, schrieb Oded Gabbay: >>>>>>>> On 20/07/14 20:46, Jerome Glisse wrote: >>>>>>>>> On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote: >>>>>>>>>> Forgot to cc mailing list on cover letter. Sorry. >>>>>>>>>> >>>>>>>>>> As a continuation to the existing discussion, here is a v2 pat= ch series >>>>>>>>>> restructured with a cleaner history and no totally-different-e= arly-versions >>>>>>>>>> of the code. 
>>>>>>>>>> >>>>>>>>>> Instead of 83 patches, there are now a total of 25 patches, wh= ere 5 of them >>>>>>>>>> are modifications to radeon driver and 18 of them include only= amdkfd code. >>>>>>>>>> There is no code going away or even modified between patches, = only added. >>>>>>>>>> >>>>>>>>>> The driver was renamed from radeon_kfd to amdkfd and moved to = reside under >>>>>>>>>> drm/radeon/amdkfd. This move was done to emphasize the fact th= at this driver >>>>>>>>>> is an AMD-only driver at this point. Having said that, we do f= oresee a >>>>>>>>>> generic hsa framework being implemented in the future and in t= hat case, we >>>>>>>>>> will adjust amdkfd to work within that framework. >>>>>>>>>> >>>>>>>>>> As the amdkfd driver should support multiple AMD gfx drivers, = we want to >>>>>>>>>> keep it as a seperate driver from radeon. Therefore, the amdkf= d code is >>>>>>>>>> contained in its own folder. The amdkfd folder was put under t= he radeon >>>>>>>>>> folder because the only AMD gfx driver in the Linux kernel at = this point >>>>>>>>>> is the radeon driver. Having said that, we will probably need = to move it >>>>>>>>>> (maybe to be directly under drm) after we integrate with addit= ional AMD gfx >>>>>>>>>> drivers. >>>>>>>>>> >>>>>>>>>> For people who like to review using git, the v2 patch set is l= ocated at: >>>>>>>>>> http://cgit.freedesktop.org/~gabbayo/linux/log/?h=3Dkfd-next-3= .17-v2 >>>>>>>>>> >>>>>>>>>> Written by Oded Gabbayh >>>>>>>>> >>>>>>>>> So quick comments before i finish going over all patches. There= is many >>>>>>>>> things that need more documentation espacialy as of right now t= here is >>>>>>>>> no userspace i can go look at. >>>>>>>> So quick comments on some of your questions but first of all, th= anks for the >>>>>>>> time you dedicated to review the code. 
>>>>>>>>> >>>>>>>>> There few show stopper, biggest one is gpu memory pinning this = is a big >>>>>>>>> no, that would need serious arguments for any hope of convincin= g me on >>>>>>>>> that side. >>>>>>>> We only do gpu memory pinning for kernel objects. There are no u= serspace >>>>>>>> objects that are pinned on the gpu memory in our driver. If that= is the case, >>>>>>>> is it still a show stopper ? >>>>>>>> >>>>>>>> The kernel objects are: >>>>>>>> - pipelines (4 per device) >>>>>>>> - mqd per hiq (only 1 per device) >>>>>>>> - mqd per userspace queue. On KV, we support up to 1K queues per= process, for >>>>>>>> a total of 512K queues. Each mqd is 151 bytes, but the allocatio= n is done in >>>>>>>> 256 alignment. So total *possible* memory is 128MB >>>>>>>> - kernel queue (only 1 per device) >>>>>>>> - fence address for kernel queue >>>>>>>> - runlists for the CP (1 or 2 per device) >>>>>>> >>>>>>> The main questions here are if it's avoid able to pin down the me= mory and if the >>>>>>> memory is pinned down at driver load, by request from userspace o= r by anything >>>>>>> else. >>>>>>> >>>>>>> As far as I can see only the "mqd per userspace queue" might be a= bit >>>>>>> questionable, everything else sounds reasonable. >>>>>>> >>>>>>> Christian. >>>>>> >>>>>> Most of the pin downs are done on device initialization. >>>>>> The "mqd per userspace" is done per userspace queue creation. Howe= ver, as I >>>>>> said, it has an upper limit of 128MB on KV, and considering the 2G= local >>>>>> memory, I think it is OK. >>>>>> The runlists are also done on userspace queue creation/deletion, b= ut we only >>>>>> have 1 or 2 runlists per device, so it is not that bad. >>>>> >>>>> 2G local memory ? You can not assume anything on userside configura= tion some >>>>> one might build an hsa computer with 512M and still expect a functi= oning >>>>> desktop. >>>> First of all, I'm only considering Kaveri computer, not "hsa" comput= er. 
>>>> Second, I would imagine we can build some protection around it, like >>>> checking total local memory and limit number of queues based on some >>>> percentage of that total local memory. So, if someone will have only >>>> 512M, he will be able to open less queues. >>>> >>>> >>>>> >>>>> I need to go look into what all this mqd is for, what it does and w= hat it is >>>>> about. But pinning is really bad and this is an issue with userspac= e command >>>>> scheduling an issue that obviously AMD fails to take into account i= n design >>>>> phase. >>>> Maybe, but that is the H/W design non-the-less. We can't very well >>>> change the H/W. >>> >>> You can not change the hardware but it is not an excuse to allow bad = design to >>> sneak in software to work around that. So i would rather penalize bad= hardware >>> design and have command submission in the kernel, until AMD fix its h= ardware to >>> allow proper scheduling by the kernel and proper control by the kerne= l.=20 >> I'm sorry but I do *not* think this is a bad design. S/W scheduling in >> the kernel can not, IMO, scale well to 100K queues and 10K processes. >=20 > I am not advocating for having kernel decide down to the very last deta= ils. I am > advocating for kernel being able to preempt at any time and be able to = decrease > or increase user queue priority so overall kernel is in charge of resou= rces > management and it can handle rogue client in proper fashion. >=20 >> >>> Because really where we want to go is having GPU closer to a CPU in t= erm of scheduling >>> capacity and once we get there we want the kernel to always be able t= o take over >>> and do whatever it wants behind process back. >> Who do you refer to when you say "we" ? AFAIK, the hw scheduling >> direction is where AMD is now and where it is heading in the future. >> That doesn't preclude the option to allow the kernel to take over and = do >> what he wants. 
I agree that in KV we have a problem where we can't do = a >> mid-wave preemption, so theoretically, a long running compute kernel c= an >> make things messy, but in Carrizo, we will have this ability. Having >> said that, it will only be through the CP H/W scheduling. So AMD is >> _not_ going to abandon H/W scheduling. You can dislike it, but this is >> the situation. >=20 > We was for the overall Linux community but maybe i should not pretend t= o talk > for anyone interested in having a common standard. >=20 > My point is that current hardware do not have approriate hardware suppo= rt for > preemption hence, current hardware should use ioctl to schedule job and= AMD > should think a bit more on commiting to a design and handwaving any har= dware > short coming as something that can be work around in the software. The = pinning > thing is broken by design, only way to work around it is through kernel= cmd > queue scheduling that's a fact. >=20 > Once hardware support proper preemption and allows to move around/evict= buffer > use on behalf of userspace command queue then we can allow userspace sc= heduling > but until then my personnal opinion is that it should not be allowed an= d that > people will have to pay the ioctl price which i proved to be small, bec= ause > really if you 100K queue each with one job, i would not expect that all= those > 100K job will complete in less time than it takes to execute an ioctl i= e by > even if you do not have the ioctl delay what ever you schedule will hav= e to > wait on previously submited jobs. But Jerome, the core problem still remains in effect, even with your suggestion. If an application, either via userspace queue or via ioctl, submits a long-running kernel, than the CPU in general can't stop the GPU from running it. And if that kernel does while(1); than that's it, game's over, and no matter how you submitted the work. So I don't really see the big advantage in your proposal. 
Only in CZ we can stop this wave (by CP H/W scheduling only). What are you saying is basically I won't allow people to use compute on Linux KV system because it _may_ get the system stuck. So even if I really wanted to, and I may agree with you theoretically on that, I can't fulfill your desire to make the "kernel being able to preempt at any time and be able to decrease or increase user queue priority so overall kernel is in charge of resources management and it can handle rogue client in proper fashion". Not in KV, and I guess not in CZ as well. Oded >=20 >>> >>>>>>> >>>>>>>>> >>>>>>>>> It might be better to add a drivers/gpu/drm/amd directory and a= dd common >>>>>>>>> stuff there. >>>>>>>>> >>>>>>>>> Given that this is not intended to be final HSA api AFAICT then= i would >>>>>>>>> say this far better to avoid the whole kfd module and add ioctl= to radeon. >>>>>>>>> This would avoid crazy communication btw radeon and kfd. >>>>>>>>> >>>>>>>>> The whole aperture business needs some serious explanation. Esp= ecialy as >>>>>>>>> you want to use userspace address there is nothing to prevent u= serspace >>>>>>>>> program from allocating things at address you reserve for lds, = scratch, >>>>>>>>> ... only sane way would be to move those lds, scratch inside th= e virtual >>>>>>>>> address reserved for kernel (see kernel memory map). >>>>>>>>> >>>>>>>>> The whole business of locking performance counter for exclusive= per process >>>>>>>>> access is a big NO. Which leads me to the questionable usefulln= ess of user >>>>>>>>> space command ring. >>>>>>>> That's like saying: "Which leads me to the questionable usefulne= ss of HSA". I >>>>>>>> find it analogous to a situation where a network maintainer nack= ing a driver >>>>>>>> for a network card, which is slower than a different network car= d. Doesn't >>>>>>>> seem reasonable this situation is would happen. He would still p= ut both the >>>>>>>> drivers in the kernel because people want to use the H/W and its= features. 
So, >>>>>>>> I don't think this is a valid reason to NACK the driver. >>>>> >>>>> Let me rephrase, drop the the performance counter ioctl and modulo memory pinning >>>>> i see no objection. In other word, i am not NACKING whole patchset i am NACKING >>>>> the performance ioctl. >>>>> >>>>> Again this is another argument for round trip to the kernel. As inside kernel you >>>>> could properly do exclusive gpu counter access accross single user cmd buffer >>>>> execution. >>>>> >>>>>>>> >>>>>>>>> I only see issues with that. First and foremost i would >>>>>>>>> need to see solid figures that kernel ioctl or syscall has a higher an >>>>>>>>> overhead that is measurable in any meaning full way against a simple >>>>>>>>> function call. I know the userspace command ring is a big marketing features >>>>>>>>> that please ignorant userspace programmer. But really this only brings issues >>>>>>>>> and for absolutely not upside afaict. >>>>>>>> Really ? You think that doing a context switch to kernel space, with all its >>>>>>>> overhead, is _not_ more expansive than just calling a function in userspace >>>>>>>> which only puts a buffer on a ring and writes a doorbell ? >>>>> >>>>> I am saying the overhead is not that big and it probably will not matter in most >>>>> usecase. For instance i did wrote the most useless kernel module that add two >>>>> number through an ioctl (http://people.freedesktop.org/~glisse/adder.tar) and >>>>> it takes ~0.35microseconds with ioctl while function is ~0.025microseconds so >>>>> ioctl is 13 times slower. >>>>> >>>>> Now if there is enough data that shows that a significant percentage of jobs >>>>> submited to the GPU will take less that 0.35microsecond then yes userspace >>>>> scheduling does make sense. But so far all we have is handwaving with no data >>>>> to support any facts.
>>>>> >>>>> >>>>> Now if we want to schedule from userspace than you will need to do something >>>>> about the pinning, something that gives control to kernel so that kernel can >>>>> unpin when it wants and move object when it wants no matter what userspace is >>>>> doing. >>>>> >>>>>>>>> > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: email@kvack.org > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933542AbaGUTX4 (ORCPT ); Mon, 21 Jul 2014 15:23:56 -0400 Received: from mail-by2lp0244.outbound.protection.outlook.com ([207.46.163.244]:4277 "EHLO na01-by2-obe.outbound.protection.outlook.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1755206AbaGUTXz convert rfc822-to-8bit (ORCPT ); Mon, 21 Jul 2014 15:23:55 -0400 X-WSS-ID: 0N92TVL-08-API-02 X-M-MSG: Message-ID: <53CD68BF.4020308@amd.com> Date: Mon, 21 Jul 2014 22:23:43 +0300 From: Oded Gabbay Organization: AMD User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.6.0 MIME-Version: 1.0 To: Jerome Glisse CC: Andrew Lewycky , =?UTF-8?B?TWljaGVsIETDpG56ZXI=?= , "linux-kernel@vger.kernel.org" , "dri-devel@lists.freedesktop.org" , linux-mm , "Evgeny Pinchuk" , Alexey Skidanov , Andrew Morton Subject: Re: [PATCH v2 00/25] AMDKFD kernel driver References: <53C7D645.3070607@amd.com> <20140720174652.GE3068@gmail.com> <53CD0961.4070505@amd.com> <53CD17FD.3000908@vodafone.de> <53CD1FB6.1000602@amd.com> <20140721155437.GA4519@gmail.com> <53CD5122.5040804@amd.com> <20140721181433.GA5196@gmail.com> <53CD5DBC.7010301@amd.com> <20140721185940.GA5278@gmail.com> In-Reply-To:
<20140721185940.GA5278@gmail.com> Content-Type: text/plain; charset="UTF-8" X-Originating-IP: [10.224.155.153] Content-Transfer-Encoding: 8BIT X-EOPAttributedMessage: 0 X-Forefront-Antispam-Report: CIP:165.204.84.222;CTRY:US;IPV:NLI;IPV:NLI;EFV:NLI;SFV:NSPM;SFS:(6009001)(428002)(189002)(199002)(51914003)(51704005)(52254002)(479174003)(24454002)(4396001)(36756003)(81542001)(83322001)(87936001)(106466001)(99396002)(93886003)(65816999)(85306003)(23676002)(76176999)(44976005)(101416001)(84676001)(83506001)(50986999)(21056001)(19580405001)(87266999)(46102001)(95666004)(86362001)(50466002)(79102001)(15975445006)(54356999)(83072002)(74662001)(77982001)(76482001)(97736001)(74502001)(107046002)(33656002)(68736004)(92566001)(64126003)(47776003)(19580395003)(92726001)(65956001)(15202345003)(105586002)(65806001)(110136001)(102836001)(80022001)(81342001)(20776003)(31966008)(64706001)(85852003);DIR:OUT;SFP:;SCL:1;SRVR:BN1PR02MB037;H:atltwp02.amd.com;FPR:;MLV:sfv;PTR:InfoDomainNonexistent;MX:1;LANG:en; X-Microsoft-Antispam: BCL:0;PCL:0;RULEID: X-Forefront-PRVS: 0279B3DD0D Authentication-Results: spf=none (sender IP is 165.204.84.222) smtp.mailfrom=Oded.Gabbay@amd.com; X-OriginatorOrg: amd4.onmicrosoft.com Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 21/07/14 21:59, Jerome Glisse wrote: > On Mon, Jul 21, 2014 at 09:36:44PM +0300, Oded Gabbay wrote: >> On 21/07/14 21:14, Jerome Glisse wrote: >>> On Mon, Jul 21, 2014 at 08:42:58PM +0300, Oded Gabbay wrote: >>>> On 21/07/14 18:54, Jerome Glisse wrote: >>>>> On Mon, Jul 21, 2014 at 05:12:06PM +0300, Oded Gabbay wrote: >>>>>> On 21/07/14 16:39, Christian König wrote: >>>>>>> Am 21.07.2014 14:36, schrieb Oded Gabbay: >>>>>>>> On 20/07/14 20:46, Jerome Glisse wrote: >>>>>>>>> On Thu, Jul 17, 2014 at 04:57:25PM +0300, Oded Gabbay wrote: >>>>>>>>>> Forgot to cc mailing list on cover letter. Sorry. 
>>>>>>>>>> >>>>>>>>>> As a continuation to the existing discussion, here is a v2 patch series >>>>>>>>>> restructured with a cleaner history and no totally-different-early-versions >>>>>>>>>> of the code. >>>>>>>>>> >>>>>>>>>> Instead of 83 patches, there are now a total of 25 patches, where 5 of them >>>>>>>>>> are modifications to radeon driver and 18 of them include only amdkfd code. >>>>>>>>>> There is no code going away or even modified between patches, only added. >>>>>>>>>> >>>>>>>>>> The driver was renamed from radeon_kfd to amdkfd and moved to reside under >>>>>>>>>> drm/radeon/amdkfd. This move was done to emphasize the fact that this driver >>>>>>>>>> is an AMD-only driver at this point. Having said that, we do foresee a >>>>>>>>>> generic hsa framework being implemented in the future and in that case, we >>>>>>>>>> will adjust amdkfd to work within that framework. >>>>>>>>>> >>>>>>>>>> As the amdkfd driver should support multiple AMD gfx drivers, we want to >>>>>>>>>> keep it as a seperate driver from radeon. Therefore, the amdkfd code is >>>>>>>>>> contained in its own folder. The amdkfd folder was put under the radeon >>>>>>>>>> folder because the only AMD gfx driver in the Linux kernel at this point >>>>>>>>>> is the radeon driver. Having said that, we will probably need to move it >>>>>>>>>> (maybe to be directly under drm) after we integrate with additional AMD gfx >>>>>>>>>> drivers. >>>>>>>>>> >>>>>>>>>> For people who like to review using git, the v2 patch set is located at: >>>>>>>>>> http://cgit.freedesktop.org/~gabbayo/linux/log/?h=kfd-next-3.17-v2 >>>>>>>>>> >>>>>>>>>> Written by Oded Gabbayh >>>>>>>>> >>>>>>>>> So quick comments before i finish going over all patches. There is many >>>>>>>>> things that need more documentation espacialy as of right now there is >>>>>>>>> no userspace i can go look at. >>>>>>>> So quick comments on some of your questions but first of all, thanks for the >>>>>>>> time you dedicated to review the code. 
>>>>>>>>> >>>>>>>>> There few show stopper, biggest one is gpu memory pinning this is a big >>>>>>>>> no, that would need serious arguments for any hope of convincing me on >>>>>>>>> that side. >>>>>>>> We only do gpu memory pinning for kernel objects. There are no userspace >>>>>>>> objects that are pinned on the gpu memory in our driver. If that is the case, >>>>>>>> is it still a show stopper ? >>>>>>>> >>>>>>>> The kernel objects are: >>>>>>>> - pipelines (4 per device) >>>>>>>> - mqd per hiq (only 1 per device) >>>>>>>> - mqd per userspace queue. On KV, we support up to 1K queues per process, for >>>>>>>> a total of 512K queues. Each mqd is 151 bytes, but the allocation is done in >>>>>>>> 256 alignment. So total *possible* memory is 128MB >>>>>>>> - kernel queue (only 1 per device) >>>>>>>> - fence address for kernel queue >>>>>>>> - runlists for the CP (1 or 2 per device) >>>>>>> >>>>>>> The main questions here are if it's avoid able to pin down the memory and if the >>>>>>> memory is pinned down at driver load, by request from userspace or by anything >>>>>>> else. >>>>>>> >>>>>>> As far as I can see only the "mqd per userspace queue" might be a bit >>>>>>> questionable, everything else sounds reasonable. >>>>>>> >>>>>>> Christian. >>>>>> >>>>>> Most of the pin downs are done on device initialization. >>>>>> The "mqd per userspace" is done per userspace queue creation. However, as I >>>>>> said, it has an upper limit of 128MB on KV, and considering the 2G local >>>>>> memory, I think it is OK. >>>>>> The runlists are also done on userspace queue creation/deletion, but we only >>>>>> have 1 or 2 runlists per device, so it is not that bad. >>>>> >>>>> 2G local memory ? You can not assume anything on userside configuration some >>>>> one might build an hsa computer with 512M and still expect a functioning >>>>> desktop. >>>> First of all, I'm only considering Kaveri computer, not "hsa" computer. 
>>>> Second, I would imagine we can build some protection around it, like >>>> checking total local memory and limit number of queues based on some >>>> percentage of that total local memory. So, if someone will have only >>>> 512M, he will be able to open less queues. >>>> >>>> >>>>> >>>>> I need to go look into what all this mqd is for, what it does and what it is >>>>> about. But pinning is really bad and this is an issue with userspace command >>>>> scheduling an issue that obviously AMD fails to take into account in design >>>>> phase. >>>> Maybe, but that is the H/W design non-the-less. We can't very well >>>> change the H/W. >>> >>> You can not change the hardware but it is not an excuse to allow bad design to >>> sneak in software to work around that. So i would rather penalize bad hardware >>> design and have command submission in the kernel, until AMD fix its hardware to >>> allow proper scheduling by the kernel and proper control by the kernel. >> I'm sorry but I do *not* think this is a bad design. S/W scheduling in >> the kernel can not, IMO, scale well to 100K queues and 10K processes. > > I am not advocating for having kernel decide down to the very last details. I am > advocating for kernel being able to preempt at any time and be able to decrease > or increase user queue priority so overall kernel is in charge of resources > management and it can handle rogue client in proper fashion. > >> >>> Because really where we want to go is having GPU closer to a CPU in term of scheduling >>> capacity and once we get there we want the kernel to always be able to take over >>> and do whatever it wants behind process back. >> Who do you refer to when you say "we" ? AFAIK, the hw scheduling >> direction is where AMD is now and where it is heading in the future. >> That doesn't preclude the option to allow the kernel to take over and do >> what he wants. 
I agree that in KV we have a problem where we can't do a >> mid-wave preemption, so theoretically, a long running compute kernel can >> make things messy, but in Carrizo, we will have this ability. Having >> said that, it will only be through the CP H/W scheduling. So AMD is >> _not_ going to abandon H/W scheduling. You can dislike it, but this is >> the situation. > > We was for the overall Linux community but maybe i should not pretend to talk > for anyone interested in having a common standard. > > My point is that current hardware do not have approriate hardware support for > preemption hence, current hardware should use ioctl to schedule job and AMD > should think a bit more on commiting to a design and handwaving any hardware > short coming as something that can be work around in the software. The pinning > thing is broken by design, only way to work around it is through kernel cmd > queue scheduling that's a fact. > > Once hardware support proper preemption and allows to move around/evict buffer > use on behalf of userspace command queue then we can allow userspace scheduling > but until then my personnal opinion is that it should not be allowed and that > people will have to pay the ioctl price which i proved to be small, because > really if you 100K queue each with one job, i would not expect that all those > 100K job will complete in less time than it takes to execute an ioctl ie by > even if you do not have the ioctl delay what ever you schedule will have to > wait on previously submited jobs. But Jerome, the core problem remains even with your suggestion. If an application, either via a userspace queue or via ioctl, submits a long-running kernel, then the CPU in general can't stop the GPU from running it. And if that kernel does while(1); then that's it, game over, no matter how you submitted the work. So I don't really see the big advantage in your proposal. Only in CZ can we stop this wave (by CP H/W scheduling only).
What you are saying is basically: I won't allow people to use compute on a Linux KV system because it _may_ get the system stuck. So even if I really wanted to, and I may agree with you theoretically on that, I can't fulfill your desire to make the "kernel being able to preempt at any time and be able to decrease or increase user queue priority so overall kernel is in charge of resources management and it can handle rogue client in proper fashion". Not in KV, and I guess not in CZ either. Oded > >>> >>>>>>> >>>>>>>>> >>>>>>>>> It might be better to add a drivers/gpu/drm/amd directory and add common >>>>>>>>> stuff there. >>>>>>>>> >>>>>>>>> Given that this is not intended to be final HSA api AFAICT then i would >>>>>>>>> say this far better to avoid the whole kfd module and add ioctl to radeon. >>>>>>>>> This would avoid crazy communication btw radeon and kfd. >>>>>>>>> >>>>>>>>> The whole aperture business needs some serious explanation. Especialy as >>>>>>>>> you want to use userspace address there is nothing to prevent userspace >>>>>>>>> program from allocating things at address you reserve for lds, scratch, >>>>>>>>> ... only sane way would be to move those lds, scratch inside the virtual >>>>>>>>> address reserved for kernel (see kernel memory map). >>>>>>>>> >>>>>>>>> The whole business of locking performance counter for exclusive per process >>>>>>>>> access is a big NO. Which leads me to the questionable usefullness of user >>>>>>>>> space command ring. >>>>>>>> That's like saying: "Which leads me to the questionable usefulness of HSA". I >>>>>>>> find it analogous to a situation where a network maintainer nacking a driver >>>>>>>> for a network card, which is slower than a different network card. Doesn't >>>>>>>> seem reasonable this situation is would happen. He would still put both the >>>>>>>> drivers in the kernel because people want to use the H/W and its features. So, >>>>>>>> I don't think this is a valid reason to NACK the driver.
>>>>> >>>>> Let me rephrase, drop the the performance counter ioctl and modulo memory pinning >>>>> i see no objection. In other word, i am not NACKING whole patchset i am NACKING >>>>> the performance ioctl. >>>>> >>>>> Again this is another argument for round trip to the kernel. As inside kernel you >>>>> could properly do exclusive gpu counter access accross single user cmd buffer >>>>> execution. >>>>> >>>>>>>> >>>>>>>>> I only see issues with that. First and foremost i would >>>>>>>>> need to see solid figures that kernel ioctl or syscall has a higher an >>>>>>>>> overhead that is measurable in any meaning full way against a simple >>>>>>>>> function call. I know the userspace command ring is a big marketing features >>>>>>>>> that please ignorant userspace programmer. But really this only brings issues >>>>>>>>> and for absolutely not upside afaict. >>>>>>>> Really ? You think that doing a context switch to kernel space, with all its >>>>>>>> overhead, is _not_ more expansive than just calling a function in userspace >>>>>>>> which only puts a buffer on a ring and writes a doorbell ? >>>>> >>>>> I am saying the overhead is not that big and it probably will not matter in most >>>>> usecase. For instance i did wrote the most useless kernel module that add two >>>>> number through an ioctl (http://people.freedesktop.org/~glisse/adder.tar) and >>>>> it takes ~0.35microseconds with ioctl while function is ~0.025microseconds so >>>>> ioctl is 13 times slower. >>>>> >>>>> Now if there is enough data that shows that a significant percentage of jobs >>>>> submited to the GPU will take less that 0.35microsecond then yes userspace >>>>> scheduling does make sense. But so far all we have is handwaving with no data >>>>> to support any facts. 
>>>>> >>>>> >>>>> Now if we want to schedule from userspace than you will need to do something >>>>> about the pinning, something that gives control to kernel so that kernel can >>>>> unpin when it wants and move object when it wants no matter what userspace is >>>>> doing. >>>>> >>>>>>>>> > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majordomo@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: email@kvack.org >
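
Editorial note on the pinned-memory figures quoted in the thread above: the 128MB bound for "mqd per userspace queue" follows from the numbers given there (151-byte mqd, 256-byte allocation alignment, 512K queues in total on KV). The sketch below only re-does that arithmetic and illustrates the "limit number of queues based on some percentage of that total local memory" idea from the discussion. It is not amdkfd code; the function names and the percentage parameter are hypothetical and chosen for illustration.

```python
# Figures quoted in the thread for Kaveri (KV).
MQD_SIZE_BYTES = 151            # size of one mqd
MQD_ALIGN_BYTES = 256           # allocation alignment
MAX_QUEUES_TOTAL = 512 * 1024   # "a total of 512K queues"


def align_up(size: int, align: int) -> int:
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


def mqd_worst_case_bytes() -> int:
    """Worst-case pinned memory if every possible queue has an mqd."""
    return MAX_QUEUES_TOTAL * align_up(MQD_SIZE_BYTES, MQD_ALIGN_BYTES)


def max_queues_for(local_mem_bytes: int, percent: int) -> int:
    """Hypothetical cap: let mqds use at most `percent` of local memory.

    The policy and its parameters are assumptions made for this sketch;
    the thread only suggests that such a limit could be built.
    """
    budget = local_mem_bytes * percent // 100
    per_queue = align_up(MQD_SIZE_BYTES, MQD_ALIGN_BYTES)
    return min(budget // per_queue, MAX_QUEUES_TOTAL)


if __name__ == "__main__":
    mib = 1024 * 1024
    print(mqd_worst_case_bytes() // mib, "MiB worst case")
    print(max_queues_for(512 * mib, 5), "queues with 512 MiB and a 5% budget")
```

With a budget expressed as a share of local memory, a system with less memory simply gets a lower queue ceiling, which is the behaviour described in the reply about a 512M configuration.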