From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jani Nikula
To: Thomas Hellström, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH v9 07/15] drm: Add a prefetching memcpy_from_wc
In-Reply-To: <20210601074654.3103-8-thomas.hellstrom@linux.intel.com>
Organization: Intel Finland Oy - BIC 0357606-4 - Westendinkatu 7, 02160 Espoo
References: <20210601074654.3103-1-thomas.hellstrom@linux.intel.com> <20210601074654.3103-8-thomas.hellstrom@linux.intel.com>
Date: Tue, 01 Jun 2021 15:27:37 +0300
Message-ID: <87im2xrcqu.fsf@intel.com>
List-Id: Direct Rendering Infrastructure - Development
Cc: Thomas Hellström, Christian König
Errors-To:
 dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"

On Tue, 01 Jun 2021, Thomas Hellström <thomas.hellstrom@linux.intel.com> wrote:
> Reading out of write-combining mapped memory is typically very slow
> since the CPU doesn't prefetch. However some archs have special
> instructions to do this.
>
> So add a best-effort memcpy_from_wc taking dma-buf-map pointer
> arguments that attempts to use a fast prefetching memcpy and
> otherwise falls back to ordinary memcopies, taking the iomem tagging
> into account.
>
> The code is largely copied from i915_memcpy_from_wc.
>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Christian König <christian.koenig@amd.com>
> Suggested-by: Daniel Vetter <daniel@ffwll.ch>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Acked-by: Christian König <christian.koenig@amd.com>
> Acked-by: Daniel Vetter <daniel@ffwll.ch>
> ---
> v7:
> - Perform a memcpy even if warning with in_interrupt(). Suggested by
>   Christian König.
> - Fix compilation failure on !X86 (Reported by kernel test robot
>   lkp@intel.com)
> v8:
> - Skip kerneldoc for drm_memcpy_init_early()
> - Export drm_memcpy_from_wc() also for non-x86.
> ---
>  Documentation/gpu/drm-mm.rst |   2 +-
>  drivers/gpu/drm/drm_cache.c  | 148 +++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/drm_drv.c    |   2 +
>  include/drm/drm_cache.h      |   7 ++
>  4 files changed, 158 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
> index 21be6deadc12..c66058c5bce7 100644
> --- a/Documentation/gpu/drm-mm.rst
> +++ b/Documentation/gpu/drm-mm.rst
> @@ -469,7 +469,7 @@ DRM MM Range Allocator Function References
>  .. kernel-doc:: drivers/gpu/drm/drm_mm.c
>     :export:
>  
> -DRM Cache Handling
> +DRM Cache Handling and Fast WC memcpy()
>  ==================

The title underline needs to be as long as the title.

BR,
Jani.

>  
>  .. kernel-doc:: drivers/gpu/drm/drm_cache.c
> diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
> index 79a50ef1250f..546599f19a93 100644
> --- a/drivers/gpu/drm/drm_cache.c
> +++ b/drivers/gpu/drm/drm_cache.c
> @@ -28,6 +28,7 @@
>   * Authors: Thomas Hellström <thomas-at-tungstengraphics-dot-com>
>   */
>  
> +#include <linux/dma-buf-map.h>
>  #include <linux/export.h>
>  #include <linux/highmem.h>
>  #include <linux/mem_encrypt.h>
> @@ -35,6 +36,9 @@
>  
>  #include <drm/drm_cache.h>
>  
> +/* A small bounce buffer that fits on the stack. */
> +#define MEMCPY_BOUNCE_SIZE 128
> +
>  #if defined(CONFIG_X86)
>  #include <asm/smp.h>
>  
> @@ -209,3 +213,147 @@ bool drm_need_swiotlb(int dma_bits)
>  	return max_iomem > ((u64)1 << dma_bits);
>  }
>  EXPORT_SYMBOL(drm_need_swiotlb);
> +
> +static void memcpy_fallback(struct dma_buf_map *dst,
> +			    const struct dma_buf_map *src,
> +			    unsigned long len)
> +{
> +	if (!dst->is_iomem && !src->is_iomem) {
> +		memcpy(dst->vaddr, src->vaddr, len);
> +	} else if (!src->is_iomem) {
> +		dma_buf_map_memcpy_to(dst, src->vaddr, len);
> +	} else if (!dst->is_iomem) {
> +		memcpy_fromio(dst->vaddr, src->vaddr_iomem, len);
> +	} else {
> +		/*
> +		 * Bounce size is not performance tuned, but using a
> +		 * bounce buffer like this is significantly faster than
> +		 * resorting to ioreadxx() + iowritexx().
> +		 */
> +		char bounce[MEMCPY_BOUNCE_SIZE];
> +		void __iomem *_src = src->vaddr_iomem;
> +		void __iomem *_dst = dst->vaddr_iomem;
> +
> +		while (len >= MEMCPY_BOUNCE_SIZE) {
> +			memcpy_fromio(bounce, _src, MEMCPY_BOUNCE_SIZE);
> +			memcpy_toio(_dst, bounce, MEMCPY_BOUNCE_SIZE);
> +			_src += MEMCPY_BOUNCE_SIZE;
> +			_dst += MEMCPY_BOUNCE_SIZE;
> +			len -= MEMCPY_BOUNCE_SIZE;
> +		}
> +		if (len) {
> +			memcpy_fromio(bounce, _src, MEMCPY_BOUNCE_SIZE);
> +			memcpy_toio(_dst, bounce, MEMCPY_BOUNCE_SIZE);
> +		}
> +	}
> +}
> +
> +#ifdef CONFIG_X86
> +
> +static DEFINE_STATIC_KEY_FALSE(has_movntdqa);
> +
> +static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
> +{
> +	kernel_fpu_begin();
> +
> +	while (len >= 4) {
> +		asm("movntdqa	(%0), %%xmm0\n"
> +		    "movntdqa 16(%0), %%xmm1\n"
> +		    "movntdqa 32(%0), %%xmm2\n"
> +		    "movntdqa 48(%0), %%xmm3\n"
> +		    "movaps %%xmm0,   (%1)\n"
> +		    "movaps %%xmm1, 16(%1)\n"
> +		    "movaps %%xmm2, 32(%1)\n"
> +		    "movaps %%xmm3, 48(%1)\n"
> +		    :: "r" (src), "r" (dst) : "memory");
> +		src += 64;
> +		dst += 64;
> +		len -= 4;
> +	}
> +	while (len--) {
> +		asm("movntdqa (%0), %%xmm0\n"
> +		    "movaps %%xmm0, (%1)\n"
> +		    :: "r" (src), "r" (dst) : "memory");
> +		src += 16;
> +		dst += 16;
> +	}
> +
> +	kernel_fpu_end();
> +}
> +
> +/*
> + * __drm_memcpy_from_wc copies @len bytes from @src to @dst using
> + * non-temporal instructions where available. Note that all arguments
> + * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple
> + * of 16.
> + */
> +static void __drm_memcpy_from_wc(void *dst, const void *src, unsigned long len)
> +{
> +	if (unlikely(((unsigned long)dst | (unsigned long)src | len) & 15))
> +		memcpy(dst, src, len);
> +	else if (likely(len))
> +		__memcpy_ntdqa(dst, src, len >> 4);
> +}
> +
> +/**
> + * drm_memcpy_from_wc - Perform the fastest available memcpy from a source
> + * that may be WC.
> + * @dst: The destination pointer
> + * @src: The source pointer
> + * @len: The size of the area o transfer in bytes
> + *
> + * Tries an arch optimized memcpy for prefetching reading out of a WC region,
> + * and if no such beast is available, falls back to a normal memcpy.
> + */
> +void drm_memcpy_from_wc(struct dma_buf_map *dst,
> +			const struct dma_buf_map *src,
> +			unsigned long len)
> +{
> +	if (WARN_ON(in_interrupt())) {
> +		memcpy_fallback(dst, src, len);
> +		return;
> +	}
> +
> +	if (static_branch_likely(&has_movntdqa)) {
> +		__drm_memcpy_from_wc(dst->is_iomem ?
> +				     (void __force *)dst->vaddr_iomem :
> +				     dst->vaddr,
> +				     src->is_iomem ?
> +				     (void const __force *)src->vaddr_iomem :
> +				     src->vaddr,
> +				     len);
> +		return;
> +	}
> +
> +	memcpy_fallback(dst, src, len);
> +}
> +EXPORT_SYMBOL(drm_memcpy_from_wc);
> +
> +/*
> + * drm_memcpy_init_early - One time initialization of the WC memcpy code
> + */
> +void drm_memcpy_init_early(void)
> +{
> +	/*
> +	 * Some hypervisors (e.g. KVM) don't support VEX-prefix instructions
> +	 * emulation. So don't enable movntdqa in hypervisor guest.
> +	 */
> +	if (static_cpu_has(X86_FEATURE_XMM4_1) &&
> +	    !boot_cpu_has(X86_FEATURE_HYPERVISOR))
> +		static_branch_enable(&has_movntdqa);
> +}
> +#else
> +void drm_memcpy_from_wc(struct dma_buf_map *dst,
> +			const struct dma_buf_map *src,
> +			unsigned long len)
> +{
> +	WARN_ON(in_interrupt());
> +
> +	memcpy_fallback(dst, src, len);
> +}
> +EXPORT_SYMBOL(drm_memcpy_from_wc);
> +
> +void drm_memcpy_init_early(void)
> +{
> +}
> +#endif /* CONFIG_X86 */
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 3d8d68a98b95..8804ec7d3215 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -35,6 +35,7 @@
>  #include <linux/slab.h>
>  #include <linux/srcu.h>
>  
> +#include <drm/drm_cache.h>
>  #include <drm/drm_client.h>
>  #include <drm/drm_color_mgmt.h>
>  #include <drm/drm_drv.h>
> @@ -1041,6 +1042,7 @@ static int __init drm_core_init(void)
>  
>  	drm_connector_ida_init();
>  	idr_init(&drm_minors_idr);
> +	drm_memcpy_init_early();
>  
>  	ret = drm_sysfs_init();
>  	if (ret < 0) {
> diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
> index e9ad4863d915..cc9de1632dd3 100644
> --- a/include/drm/drm_cache.h
> +++ b/include/drm/drm_cache.h
> @@ -35,6 +35,8 @@
>  
>  #include <linux/scatterlist.h>
>  
> +struct dma_buf_map;
> +
>  void drm_clflush_pages(struct page *pages[], unsigned long num_pages);
>  void drm_clflush_sg(struct sg_table *st);
>  void drm_clflush_virt_range(void *addr, unsigned long length);
> @@ -70,4 +72,9 @@ static inline bool drm_arch_can_wc_memory(void)
>  #endif
>  }
>  
> +void drm_memcpy_init_early(void);
> +
> +void drm_memcpy_from_wc(struct dma_buf_map *dst,
> +			const struct dma_buf_map *src,
> +			unsigned long len);
>  #endif

-- 
Jani Nikula, Intel Open Source Graphics Center
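
For reference, the bounce-buffer branch of memcpy_fallback() in the quoted hunk can be sketched in plain userspace C. This is only an illustration under stated assumptions: bounce_copy() and BOUNCE_SIZE are hypothetical names, ordinary memcpy() stands in for memcpy_fromio()/memcpy_toio(), and ordinary pointers stand in for the __iomem ones. The sketch sizes the final chunk to the bytes remaining, whereas the quoted hunk passes MEMCPY_BOUNCE_SIZE in its tail step.

```c
#include <stddef.h>
#include <string.h>

/* Same value as MEMCPY_BOUNCE_SIZE in the quoted patch. */
#define BOUNCE_SIZE 128

/*
 * Hypothetical userspace stand-in for the iomem-to-iomem branch:
 * copy len bytes from src to dst through a small on-stack bounce
 * buffer, one chunk at a time. The two memcpy() calls mark where the
 * kernel code would use memcpy_fromio() and memcpy_toio().
 */
static void bounce_copy(void *dst, const void *src, size_t len)
{
	unsigned char bounce[BOUNCE_SIZE];
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (len) {
		/* Full chunks first; the last chunk may be shorter. */
		size_t chunk = len < BOUNCE_SIZE ? len : BOUNCE_SIZE;

		memcpy(bounce, s, chunk);	/* source -> bounce buffer */
		memcpy(d, bounce, chunk);	/* bounce buffer -> destination */
		s += chunk;
		d += chunk;
		len -= chunk;
	}
}
```

Copying 300 bytes with this sketch, for example, takes two 128-byte chunks followed by one 44-byte chunk, and nothing past dst + 300 is written.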