From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 996D0C47096 for ; Mon, 31 May 2021 12:20:25 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 6952F6127C for ; Mon, 31 May 2021 12:20:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 6952F6127C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D902A6E902; Mon, 31 May 2021 12:20:15 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5BBBD6E8F4; Mon, 31 May 2021 12:20:13 +0000 (UTC) IronPort-SDR: wBZb5Rg3+M2zTpSfYBLi6j4y2nSroqPE6fRZ3ADXvW3OcdsC6pnL+6k5M09Rsj2H1PuPfxsr+s AkeTgqLOhTYQ== X-IronPort-AV: E=McAfee;i="6200,9189,10000"; a="183027490" X-IronPort-AV: E=Sophos;i="5.83,237,1616482800"; d="scan'208";a="183027490" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 May 2021 05:20:13 -0700 IronPort-SDR: J59TCZjIK5DxuvVopcgCOTDM31PG2KMAZMi/OtyDyxkif5IEI3Gv5Ggiycfe9UehB+4Trbfie6 nInTr2ubCzKw== X-IronPort-AV: E=Sophos;i="5.83,237,1616482800"; 
d="scan'208";a="473903922" Received: from fnygreen-mobl1.ger.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.133]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 May 2021 05:20:10 -0700 From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Date: Mon, 31 May 2021 14:19:32 +0200 Message-Id: <20210531121940.267032-8-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210531121940.267032-1-thomas.hellstrom@linux.intel.com> References: <20210531121940.267032-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v7 07/15] drm: Add a prefetching memcpy_from_wc X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= , =?UTF-8?q?Christian=20K=C3=B6nig?= Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" UmVhZGluZyBvdXQgb2Ygd3JpdGUtY29tYmluaW5nIG1hcHBlZCBtZW1vcnkgaXMgdHlwaWNhbGx5 IHZlcnkgc2xvdwpzaW5jZSB0aGUgQ1BVIGRvZXNuJ3QgcHJlZmV0Y2guIEhvd2V2ZXIgc29tZSBh cmNocyBoYXZlIHNwZWNpYWwKaW5zdHJ1Y3Rpb25zIHRvIGRvIHRoaXMuCgpTbyBhZGQgYSBiZXN0 LWVmZm9ydCBtZW1jcHlfZnJvbV93YyB0YWtpbmcgZG1hLWJ1Zi1tYXAgcG9pbnRlcgphcmd1bWVu dHMgdGhhdCBhdHRlbXB0cyB0byB1c2UgYSBmYXN0IHByZWZldGNoaW5nIG1lbWNweSBhbmQKb3Ro ZXJ3aXNlIGZhbGxzIGJhY2sgdG8gb3JkaW5hcnkgbWVtY29waWVzLCB0YWtpbmcgdGhlIGlvbWVt IHRhZ2dpbmcKaW50byBhY2NvdW50LgoKVGhlIGNvZGUgaXMgbGFyZ2VseSBjb3BpZWQgZnJvbSBp OTE1X21lbWNweV9mcm9tX3djLgoKQ2M6IERhbmllbCBWZXR0ZXIgPGRhbmllbEBmZndsbC5jaD4K Q2M6IENocmlzdGlhbiBLw7ZuaWcgPGNocmlzdGlhbi5rb2VuaWdAYW1kLmNvbT4KU3VnZ2VzdGVk LWJ5OiBEYW5pZWwgVmV0dGVyIDxkYW5pZWxAZmZ3bGwuY2g+ClNpZ25lZC1vZmYtYnk6IFRob21h 
cyBIZWxsc3Ryw7ZtIDx0aG9tYXMuaGVsbHN0cm9tQGxpbnV4LmludGVsLmNvbT4KLS0tCnY3Ogot IFBlcmZvcm0gYSBtZW1jcHkgZXZlbiBpZiB3YXJuaW5nIHdpdGggaW5faW50ZXJydXB0KCkuIFN1 Z2dlc3RlZCBieQogIENocmlzdGlhbiBLw7ZuaWcuCi0gRml4IGNvbXBpbGF0aW9uIGZhaWx1cmUg b24gIVg4NiAoUmVwb3J0ZWQgYnkga2VybmVsIHRlc3Qgcm9ib3QKICBsa3BAaW50ZWwuY29tKQot LS0KIERvY3VtZW50YXRpb24vZ3B1L2RybS1tbS5yc3QgfCAgIDIgKy0KIGRyaXZlcnMvZ3B1L2Ry bS9kcm1fY2FjaGUuYyAgfCAxNDcgKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysrKysK IGRyaXZlcnMvZ3B1L2RybS9kcm1fZHJ2LmMgICAgfCAgIDIgKwogaW5jbHVkZS9kcm0vZHJtX2Nh Y2hlLmggICAgICB8ICAgNyArKwogNCBmaWxlcyBjaGFuZ2VkLCAxNTcgaW5zZXJ0aW9ucygrKSwg MSBkZWxldGlvbigtKQoKZGlmZiAtLWdpdCBhL0RvY3VtZW50YXRpb24vZ3B1L2RybS1tbS5yc3Qg Yi9Eb2N1bWVudGF0aW9uL2dwdS9kcm0tbW0ucnN0CmluZGV4IDIxYmU2ZGVhZGMxMi4uYzY2MDU4 YzViY2U3IDEwMDY0NAotLS0gYS9Eb2N1bWVudGF0aW9uL2dwdS9kcm0tbW0ucnN0CisrKyBiL0Rv Y3VtZW50YXRpb24vZ3B1L2RybS1tbS5yc3QKQEAgLTQ2OSw3ICs0NjksNyBAQCBEUk0gTU0gUmFu Z2UgQWxsb2NhdG9yIEZ1bmN0aW9uIFJlZmVyZW5jZXMKIC4uIGtlcm5lbC1kb2M6OiBkcml2ZXJz L2dwdS9kcm0vZHJtX21tLmMKICAgIDpleHBvcnQ6CiAKLURSTSBDYWNoZSBIYW5kbGluZworRFJN IENhY2hlIEhhbmRsaW5nIGFuZCBGYXN0IFdDIG1lbWNweSgpCiA9PT09PT09PT09PT09PT09PT0K IAogLi4ga2VybmVsLWRvYzo6IGRyaXZlcnMvZ3B1L2RybS9kcm1fY2FjaGUuYwpkaWZmIC0tZ2l0 IGEvZHJpdmVycy9ncHUvZHJtL2RybV9jYWNoZS5jIGIvZHJpdmVycy9ncHUvZHJtL2RybV9jYWNo ZS5jCmluZGV4IDc5YTUwZWYxMjUwZi4uYjg4N2Q3ZGVjOGI4IDEwMDY0NAotLS0gYS9kcml2ZXJz L2dwdS9kcm0vZHJtX2NhY2hlLmMKKysrIGIvZHJpdmVycy9ncHUvZHJtL2RybV9jYWNoZS5jCkBA IC0yOCw2ICsyOCw3IEBACiAgKiBBdXRob3JzOiBUaG9tYXMgSGVsbHN0csO2bSA8dGhvbWFzLWF0 LXR1bmdzdGVuZ3JhcGhpY3MtZG90LWNvbT4KICAqLwogCisjaW5jbHVkZSA8bGludXgvZG1hLWJ1 Zi1tYXAuaD4KICNpbmNsdWRlIDxsaW51eC9leHBvcnQuaD4KICNpbmNsdWRlIDxsaW51eC9oaWdo bWVtLmg+CiAjaW5jbHVkZSA8bGludXgvbWVtX2VuY3J5cHQuaD4KQEAgLTM1LDYgKzM2LDkgQEAK IAogI2luY2x1ZGUgPGRybS9kcm1fY2FjaGUuaD4KIAorLyogQSBzbWFsbCBib3VuY2UgYnVmZmVy IHRoYXQgZml0cyBvbiB0aGUgc3RhY2suICovCisjZGVmaW5lIE1FTUNQWV9CT1VOQ0VfU0laRSAx 
MjgKKwogI2lmIGRlZmluZWQoQ09ORklHX1g4NikKICNpbmNsdWRlIDxhc20vc21wLmg+CiAKQEAg LTIwOSwzICsyMTMsMTQ2IEBAIGJvb2wgZHJtX25lZWRfc3dpb3RsYihpbnQgZG1hX2JpdHMpCiAJ cmV0dXJuIG1heF9pb21lbSA+ICgodTY0KTEgPDwgZG1hX2JpdHMpOwogfQogRVhQT1JUX1NZTUJP TChkcm1fbmVlZF9zd2lvdGxiKTsKKworc3RhdGljIHZvaWQgbWVtY3B5X2ZhbGxiYWNrKHN0cnVj dCBkbWFfYnVmX21hcCAqZHN0LAorCQkJICAgIGNvbnN0IHN0cnVjdCBkbWFfYnVmX21hcCAqc3Jj LAorCQkJICAgIHVuc2lnbmVkIGxvbmcgbGVuKQoreworCWlmICghZHN0LT5pc19pb21lbSAmJiAh c3JjLT5pc19pb21lbSkgeworCQltZW1jcHkoZHN0LT52YWRkciwgc3JjLT52YWRkciwgbGVuKTsK Kwl9IGVsc2UgaWYgKCFzcmMtPmlzX2lvbWVtKSB7CisJCWRtYV9idWZfbWFwX21lbWNweV90byhk c3QsIHNyYy0+dmFkZHIsIGxlbik7CisJfSBlbHNlIGlmICghZHN0LT5pc19pb21lbSkgeworCQlt ZW1jcHlfZnJvbWlvKGRzdC0+dmFkZHIsIHNyYy0+dmFkZHJfaW9tZW0sIGxlbik7CisJfSBlbHNl IHsKKwkJLyoKKwkJICogQm91bmNlIHNpemUgaXMgbm90IHBlcmZvcm1hbmNlIHR1bmVkLCBidXQg dXNpbmcgYQorCQkgKiBib3VuY2UgYnVmZmVyIGxpa2UgdGhpcyBpcyBzaWduaWZpY2FudGx5IGZh c3RlciB0aGFuCisJCSAqIHJlc29ydGluZyB0byBpb3JlYWR4eCgpICsgaW93cml0ZXh4KCkuCisJ CSAqLworCQljaGFyIGJvdW5jZVtNRU1DUFlfQk9VTkNFX1NJWkVdOworCQl2b2lkIF9faW9tZW0g Kl9zcmMgPSBzcmMtPnZhZGRyX2lvbWVtOworCQl2b2lkIF9faW9tZW0gKl9kc3QgPSBkc3QtPnZh ZGRyX2lvbWVtOworCisJCXdoaWxlIChsZW4gPj0gTUVNQ1BZX0JPVU5DRV9TSVpFKSB7CisJCQlt ZW1jcHlfZnJvbWlvKGJvdW5jZSwgX3NyYywgTUVNQ1BZX0JPVU5DRV9TSVpFKTsKKwkJCW1lbWNw eV90b2lvKF9kc3QsIGJvdW5jZSwgTUVNQ1BZX0JPVU5DRV9TSVpFKTsKKwkJCV9zcmMgKz0gTUVN Q1BZX0JPVU5DRV9TSVpFOworCQkJX2RzdCArPSBNRU1DUFlfQk9VTkNFX1NJWkU7CisJCQlsZW4g LT0gTUVNQ1BZX0JPVU5DRV9TSVpFOworCQl9CisJCWlmIChsZW4pIHsKKwkJCW1lbWNweV9mcm9t aW8oYm91bmNlLCBfc3JjLCBNRU1DUFlfQk9VTkNFX1NJWkUpOworCQkJbWVtY3B5X3RvaW8oX2Rz dCwgYm91bmNlLCBNRU1DUFlfQk9VTkNFX1NJWkUpOworCQl9CisJfQorfQorCisjaWZkZWYgQ09O RklHX1g4NgorCitzdGF0aWMgREVGSU5FX1NUQVRJQ19LRVlfRkFMU0UoaGFzX21vdm50ZHFhKTsK Kworc3RhdGljIHZvaWQgX19tZW1jcHlfbnRkcWEodm9pZCAqZHN0LCBjb25zdCB2b2lkICpzcmMs IHVuc2lnbmVkIGxvbmcgbGVuKQoreworCWtlcm5lbF9mcHVfYmVnaW4oKTsKKworCXdoaWxlIChs 
ZW4gPj0gNCkgeworCQlhc20oIm1vdm50ZHFhCSglMCksICUleG1tMFxuIgorCQkgICAgIm1vdm50 ZHFhIDE2KCUwKSwgJSV4bW0xXG4iCisJCSAgICAibW92bnRkcWEgMzIoJTApLCAlJXhtbTJcbiIK KwkJICAgICJtb3ZudGRxYSA0OCglMCksICUleG1tM1xuIgorCQkgICAgIm1vdmFwcyAlJXhtbTAs ICAgKCUxKVxuIgorCQkgICAgIm1vdmFwcyAlJXhtbTEsIDE2KCUxKVxuIgorCQkgICAgIm1vdmFw cyAlJXhtbTIsIDMyKCUxKVxuIgorCQkgICAgIm1vdmFwcyAlJXhtbTMsIDQ4KCUxKVxuIgorCQkg ICAgOjogInIiIChzcmMpLCAiciIgKGRzdCkgOiAibWVtb3J5Iik7CisJCXNyYyArPSA2NDsKKwkJ ZHN0ICs9IDY0OworCQlsZW4gLT0gNDsKKwl9CisJd2hpbGUgKGxlbi0tKSB7CisJCWFzbSgibW92 bnRkcWEgKCUwKSwgJSV4bW0wXG4iCisJCSAgICAibW92YXBzICUleG1tMCwgKCUxKVxuIgorCQkg ICAgOjogInIiIChzcmMpLCAiciIgKGRzdCkgOiAibWVtb3J5Iik7CisJCXNyYyArPSAxNjsKKwkJ ZHN0ICs9IDE2OworCX0KKworCWtlcm5lbF9mcHVfZW5kKCk7Cit9CisKKy8qCisgKiBfX2RybV9t ZW1jcHlfZnJvbV93YyBjb3BpZXMgQGxlbiBieXRlcyBmcm9tIEBzcmMgdG8gQGRzdCB1c2luZwor ICogbm9uLXRlbXBvcmFsIGluc3RydWN0aW9ucyB3aGVyZSBhdmFpbGFibGUuIE5vdGUgdGhhdCBh bGwgYXJndW1lbnRzCisgKiAoQHNyYywgQGRzdCkgbXVzdCBiZSBhbGlnbmVkIHRvIDE2IGJ5dGVz IGFuZCBAbGVuIG11c3QgYmUgYSBtdWx0aXBsZQorICogb2YgMTYuCisgKi8KK3N0YXRpYyB2b2lk IF9fZHJtX21lbWNweV9mcm9tX3djKHZvaWQgKmRzdCwgY29uc3Qgdm9pZCAqc3JjLCB1bnNpZ25l ZCBsb25nIGxlbikKK3sKKwlpZiAodW5saWtlbHkoKCh1bnNpZ25lZCBsb25nKWRzdCB8ICh1bnNp Z25lZCBsb25nKXNyYyB8IGxlbikgJiAxNSkpCisJCW1lbWNweShkc3QsIHNyYywgbGVuKTsKKwll bHNlIGlmIChsaWtlbHkobGVuKSkKKwkJX19tZW1jcHlfbnRkcWEoZHN0LCBzcmMsIGxlbiA+PiA0 KTsKK30KKworLyoqCisgKiBkcm1fbWVtY3B5X2Zyb21fd2MgLSBQZXJmb3JtIHRoZSBmYXN0ZXN0 IGF2YWlsYWJsZSBtZW1jcHkgZnJvbSBhIHNvdXJjZQorICogdGhhdCBtYXkgYmUgV0MuCisgKiBA ZHN0OiBUaGUgZGVzdGluYXRpb24gcG9pbnRlcgorICogQHNyYzogVGhlIHNvdXJjZSBwb2ludGVy CisgKiBAbGVuOiBUaGUgc2l6ZSBvZiB0aGUgYXJlYSBvIHRyYW5zZmVyIGluIGJ5dGVzCisgKgor ICogVHJpZXMgYW4gYXJjaCBvcHRpbWl6ZWQgbWVtY3B5IGZvciBwcmVmZXRjaGluZyByZWFkaW5n IG91dCBvZiBhIFdDIHJlZ2lvbiwKKyAqIGFuZCBpZiBubyBzdWNoIGJlYXN0IGlzIGF2YWlsYWJs ZSwgZmFsbHMgYmFjayB0byBhIG5vcm1hbCBtZW1jcHkuCisgKi8KK3ZvaWQgZHJtX21lbWNweV9m 
cm9tX3djKHN0cnVjdCBkbWFfYnVmX21hcCAqZHN0LAorCQkJY29uc3Qgc3RydWN0IGRtYV9idWZf bWFwICpzcmMsCisJCQl1bnNpZ25lZCBsb25nIGxlbikKK3sKKwlpZiAoV0FSTl9PTihpbl9pbnRl cnJ1cHQoKSkpIHsKKwkJbWVtY3B5X2ZhbGxiYWNrKGRzdCwgc3JjLCBsZW4pOworCQlyZXR1cm47 CisJfQorCisJaWYgKHN0YXRpY19icmFuY2hfbGlrZWx5KCZoYXNfbW92bnRkcWEpKSB7CisJCV9f ZHJtX21lbWNweV9mcm9tX3djKGRzdC0+aXNfaW9tZW0gPworCQkJCSAgICAgKHZvaWQgX19mb3Jj ZSAqKWRzdC0+dmFkZHJfaW9tZW0gOgorCQkJCSAgICAgZHN0LT52YWRkciwKKwkJCQkgICAgIHNy Yy0+aXNfaW9tZW0gPworCQkJCSAgICAgKHZvaWQgY29uc3QgX19mb3JjZSAqKXNyYy0+dmFkZHJf aW9tZW0gOgorCQkJCSAgICAgc3JjLT52YWRkciwKKwkJCQkgICAgIGxlbik7CisJCXJldHVybjsK Kwl9CisKKwltZW1jcHlfZmFsbGJhY2soZHN0LCBzcmMsIGxlbik7Cit9CitFWFBPUlRfU1lNQk9M KGRybV9tZW1jcHlfZnJvbV93Yyk7CisKKy8qKgorICogZHJtX21lbWNweV9pbml0X2Vhcmx5IC0g T25lIHRpbWUgaW5pdGlhbGl6YXRpb24gb2YgdGhlIFdDIG1lbWNweSBjb2RlCisgKi8KK3ZvaWQg ZHJtX21lbWNweV9pbml0X2Vhcmx5KHZvaWQpCit7CisJLyoKKwkgKiBTb21lIGh5cGVydmlzb3Jz IChlLmcuIEtWTSkgZG9uJ3Qgc3VwcG9ydCBWRVgtcHJlZml4IGluc3RydWN0aW9ucworCSAqIGVt dWxhdGlvbi4gU28gZG9uJ3QgZW5hYmxlIG1vdm50ZHFhIGluIGh5cGVydmlzb3IgZ3Vlc3QuCisJ ICovCisJaWYgKHN0YXRpY19jcHVfaGFzKFg4Nl9GRUFUVVJFX1hNTTRfMSkgJiYKKwkgICAgIWJv b3RfY3B1X2hhcyhYODZfRkVBVFVSRV9IWVBFUlZJU09SKSkKKwkJc3RhdGljX2JyYW5jaF9lbmFi bGUoJmhhc19tb3ZudGRxYSk7Cit9CisjZWxzZQordm9pZCBkcm1fbWVtY3B5X2Zyb21fd2Moc3Ry dWN0IGRtYV9idWZfbWFwICpkc3QsCisJCQljb25zdCBzdHJ1Y3QgZG1hX2J1Zl9tYXAgKnNyYywK KwkJCXVuc2lnbmVkIGxvbmcgbGVuKQoreworCVdBUk5fT04oaW5faW50ZXJydXB0KCkpOworCisJ bWVtY3B5X2ZhbGxiYWNrKGRzdCwgc3JjLCBsZW4pOworfQorCit2b2lkIGRybV9tZW1jcHlfaW5p dF9lYXJseSh2b2lkKQoreworfQorI2VuZGlmIC8qIENPTkZJR19YODYgKi8KZGlmZiAtLWdpdCBh L2RyaXZlcnMvZ3B1L2RybS9kcm1fZHJ2LmMgYi9kcml2ZXJzL2dwdS9kcm0vZHJtX2Rydi5jCmlu ZGV4IDNkOGQ2OGE5OGI5NS4uODgwNGVjN2QzMjE1IDEwMDY0NAotLS0gYS9kcml2ZXJzL2dwdS9k cm0vZHJtX2Rydi5jCisrKyBiL2RyaXZlcnMvZ3B1L2RybS9kcm1fZHJ2LmMKQEAgLTM1LDYgKzM1 LDcgQEAKICNpbmNsdWRlIDxsaW51eC9zbGFiLmg+CiAjaW5jbHVkZSA8bGludXgvc3JjdS5oPgog 
CisjaW5jbHVkZSA8ZHJtL2RybV9jYWNoZS5oPgogI2luY2x1ZGUgPGRybS9kcm1fY2xpZW50Lmg+ CiAjaW5jbHVkZSA8ZHJtL2RybV9jb2xvcl9tZ210Lmg+CiAjaW5jbHVkZSA8ZHJtL2RybV9kcnYu aD4KQEAgLTEwNDEsNiArMTA0Miw3IEBAIHN0YXRpYyBpbnQgX19pbml0IGRybV9jb3JlX2luaXQo dm9pZCkKIAogCWRybV9jb25uZWN0b3JfaWRhX2luaXQoKTsKIAlpZHJfaW5pdCgmZHJtX21pbm9y c19pZHIpOworCWRybV9tZW1jcHlfaW5pdF9lYXJseSgpOwogCiAJcmV0ID0gZHJtX3N5c2ZzX2lu aXQoKTsKIAlpZiAocmV0IDwgMCkgewpkaWZmIC0tZ2l0IGEvaW5jbHVkZS9kcm0vZHJtX2NhY2hl LmggYi9pbmNsdWRlL2RybS9kcm1fY2FjaGUuaAppbmRleCBlOWFkNDg2M2Q5MTUuLmNjOWRlMTYz MmRkMyAxMDA2NDQKLS0tIGEvaW5jbHVkZS9kcm0vZHJtX2NhY2hlLmgKKysrIGIvaW5jbHVkZS9k cm0vZHJtX2NhY2hlLmgKQEAgLTM1LDYgKzM1LDggQEAKIAogI2luY2x1ZGUgPGxpbnV4L3NjYXR0 ZXJsaXN0Lmg+CiAKK3N0cnVjdCBkbWFfYnVmX21hcDsKKwogdm9pZCBkcm1fY2xmbHVzaF9wYWdl cyhzdHJ1Y3QgcGFnZSAqcGFnZXNbXSwgdW5zaWduZWQgbG9uZyBudW1fcGFnZXMpOwogdm9pZCBk cm1fY2xmbHVzaF9zZyhzdHJ1Y3Qgc2dfdGFibGUgKnN0KTsKIHZvaWQgZHJtX2NsZmx1c2hfdmly dF9yYW5nZSh2b2lkICphZGRyLCB1bnNpZ25lZCBsb25nIGxlbmd0aCk7CkBAIC03MCw0ICs3Miw5 IEBAIHN0YXRpYyBpbmxpbmUgYm9vbCBkcm1fYXJjaF9jYW5fd2NfbWVtb3J5KHZvaWQpCiAjZW5k aWYKIH0KIAordm9pZCBkcm1fbWVtY3B5X2luaXRfZWFybHkodm9pZCk7CisKK3ZvaWQgZHJtX21l bWNweV9mcm9tX3djKHN0cnVjdCBkbWFfYnVmX21hcCAqZHN0LAorCQkJY29uc3Qgc3RydWN0IGRt YV9idWZfbWFwICpzcmMsCisJCQl1bnNpZ25lZCBsb25nIGxlbik7CiAjZW5kaWYKLS0gCjIuMzEu MQoKX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KSW50ZWwt Z2Z4IG1haWxpbmcgbGlzdApJbnRlbC1nZnhAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8v bGlzdHMuZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vaW50ZWwtZ2Z4Cg== From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org 
[198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8784BC47083 for ; Mon, 31 May 2021 12:20:26 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 4ED866127C for ; Mon, 31 May 2021 12:20:26 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4ED866127C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 5CEC46E909; Mon, 31 May 2021 12:20:16 +0000 (UTC) Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5BBBD6E8F4; Mon, 31 May 2021 12:20:13 +0000 (UTC) IronPort-SDR: wBZb5Rg3+M2zTpSfYBLi6j4y2nSroqPE6fRZ3ADXvW3OcdsC6pnL+6k5M09Rsj2H1PuPfxsr+s AkeTgqLOhTYQ== X-IronPort-AV: E=McAfee;i="6200,9189,10000"; a="183027490" X-IronPort-AV: E=Sophos;i="5.83,237,1616482800"; d="scan'208";a="183027490" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 May 2021 05:20:13 -0700 IronPort-SDR: J59TCZjIK5DxuvVopcgCOTDM31PG2KMAZMi/OtyDyxkif5IEI3Gv5Ggiycfe9UehB+4Trbfie6 nInTr2ubCzKw== X-IronPort-AV: E=Sophos;i="5.83,237,1616482800"; d="scan'208";a="473903922" Received: from fnygreen-mobl1.ger.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.133]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 May 2021 05:20:10 -0700 From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org Subject: [PATCH v7 07/15] drm: Add a prefetching memcpy_from_wc Date: Mon, 31 May 2021 14:19:32 +0200 Message-Id: 
<20210531121940.267032-8-thomas.hellstrom@linux.intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20210531121940.267032-1-thomas.hellstrom@linux.intel.com> References: <20210531121940.267032-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= , =?UTF-8?q?Christian=20K=C3=B6nig?= Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" Reading out of write-combining mapped memory is typically very slow since the CPU doesn't prefetch. However, some archs have special instructions to do this. So add a best-effort memcpy_from_wc taking dma-buf-map pointer arguments that attempts to use a fast prefetching memcpy and otherwise falls back to ordinary memcopies, taking the iomem tagging into account. The code is largely copied from i915_memcpy_from_wc. Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Christian König <christian.koenig@amd.com> Suggested-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> --- v7: - Perform a memcpy even if warning with in_interrupt(). Suggested by Christian König. - Fix compilation failure on !X86 (Reported by kernel test robot lkp@intel.com) --- Documentation/gpu/drm-mm.rst | 2 +- drivers/gpu/drm/drm_cache.c | 147 +++++++++++++++++++++++++++++++++++ drivers/gpu/drm/drm_drv.c | 2 + include/drm/drm_cache.h | 7 ++ 4 files changed, 157 insertions(+), 1 deletion(-) diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst index 21be6deadc12..c66058c5bce7 100644 --- a/Documentation/gpu/drm-mm.rst +++ b/Documentation/gpu/drm-mm.rst @@ -469,7 +469,7 @@ DRM MM Range Allocator Function References .. kernel-doc:: drivers/gpu/drm/drm_mm.c :export: -DRM Cache Handling +DRM Cache Handling and Fast WC memcpy() ================== .. 
kernel-doc:: drivers/gpu/drm/drm_cache.c diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c index 79a50ef1250f..b887d7dec8b8 100644 --- a/drivers/gpu/drm/drm_cache.c +++ b/drivers/gpu/drm/drm_cache.c @@ -28,6 +28,7 @@ * Authors: Thomas Hellström <thomas-at-tungstengraphics-dot-com> */ +#include <linux/dma-buf-map.h> #include <linux/export.h> #include <linux/highmem.h> #include <linux/mem_encrypt.h> @@ -35,6 +36,9 @@ #include <drm/drm_cache.h> +/* A small bounce buffer that fits on the stack. */ +#define MEMCPY_BOUNCE_SIZE 128 + #if defined(CONFIG_X86) #include <asm/smp.h> @@ -209,3 +213,146 @@ bool drm_need_swiotlb(int dma_bits) return max_iomem > ((u64)1 << dma_bits); } EXPORT_SYMBOL(drm_need_swiotlb); + +static void memcpy_fallback(struct dma_buf_map *dst, + const struct dma_buf_map *src, + unsigned long len) +{ + if (!dst->is_iomem && !src->is_iomem) { + memcpy(dst->vaddr, src->vaddr, len); + } else if (!src->is_iomem) { + dma_buf_map_memcpy_to(dst, src->vaddr, len); + } else if (!dst->is_iomem) { + memcpy_fromio(dst->vaddr, src->vaddr_iomem, len); + } else { + /* + * Bounce size is not performance tuned, but using a + * bounce buffer like this is significantly faster than + * resorting to ioreadxx() + iowritexx(). 
+ */ + char bounce[MEMCPY_BOUNCE_SIZE]; + void __iomem *_src = src->vaddr_iomem; + void __iomem *_dst = dst->vaddr_iomem; + + while (len >= MEMCPY_BOUNCE_SIZE) { + memcpy_fromio(bounce, _src, MEMCPY_BOUNCE_SIZE); + memcpy_toio(_dst, bounce, MEMCPY_BOUNCE_SIZE); + _src += MEMCPY_BOUNCE_SIZE; + _dst += MEMCPY_BOUNCE_SIZE; + len -= MEMCPY_BOUNCE_SIZE; + } + if (len) { + memcpy_fromio(bounce, _src, len); + memcpy_toio(_dst, bounce, len); + } + } +} + +#ifdef CONFIG_X86 + +static DEFINE_STATIC_KEY_FALSE(has_movntdqa); + +static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len) +{ + kernel_fpu_begin(); + + while (len >= 4) { + asm("movntdqa (%0), %%xmm0\n" + "movntdqa 16(%0), %%xmm1\n" + "movntdqa 32(%0), %%xmm2\n" + "movntdqa 48(%0), %%xmm3\n" + "movaps %%xmm0, (%1)\n" + "movaps %%xmm1, 16(%1)\n" + "movaps %%xmm2, 32(%1)\n" + "movaps %%xmm3, 48(%1)\n" + :: "r" (src), "r" (dst) : "memory"); + src += 64; + dst += 64; + len -= 4; + } + while (len--) { + asm("movntdqa (%0), %%xmm0\n" + "movaps %%xmm0, (%1)\n" + :: "r" (src), "r" (dst) : "memory"); + src += 16; + dst += 16; + } + + kernel_fpu_end(); +} + +/* + * __drm_memcpy_from_wc copies @len bytes from @src to @dst using + * non-temporal instructions where available. Note that all arguments + * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple + * of 16. + */ +static void __drm_memcpy_from_wc(void *dst, const void *src, unsigned long len) +{ + if (unlikely(((unsigned long)dst | (unsigned long)src | len) & 15)) + memcpy(dst, src, len); + else if (likely(len)) + __memcpy_ntdqa(dst, src, len >> 4); +} + +/** + * drm_memcpy_from_wc - Perform the fastest available memcpy from a source + * that may be WC. 
+ * @dst: The destination pointer + * @src: The source pointer + * @len: The size of the area to transfer in bytes + * + * Tries an arch optimized memcpy for prefetching reading out of a WC region, + * and if no such beast is available, falls back to a normal memcpy. + */ +void drm_memcpy_from_wc(struct dma_buf_map *dst, + const struct dma_buf_map *src, + unsigned long len) +{ + if (WARN_ON(in_interrupt())) { + memcpy_fallback(dst, src, len); + return; + } + + if (static_branch_likely(&has_movntdqa)) { + __drm_memcpy_from_wc(dst->is_iomem ? + (void __force *)dst->vaddr_iomem : + dst->vaddr, + src->is_iomem ? + (void const __force *)src->vaddr_iomem : + src->vaddr, + len); + return; + } + + memcpy_fallback(dst, src, len); +} +EXPORT_SYMBOL(drm_memcpy_from_wc); + +/** + * drm_memcpy_init_early - One time initialization of the WC memcpy code + */ +void drm_memcpy_init_early(void) +{ + /* + * Some hypervisors (e.g. KVM) don't support VEX-prefix instructions + * emulation. So don't enable movntdqa in hypervisor guest. 
+ */ + if (static_cpu_has(X86_FEATURE_XMM4_1) && + !boot_cpu_has(X86_FEATURE_HYPERVISOR)) + static_branch_enable(&has_movntdqa); +} +#else +void drm_memcpy_from_wc(struct dma_buf_map *dst, + const struct dma_buf_map *src, + unsigned long len) +{ + WARN_ON(in_interrupt()); + + memcpy_fallback(dst, src, len); +} + +void drm_memcpy_init_early(void) +{ +} +#endif /* CONFIG_X86 */ diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 3d8d68a98b95..8804ec7d3215 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -35,6 +35,7 @@ #include <linux/slab.h> #include <linux/srcu.h> +#include <drm/drm_cache.h> #include <drm/drm_client.h> #include <drm/drm_color_mgmt.h> #include <drm/drm_drv.h> @@ -1041,6 +1042,7 @@ static int __init drm_core_init(void) drm_connector_ida_init(); idr_init(&drm_minors_idr); + drm_memcpy_init_early(); ret = drm_sysfs_init(); if (ret < 0) { diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h index e9ad4863d915..cc9de1632dd3 100644 --- a/include/drm/drm_cache.h +++ b/include/drm/drm_cache.h @@ -35,6 +35,8 @@ #include <linux/scatterlist.h> +struct dma_buf_map; + void drm_clflush_pages(struct page *pages[], unsigned long num_pages); void drm_clflush_sg(struct sg_table *st); void drm_clflush_virt_range(void *addr, unsigned long length); @@ -70,4 +72,9 @@ static inline bool drm_arch_can_wc_memory(void) #endif } +void drm_memcpy_init_early(void); + +void drm_memcpy_from_wc(struct dma_buf_map *dst, + const struct dma_buf_map *src, + unsigned long len); #endif -- 2.31.1
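Illustration only, not part of the patch: the iomem-to-iomem branch of memcpy_fallback() moves data through a small on-stack bounce buffer, and the last chunk may be shorter than the buffer. The user-space sketch below shows that chunking pattern under stated assumptions. The names bounce_copy(), fake_fromio(), fake_toio() and bounce_copy_selftest() are hypothetical, and plain memcpy() stands in for memcpy_fromio()/memcpy_toio() so the code can be built outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same value as MEMCPY_BOUNCE_SIZE in the patch. */
#define BOUNCE_SIZE 128

/*
 * Hypothetical user-space stand-ins for memcpy_fromio()/memcpy_toio().
 * Using plain memcpy() here is an assumption made only so that the
 * sketch runs anywhere; real I/O memory needs the kernel helpers.
 */
static void fake_fromio(void *dst, const unsigned char *src, size_t n)
{
	memcpy(dst, src, n);
}

static void fake_toio(unsigned char *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
}

/* Copy len bytes from src to dst through a small on-stack bounce buffer. */
static void bounce_copy(unsigned char *dst, const unsigned char *src,
			size_t len)
{
	unsigned char bounce[BOUNCE_SIZE];

	assert(dst && src);

	while (len) {
		/* Never move more than what is left: the tail may be short. */
		size_t chunk = len < BOUNCE_SIZE ? len : BOUNCE_SIZE;

		fake_fromio(bounce, src, chunk);
		fake_toio(dst, bounce, chunk);
		src += chunk;
		dst += chunk;
		len -= chunk;
	}
}

/* Returns 0 if a 300-byte copy is exact and leaves the guard bytes alone. */
static int bounce_copy_selftest(void)
{
	unsigned char src[300], dst[300 + 16];
	size_t i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i * 7 + 1);
	memset(dst, 0xAA, sizeof(dst));

	bounce_copy(dst, src, sizeof(src));

	if (memcmp(dst, src, sizeof(src)) != 0)
		return -1;
	for (i = sizeof(src); i < sizeof(dst); i++)
		if (dst[i] != 0xAA)
			return -2;
	return 0;
}
```

Folding the tail into the same loop via a per-iteration chunk size is one possible design choice; it keeps a single copy path, at the cost of one extra comparison per chunk.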