From mboxrd@z Thu Jan 1 00:00:00
1970
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,
	HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH,
	MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT
	autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id AF6BBC47092
	for ; Wed, 2 Jun 2021 08:38:53 +0000 (UTC)
Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mail.kernel.org (Postfix) with ESMTPS id 7B31B613D8
	for ; Wed, 2 Jun 2021 08:38:53 +0000 (UTC)
DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 7B31B613D8
Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com
Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=dri-devel-bounces@lists.freedesktop.org
Received: from gabe.freedesktop.org (localhost [127.0.0.1])
	by gabe.freedesktop.org (Postfix) with ESMTP id A4C376EB92;
	Wed, 2 Jun 2021 08:38:40 +0000 (UTC)
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
	by gabe.freedesktop.org (Postfix) with ESMTPS id 458BD6EB8F;
	Wed, 2 Jun 2021 08:38:39 +0000 (UTC)
IronPort-SDR: BFYHLoKDMZign0fH+LdcOxU+KbW7HNC/5nkBpnlPiTBKL6RPigifbm/FkO1i0B2nj8PhIvg2Rr
	wzm+s82M+Svg==
X-IronPort-AV: E=McAfee;i="6200,9189,10002"; a="225026229"
X-IronPort-AV: E=Sophos;i="5.83,241,1616482800";
	d="scan'208";a="225026229"
Received: from fmsmga004.fm.intel.com ([10.253.24.48])
	by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Jun 2021 01:38:39 -0700
IronPort-SDR: QB+PbNcPMU6HTxD1L7t79aTgqcr0f556y5K+5/3YZkFAPMSrpGZKp4ry58Y2/cF+b/q9GcrkBr
	qLMimEqabTvg==
X-IronPort-AV: E=Sophos;i="5.83,241,1616482800";
	d="scan'208";a="467376319"
Received: from lmarkel-mobl1.ger.corp.intel.com (HELO thellst-mobl1.intel.com) ([10.249.254.49])
	by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Jun 2021 01:38:37 -0700
From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?=
To: intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: [PATCH v10 04/11] drm: Add a prefetching memcpy_from_wc
Date: Wed, 2 Jun 2021 10:38:11 +0200
Message-Id: <20210602083818.241793-5-thomas.hellstrom@linux.intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20210602083818.241793-1-thomas.hellstrom@linux.intel.com>
References: <20210602083818.241793-1-thomas.hellstrom@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: dri-devel@lists.freedesktop.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Direct Rendering Infrastructure - Development
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: =?UTF-8?q?Thomas=20Hellstr=C3=B6m?= , =?UTF-8?q?Christian=20K=C3=B6nig?=
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"

Reading out of write-combining mapped memory is typically very slow
since the CPU doesn't prefetch. However some archs have special
instructions to do this.

So add a best-effort memcpy_from_wc taking dma-buf-map pointer
arguments that attempts to use a fast prefetching memcpy and
otherwise falls back to ordinary memcopies, taking the iomem tagging
into account.

The code is largely copied from i915_memcpy_from_wc.

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian König <christian.koenig@amd.com>
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
---
v7:
- Perform a memcpy even if warning with in_interrupt(). Suggested by
  Christian König.
- Fix compilation failure on !X86 (Reported by kernel test robot
  lkp@intel.com)
v8:
- Skip kerneldoc for drm_memcpy_init_early()
- Export drm_memcpy_from_wc() also for non-x86.
v10:
- Fix a kerneldoc title underline
---
 Documentation/gpu/drm-mm.rst |   4 +-
 drivers/gpu/drm/drm_cache.c  | 148 +++++++++++++++++++++++++++++++++++
 drivers/gpu/drm/drm_drv.c    |   2 +
 include/drm/drm_cache.h      |   7 ++
 4 files changed, 159 insertions(+), 2 deletions(-)

diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index 21be6deadc12..d5a73fa2c9ef 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -469,8 +469,8 @@ DRM MM Range Allocator Function References
 .. kernel-doc:: drivers/gpu/drm/drm_mm.c
    :export:
 
-DRM Cache Handling
-==================
+DRM Cache Handling and Fast WC memcpy()
+=======================================
 
 .. kernel-doc:: drivers/gpu/drm/drm_cache.c
    :export:
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 79a50ef1250f..546599f19a93 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -28,6 +28,7 @@
  * Authors: Thomas Hellström <thomas-at-tungstengraphics-dot-com>
  */
 
+#include <linux/dma-buf-map.h>
 #include <linux/export.h>
 #include <linux/highmem.h>
 #include <linux/mem_encrypt.h>
@@ -35,6 +36,9 @@
 
 #include <drm/drm_cache.h>
 
+/* A small bounce buffer that fits on the stack. */
+#define MEMCPY_BOUNCE_SIZE 128
+
 #if defined(CONFIG_X86)
 #include <asm/smp.h>
 
@@ -209,3 +213,147 @@ bool drm_need_swiotlb(int dma_bits)
 	return max_iomem > ((u64)1 << dma_bits);
 }
 EXPORT_SYMBOL(drm_need_swiotlb);
+
+static void memcpy_fallback(struct dma_buf_map *dst,
+			    const struct dma_buf_map *src,
+			    unsigned long len)
+{
+	if (!dst->is_iomem && !src->is_iomem) {
+		memcpy(dst->vaddr, src->vaddr, len);
+	} else if (!src->is_iomem) {
+		dma_buf_map_memcpy_to(dst, src->vaddr, len);
+	} else if (!dst->is_iomem) {
+		memcpy_fromio(dst->vaddr, src->vaddr_iomem, len);
+	} else {
+		/*
+		 * Bounce size is not performance tuned, but using a
+		 * bounce buffer like this is significantly faster than
+		 * resorting to ioreadxx() + iowritexx().
+		 */
+		char bounce[MEMCPY_BOUNCE_SIZE];
+		void __iomem *_src = src->vaddr_iomem;
+		void __iomem *_dst = dst->vaddr_iomem;
+
+		while (len >= MEMCPY_BOUNCE_SIZE) {
+			memcpy_fromio(bounce, _src, MEMCPY_BOUNCE_SIZE);
+			memcpy_toio(_dst, bounce, MEMCPY_BOUNCE_SIZE);
+			_src += MEMCPY_BOUNCE_SIZE;
+			_dst += MEMCPY_BOUNCE_SIZE;
+			len -= MEMCPY_BOUNCE_SIZE;
+		}
+		if (len) {
+			memcpy_fromio(bounce, _src, len);
+			memcpy_toio(_dst, bounce, len);
+		}
+	}
+}
+
+#ifdef CONFIG_X86
+
+static DEFINE_STATIC_KEY_FALSE(has_movntdqa);
+
+static void __memcpy_ntdqa(void *dst, const void *src, unsigned long len)
+{
+	kernel_fpu_begin();
+
+	while (len >= 4) {
+		asm("movntdqa	(%0), %%xmm0\n"
+		    "movntdqa 16(%0), %%xmm1\n"
+		    "movntdqa 32(%0), %%xmm2\n"
+		    "movntdqa 48(%0), %%xmm3\n"
+		    "movaps %%xmm0,   (%1)\n"
+		    "movaps %%xmm1, 16(%1)\n"
+		    "movaps %%xmm2, 32(%1)\n"
+		    "movaps %%xmm3, 48(%1)\n"
+		    :: "r" (src), "r" (dst) : "memory");
+		src += 64;
+		dst += 64;
+		len -= 4;
+	}
+	while (len--) {
+		asm("movntdqa (%0), %%xmm0\n"
+		    "movaps %%xmm0, (%1)\n"
+		    :: "r" (src), "r" (dst) : "memory");
+		src += 16;
+		dst += 16;
+	}
+
+	kernel_fpu_end();
+}
+
+/*
+ * __drm_memcpy_from_wc copies @len bytes from @src to @dst using
+ * non-temporal instructions where available. Note that all arguments
+ * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple
+ * of 16.
+ */
+static void __drm_memcpy_from_wc(void *dst, const void *src, unsigned long len)
+{
+	if (unlikely(((unsigned long)dst | (unsigned long)src | len) & 15))
+		memcpy(dst, src, len);
+	else if (likely(len))
+		__memcpy_ntdqa(dst, src, len >> 4);
+}
+
+/**
+ * drm_memcpy_from_wc - Perform the fastest available memcpy from a source
+ * that may be WC.
+ * @dst: The destination pointer
+ * @src: The source pointer
+ * @len: The size of the area to transfer in bytes
+ *
+ * Tries an arch optimized memcpy for prefetching reading out of a WC region,
+ * and if no such beast is available, falls back to a normal memcpy.
+ */
+void drm_memcpy_from_wc(struct dma_buf_map *dst,
+			const struct dma_buf_map *src,
+			unsigned long len)
+{
+	if (WARN_ON(in_interrupt())) {
+		memcpy_fallback(dst, src, len);
+		return;
+	}
+
+	if (static_branch_likely(&has_movntdqa)) {
+		__drm_memcpy_from_wc(dst->is_iomem ?
+				     (void __force *)dst->vaddr_iomem :
+				     dst->vaddr,
+				     src->is_iomem ?
+				     (void const __force *)src->vaddr_iomem :
+				     src->vaddr,
+				     len);
+		return;
+	}
+
+	memcpy_fallback(dst, src, len);
+}
+EXPORT_SYMBOL(drm_memcpy_from_wc);
+
+/*
+ * drm_memcpy_init_early - One time initialization of the WC memcpy code
+ */
+void drm_memcpy_init_early(void)
+{
+	/*
+	 * Some hypervisors (e.g. KVM) don't support VEX-prefix instructions
+	 * emulation. So don't enable movntdqa in hypervisor guest.
+	 */
+	if (static_cpu_has(X86_FEATURE_XMM4_1) &&
+	    !boot_cpu_has(X86_FEATURE_HYPERVISOR))
+		static_branch_enable(&has_movntdqa);
+}
+#else
+void drm_memcpy_from_wc(struct dma_buf_map *dst,
+			const struct dma_buf_map *src,
+			unsigned long len)
+{
+	WARN_ON(in_interrupt());
+
+	memcpy_fallback(dst, src, len);
+}
+EXPORT_SYMBOL(drm_memcpy_from_wc);
+
+void drm_memcpy_init_early(void)
+{
+}
+#endif /* CONFIG_X86 */
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 3d8d68a98b95..8804ec7d3215 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -35,6 +35,7 @@
 #include <linux/slab.h>
 #include <linux/srcu.h>
 
+#include <drm/drm_cache.h>
 #include <drm/drm_client.h>
 #include <drm/drm_color_mgmt.h>
 #include <drm/drm_drv.h>
@@ -1041,6 +1042,7 @@ static int __init drm_core_init(void)
 
 	drm_connector_ida_init();
 	idr_init(&drm_minors_idr);
+	drm_memcpy_init_early();
 
 	ret = drm_sysfs_init();
 	if (ret < 0) {
diff --git a/include/drm/drm_cache.h b/include/drm/drm_cache.h
index e9ad4863d915..cc9de1632dd3 100644
--- a/include/drm/drm_cache.h
+++ b/include/drm/drm_cache.h
@@ -35,6 +35,8 @@
 
 #include <linux/scatterlist.h>
 
+struct dma_buf_map;
+
 void drm_clflush_pages(struct page *pages[], unsigned long num_pages);
 void drm_clflush_sg(struct sg_table *st);
 void drm_clflush_virt_range(void *addr, unsigned long length);
@@ -70,4 +72,9 @@ static inline bool drm_arch_can_wc_memory(void)
 #endif
 }
 
+void drm_memcpy_init_early(void);
+
+void drm_memcpy_from_wc(struct dma_buf_map *dst,
+			const struct dma_buf_map *src,
+			unsigned long len);
 #endif
-- 
2.31.1
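
For anyone who wants to poke at the iomem-to-iomem branch of memcpy_fallback() outside the kernel, the following is a minimal userspace sketch of the bounce-buffer idea, not the patch's code. It assumes plain memcpy() as a stand-in for memcpy_fromio()/memcpy_toio(), which only exist in the kernel; bounce_copy() and BOUNCE_SIZE are hypothetical names chosen for illustration. Each chunk, including the last partial one, is copied with its actual length, so nothing beyond len bytes is read or written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for MEMCPY_BOUNCE_SIZE in the patch. */
#define BOUNCE_SIZE 128

/*
 * Copy len bytes from src to dst through a small on-stack buffer.
 * Assumption: memcpy() replaces the kernel-only memcpy_fromio() and
 * memcpy_toio(), so this only illustrates the chunking logic.
 */
static void bounce_copy(void *dst, const void *src, size_t len)
{
	unsigned char bounce[BOUNCE_SIZE];
	unsigned char *d = dst;
	const unsigned char *s = src;

	/* A zero-length copy may pass NULL; anything else may not. */
	assert(len == 0 || (dst && src));

	while (len) {
		/* Full chunks first, then whatever remains. */
		size_t chunk = len < BOUNCE_SIZE ? len : BOUNCE_SIZE;

		memcpy(bounce, s, chunk);	/* "from I/O" into the bounce buffer */
		memcpy(d, bounce, chunk);	/* bounce buffer out "to I/O" */
		s += chunk;
		d += chunk;
		len -= chunk;
	}
}
```

Calling bounce_copy(dst, src, 300) would move two full 128-byte chunks and then a 44-byte tail; a destination that is exactly 300 bytes long is enough.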