From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tvrtko Ursulin
Subject: Re: [PATCH v5] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory
Date: Mon, 18 Jul 2016 10:31:05 +0100
Message-ID: <578CA1D9.2020004@linux.intel.com>
References: <1468673589-3304-1-git-send-email-chris@chris-wilson.co.uk> <1468683858-28383-1-git-send-email-chris@chris-wilson.co.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"; Format="flowed"
Content-Transfer-Encoding: 8bit
Return-path:
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by gabe.freedesktop.org (Postfix) with ESMTP id 1AE486E316 for ; Mon, 18 Jul 2016 09:31:07 +0000 (UTC)
In-Reply-To: <1468683858-28383-1-git-send-email-chris@chris-wilson.co.uk>
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx"
To: Chris Wilson , intel-gfx@lists.freedesktop.org
Cc: Akash Goel , Mika Kuoppala
List-Id: intel-gfx@lists.freedesktop.org

On 16/07/16 16:44, Chris Wilson wrote:
> This patch provides the infrastructure for performing a 16-byte aligned
> read from WC memory using non-temporal instructions introduced with sse4.1.
> Using movntdqa we can bypass the CPU caches and read directly from memory,
> ignoring the page attributes set on the CPU PTE i.e. negating the
> impact of an otherwise UC access. Copying using movntdqa from WC is almost
> as fast as reading from WB memory, modulo the possibility of either hitting
> the CPU cache or leaving the data in the CPU cache for the next consumer.
> (The CPU cache itself may be flushed for the region of the movntdqa and on
> later access the movntdqa reads from a separate internal buffer for the
> cacheline.) The write back to the memory is however cached.
>
> This will be used in later patches to accelerate accessing WC memory.
>
> v2: Report whether the accelerated copy is successful/possible.
> v3: Function alignment override was only necessary when using the
> function target("sse4.1") - which is not necessary for emitting movntdqa
> from __asm__.
> v4: Improve notes on CPU cache behaviour vs non-temporal stores.
> v5: Fix byte offsets for unrolled moves.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Akash Goel <akash.goel@intel.com>
> Cc: Damien Lespiau <damien.lespiau@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile      |  3 ++
>   drivers/gpu/drm/i915/i915_drv.c    |  2 +
>   drivers/gpu/drm/i915/i915_drv.h    |  3 ++
>   drivers/gpu/drm/i915/i915_memcpy.c | 75 +++++++++++++++++++++++++++++++++++++++++
>   4 files changed, 83 insertions(+)
>   create mode 100644 drivers/gpu/drm/i915/i915_memcpy.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 75318ebb8d25..a53853daa998 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -3,12 +3,15 @@
>   # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
>
>   subdir-ccflags-$(CONFIG_DRM_I915_WERROR) := -Werror
> +subdir-ccflags-y += \
> +	$(call as-instr,movntdqa (%eax)$(comma)%xmm0,-DCONFIG_AS_MOVNTDQA)
>
>   # Please keep these build lists sorted!
>
>   # core driver code
>   i915-y := i915_drv.o \
>   	  i915_irq.o \
> +	  i915_memcpy.o \
>   	  i915_params.o \
>   	  i915_pci.o \
>               i915_suspend.o \
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index c5b7b8e0678a..16612542496a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -848,6 +848,8 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
>   	mutex_init(&dev_priv->wm.wm_mutex);
>   	mutex_init(&dev_priv->pps_mutex);
>
> +	i915_memcpy_init_early(dev_priv);
> +
>   	ret = i915_workqueues_init(dev_priv);
>   	if (ret < 0)
>   		return ret;
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 27d9b2c374b3..3c266e7866ba 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -4070,4 +4070,7 @@ static inline bool __i915_request_irq_complete(struct drm_i915_gem_request *req)
>   	return false;
>   }
>
> +void i915_memcpy_init_early(struct drm_i915_private *dev_priv);
> +bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len);
> +
>   #endif
> diff --git a/drivers/gpu/drm/i915/i915_memcpy.c b/drivers/gpu/drm/i915/i915_memcpy.c
> new file mode 100644
> index 000000000000..4ed4d3bb2f3e
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/i915_memcpy.c
> @@ -0,0 +1,75 @@
> +#include "i915_drv.h"
> +
> +DEFINE_STATIC_KEY_FALSE(has_movntqa);
> +
> +#ifdef CONFIG_AS_MOVNTDQA
> +static void __movntqda(void *dst, const void *src, unsigned long len)
> +{
> +	len >>= 4;
> +	while (len >= 4) {
> +		__asm__ __volatile__(
> +		"movntdqa (%0), %%xmm0\n"
> +		"movntdqa 16(%0), %%xmm1\n"
> +		"movntdqa 32(%0), %%xmm2\n"
> +		"movntdqa 48(%0), %%xmm3\n"
> +		"movaps %%xmm0, (%1)\n"
> +		"movaps %%xmm1, 16(%1)\n"
> +		"movaps %%xmm2, 32(%1)\n"
> +		"movaps %%xmm3, 48(%1)\n"
> +		: : "r" (src), "r" (dst) : "memory");
> +		src += 64;
> +		dst += 64;
> +		len -= 4;
> +	}
> +	while (len--) {
> +		__asm__ __volatile__(
> +		"movntdqa (%0), %%xmm0\n"
> +		"movaps %%xmm0, (%1)\n"
> +		: : "r" (src), "r" (dst) : "memory");
> +		src += 16;
> +		dst += 16;
> +	}
> +}
> +#endif

Is it okay nowadays to just use these registers in the kernel?

Many years ago, when I last looked into this, using FPU and MMX registers
was discouraged and needed explicit kernel_fpu_begin/end around the
block, since they were not saved/restored by the kernel and doing
otherwise would mess up the userspace context.

Perhaps these new registers are different, or things have generally
changed since then?

Regards,

Tvrtko

> +
> +/**
> + * i915_memcpy_from_wc: perform an accelerated *aligned* read from WC
> + * @dst: destination pointer
> + * @src: source pointer
> + * @len: how many bytes to copy
> + *
> + * i915_memcpy_from_wc copies @len bytes from @src to @dst using
> + * non-temporal instructions where available. Note that all arguments
> + * (@src, @dst) must be aligned to 16 bytes and @len must be a multiple
> + * of 16.
> + *
> + * To test whether accelerated reads from WC are supported, use
> + * i915_memcpy_from_wc(NULL, NULL, 0);
> + *
> + * Returns true if the copy was successful, false if the preconditions
> + * are not met.
> + */
> +bool i915_memcpy_from_wc(void *dst, const void *src, unsigned long len)
> +{
> +	GEM_BUG_ON((unsigned long)dst & 15);
> +	GEM_BUG_ON((unsigned long)src & 15);
> +
> +	if (unlikely(len & 15))
> +		return false;
> +
> +#ifdef CONFIG_AS_MOVNTDQA
> +	if (static_branch_likely(&has_movntqa)) {
> +		if (len)
> +			__movntqda(dst, src, len);
> +		return true;
> +	}
> +#endif
> +
> +	return false;
> +}
> +
> +void i915_memcpy_init_early(struct drm_i915_private *dev_priv)
> +{
> +	if (static_cpu_has(X86_FEATURE_XMM4_1))
> +		static_branch_enable(&has_movntqa);
> +}
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
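
For anyone wanting to experiment with the copy contract outside the kernel, here is a minimal userspace sketch of what i915_memcpy_from_wc in the quoted patch describes: reject unsuitable arguments, use movntdqa when the CPU has SSE4.1, and let the caller fall back otherwise. It is not the patch's implementation. The names memcpy_from_wc_sketch, copy_from_wc_or_fallback and copy_movntdqa are hypothetical, it uses compiler intrinsics instead of inline asm, and it returns false on misaligned pointers where the patch uses GEM_BUG_ON. It assumes GCC or clang; on non-x86 targets the accelerated path is compiled out. In userspace the XMM state belongs to the task, so nothing needs saving around the loop; the in-kernel question about kernel_fpu_begin/end raised above is not addressed by this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: GCC/clang on x86; elsewhere only the fallback is built. */
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <smmintrin.h>
#define HAVE_X86_SSE41_SKETCH 1
#endif

#ifdef HAVE_X86_SSE41_SKETCH
/* Copy len bytes (a multiple of 16) in 16-byte chunks using movntdqa loads. */
__attribute__((target("sse4.1")))
static void copy_movntdqa(void *dst, const void *src, size_t len)
{
	__m128i *d = dst;
	const __m128i *s = src;

	for (len >>= 4; len; len--)
		_mm_store_si128(d++, _mm_stream_load_si128((__m128i *)s++));
}
#endif

/*
 * Hypothetical userspace analogue of the quoted i915_memcpy_from_wc().
 * Returns true if the accelerated copy was performed (or is possible, for
 * len == 0), false if the preconditions are not met or SSE4.1 is missing.
 */
bool memcpy_from_wc_sketch(void *dst, const void *src, size_t len)
{
	/* Both pointers must be 16-byte aligned and len a multiple of 16. */
	if (((uintptr_t)dst | (uintptr_t)src | len) & 15)
		return false;

#ifdef HAVE_X86_SSE41_SKETCH
	if (__builtin_cpu_supports("sse4.1")) {
		if (len)
			copy_movntdqa(dst, src, len);
		return true;
	}
#endif

	return false;
}

/* Caller-side pattern: try the accelerated path, else use plain memcpy. */
void copy_from_wc_or_fallback(void *dst, const void *src, size_t len)
{
	if (!memcpy_from_wc_sketch(dst, src, len))
		memcpy(dst, src, len);
}
```

As in the patch's kerneldoc, memcpy_from_wc_sketch(NULL, NULL, 0) can be used as a probe for whether the accelerated path is available at all.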