From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tvrtko Ursulin Subject: Re: [PATCH] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory Date: Mon, 18 Jul 2016 13:56:52 +0100 Message-ID: <578CD214.8070703@linux.intel.com> References: <20160718100111.GD21839@nuc-i3427.alporthouse.com> <1468836434-29107-1-git-send-email-chris@chris-wilson.co.uk> <578CBA54.40107@linux.intel.com> <20160718113501.GH21839@nuc-i3427.alporthouse.com> <578CC415.202@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Return-path: Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by gabe.freedesktop.org (Postfix) with ESMTP id 2A2096E127 for ; Mon, 18 Jul 2016 12:56:55 +0000 (UTC) In-Reply-To: <578CC415.202@intel.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" To: Dave Gordon , Chris Wilson , intel-gfx@lists.freedesktop.org, Akash Goel , Mika Kuoppala List-Id: intel-gfx@lists.freedesktop.org Ck9uIDE4LzA3LzE2IDEyOjU3LCBEYXZlIEdvcmRvbiB3cm90ZToKPiBPbiAxOC8wNy8xNiAxMjoz NSwgQ2hyaXMgV2lsc29uIHdyb3RlOgo+PiBPbiBNb24sIEp1bCAxOCwgMjAxNiBhdCAxMjoxNToz MlBNICswMTAwLCBUdnJ0a28gVXJzdWxpbiB3cm90ZToKPj4+IEkgYW0gbm90IHN1cmUgYWJvdXQg dGhpcywgYnV0IGxvb2tpbmcgYXQgdGhlIHJhaWQ2IGZvciBleGFtcGxlLCBpdAo+Pj4gaGFzIGEg bG90IG1vcmUgYW5ub3RhdGlvbnMgaW4gY2FzZXMgbGlrZSB0aGlzLgo+Pj4KPj4+IEl0IHNlZW1z IHRvIGJlIHRlbGxpbmcgdGhlIGNvbXBpbGVyIHdoaWNoIG1lbW9yeSByYW5nZXMgZG9lcyBlYWNo Cj4+PiBpbnN0cnVjdGlvbiBhY2Nlc3MsIGFuZCBhbHNvIHVzZXMgImFzbSB2b2xhdGlsZSIgLSB3 aGV0aGVyIG9yIG5vdAo+Pj4gdGhhdCBpcyByZWFsbHkgbmVlZGVkIEkgZG9uJ3Qga25vdy4KPj4+ Cj4+PiBGb3IgZXhhbXBsZToKPj4+ICAgICAgICAgICAgICAgICAgYXNtIHZvbGF0aWxlKCJtb3Zk cWEgJTAsJSV4bW00IiA6OiAibSIgKGRwdHJbejBdW2RdKSk7Cj4+Pgo+Pj4gQW5kOgo+Pj4gICAg ICAgICAgICAgICAgICBhc20gdm9sYXRpbGUoIm1vdmRxYSAlJXhtbTQsJTAiIDogIj1tIiAocVtk XSkpOwo+Pj4KPj4+IEVhY2ggb25lIGlzIHRlbGxpbmcgdGhlIGNvbXBpbGVyIHRoZSBpbnN0cnVj dGlvbiBpcyBlaXRoZXIgcmVhZGluZwo+Pj4gb3Igd3JpdGluZyByZXNwZWN0aXZlbHkgZnJvbSBh IGNlcnRhaW4gbWVtb3J5IGFkZHJlc3MuCj4+Pgo+Pj4gWW91IGRvbid0IGhhdmUgYW55IG9mIHRo YXQsIGFuZCBkb24ndCBldmVuIHNwZWNpZnkgbm90aGluZyBhcyBhbgo+Pj4gb3V0cHV0IHBhcmFt ZXRlciBzbyBJIGFtIG5vdCBzdXJlIGlmIHlvdXIgY29kZSBpcyBzYWZlLgo+Pgo+PiBUaGUgYXNt IGlzIGNvcnJlY3QuIFdlIGRvIG5vdCBtb2RpZnkgZWl0aGVyIG9mIHRoZSB0d28gcG9pbnRlcnMg d2hpY2ggd2UKPj4gcGFzcyBpbiB2aWEgcmVnaXN0ZXIgaW5wdXRzLCBidXQgdGhlIG1lbW9yeSBi ZWhpbmQgdGhlbSAtIGhlbmNlIHRoZSAKPj4gbWVtb3J5Cj4+IGNsb2JiZXIuCj4gCj4gVGhpcyBp cyBhIGNob2ljZSBvZiBob3cgbXVjaCB3ZSBsZXQgdGhlIGNvbXBpbGVyIGRlY2lkZSBhYm91dCAK PiBhZGRyZXNzaW5nLCBhbmQgaG93IG11Y2ggd2UgdGVsbCBpdCBhYm91dCB3aGF0IHRoZSBhc20g Y29kZSByZWFsbHkgZG9lcy4gCj4gVGhlIGV4YW1wbGVzIGFib3ZlIGdldCB0aGUgY29tcGlsZXIg dG8gZ2VuZXJhdGUgKmFueSogc3VpdGFibGUgCj4gYWRkcmVzc2luZyBtb2RlIGZvciBlYWNoIHNw ZWNpZmljIGxvY2F0aW9uIGludm9sdmVkIGluIHRoZSB0cmFuc2ZlcnMsIHNvIAo+IHRoZSBjb21w aWxlciBrbm93cyBhIGxvdCBhYm91dCB3aGF0J3MgaGFwcGVuaW5nIGFuZCBjYW4gdHJhY2sgd2hl cmUgZWFjaCAKPiBkYXR1bSBjb21lcyBmcm9tIGFuZCBnb2VzIHRvLgo+IAo+IE9UT0ggQ2hyaXMn IGNvZGUKPiAKPiArICAgICAgICBhc20oIm1vdm50ZHFhICAgKCUwKSwgJSV4bW0wXG4iCj4gKyAg ICAgICAgICAgICJtb3ZudGRxYSAxNiglMCksICUleG1tMVxuIgo+ICsgICAgICAgICAgICAibW92 bnRkcWEgMzIoJTApLCAlJXhtbTJcbiIKPiArICAgICAgICAgICAgIm1vdm50ZHFhIDQ4KCUwKSwg JSV4bW0zXG4iCj4gKyAgICAgICAgICAgICJtb3ZhcHMgJSV4bW0wLCAgICglMSlcbiIKPiArICAg ICAgICAgICAgIm1vdmFwcyAlJXhtbTEsIDE2KCUxKVxuIgo+ICsgICAgICAgICAgICAibW92YXBz ICUleG1tMiwgMzIoJTEpXG4iCj4gKyAgICAgICAgICAgICJtb3ZhcHMgJSV4bW0zLCA0OCglMSlc biIKPiArICAgICAgICAgICAgOjogInIiIChzcmMpLCAiciIgKGRzdCkgOiAibWVtb3J5Iik7Cj4g Cj4gLSBkb2Vzbid0IG5lZWQgInZvbGF0aWxlIiBiZWNhdXNlIGFzbSBzdGF0ZW1lbnRzIHRoYXQg aGF2ZSBubyBvdXRwdXQgCj4gb3BlcmFuZHMgYXJlIGltcGxpY2l0bHkgdm9sYXRpbGUuCj4gCj4g LSBtYWtlcyB0aGUgY29tcGlsZXIgZ2l2ZSB1cyB0aGUgc291cmNlIGFuZCBkZXN0aW5hdGlvbiAq YWRkcmVzc2VzKiBpbiBhIAo+IHJlZ2lzdGVyIGVhY2g7IGJleW9uZCB0aGF0LCBpdCBkb2Vzbid0 IGtub3cgd2hhdCB3ZSdyZSBkb2luZyB3aXRoIHRoZW0sIAo+IHNvIHRoZSB0aGlyZCAoImNsb2Ji ZXJzIikgcGFyYW1ldGVyIGhhcyB0byBzYXkgIm1lbW9yeSIgaS5lLiB0cmVhdCAqYWxsKiAKPiBt ZW1vcnkgY29udGVudHMgYXMgdW5rbm93biBhZnRlciB0aGlzLgo+IAo+IFtbRnJvbSBHQ0MgZG9j czogVGhlICJtZW1vcnkiIGNsb2JiZXIgdGVsbHMgdGhlIGNvbXBpbGVyIHRoYXQgdGhlIAo+IGFz c2VtYmx5IGNvZGUgcGVyZm9ybXMgbWVtb3J5IHJlYWRzIG9yIHdyaXRlcyB0byBpdGVtcyBvdGhl ciB0aGFuIHRob3NlIAo+IGxpc3RlZCBpbiB0aGUgaW5wdXQgYW5kIG91dHB1dCBvcGVyYW5kcyAo Zm9yIGV4YW1wbGUsIGFjY2Vzc2luZyB0aGUgCj4gbWVtb3J5IHBvaW50ZWQgdG8gYnkgb25lIG9m IHRoZSBpbnB1dCBwYXJhbWV0ZXJzKS4gVG8gZW5zdXJlIG1lbW9yeSAKPiBjb250YWlucyBjb3Jy ZWN0IHZhbHVlcywgR0NDIG1heSBuZWVkIHRvIGZsdXNoIHNwZWNpZmljIHJlZ2lzdGVyIHZhbHVl cyAKPiB0byBtZW1vcnkgYmVmb3JlIGV4ZWN1dGluZyB0aGUgYXNtLiBGdXJ0aGVyLCB0aGUgY29t cGlsZXIgZG9lcyBub3QgCj4gYXNzdW1lIHRoYXQgYW55IHZhbHVlcyByZWFkIGZyb20gbWVtb3J5 IGJlZm9yZSBhbiBhc20gcmVtYWluIHVuY2hhbmdlZCAKPiBhZnRlciB0aGF0IGFzbTsgaXQgcmVs b2FkcyB0aGVtIGFzIG5lZWRlZC4gVXNpbmcgdGhlICJtZW1vcnkiIGNsb2JiZXIgCj4gZWZmZWN0 aXZlbHkgZm9ybXMgYSByZWFkL3dyaXRlIG1lbW9yeSBiYXJyaWVyIGZvciB0aGUgY29tcGlsZXIu XV0KPiAKPiBCVFcsIHNob3VsZCB3ZSBub3QgdGVsbCBpdCB3ZSd2ZSAqYWxzbyogY2xvYmJlcmVk ICV4bW1bMC0zXT8KPiAKPiBTbyB0aGV5J3JlIGJvdGggY29ycmVjdCwganVzdCB0YWtpbmcgZGlm ZmVyZW50IGFwcHJvYWNoZXMuIEkgZG9uJ3Qga25vdyAKPiB3aGljaCB3b3VsZCBnaXZlIHRoZSBi ZXN0IHBlcmZvcm1hbmNlIGZvciB0aGlzIHNwZWNpZmljIGNhc2UuCgpDb29sLCBsZWFybiBzb21l dGhpbmcgbmV3IGV2ZXJ5IGRheS4gOikKCkkndmUgdHJpZWQgd3JpdGluZyBpdCBhczoKCnN0cnVj dCBxdzIgewoJdTY0CXFbMl07Cn0gX19hdHRyaWJ1dGVfXygocGFja2VkKSk7CgpzdGF0aWMgdm9p ZCBfX21lbWNweV9udGRxYShzdHJ1Y3QgcXcyICpkc3QsIGNvbnN0IHN0cnVjdCBxdzIgKnNyYywg dW5zaWduZWQgbG9uZyBsZW4pCnsKCWtlcm5lbF9mcHVfYmVnaW4oKTsKCglsZW4gPj49IDQ7Cgl3 aGlsZSAobGVuID49IDQpIHsKCQlhc20oIm1vdm50ZHFhICAgKCUwKSwgJSV4bW0wIiA6OiAiciIg KHNyYyksICJtIiAoc3JjWzBdKSk7CgkJYXNtKCJtb3ZudGRxYSAxNiglMCksICUleG1tMSIgOjog InIiIChzcmMpLCAibSIgKHNyY1sxXSkpOwoJCWFzbSgibW92bnRkcWEgMzIoJTApLCAlJXhtbTIi IDo6ICJyIiAoc3JjKSwgIm0iIChzcmNbMl0pKTsKCQlhc20oIm1vdm50ZHFhIDQ4KCUwKSwgJSV4 bW0zIiA6OiAiciIgKHNyYyksICJtIiAoc3JjWzNdKSk7CgkJYXNtKCJtb3ZhcHMgJSV4bW0wLCAg ICglMSkiIDogIj1tIiAoZHN0WzBdKSA6ICJyIiAoZHN0KSk7CgkJYXNtKCJtb3ZhcHMgJSV4bW0x LCAxNiglMSkiIDogIj1tIiAoZHN0WzFdKSA6ICJyIiAoZHN0KSk7CgkJYXNtKCJtb3ZhcHMgJSV4 bW0yLCAzMiglMSkiIDogIj1tIiAoZHN0WzJdKSA6ICJyIiAoZHN0KSk7CgkJYXNtKCJtb3ZhcHMg JSV4bW0zLCA0OCglMSkiIDogIj1tIiAoZHN0WzNdKSA6ICJyIiAoZHN0KSk7CgkJc3JjICs9IDQ7 CgkJZHN0ICs9IDQ7CgkJbGVuIC09IDQ7Cgl9Cgl3aGlsZSAobGVuLS0pIHsKCQlhc20oIm1vdm50 ZHFhICglMCksICUleG1tMCIgOjogInIiIChzcmMpLCAibSIgKHNyY1swXSkpOwoJCWFzbSgibW92 YXBzICUleG1tMCwgKCUxKSIgOiAiPW0iIChkc3RbMF0pIDogInIiIChkc3QpKTsKCQlzcmMrKzsK CQlkc3QrKzsKCX0KCglrZXJuZWxfZnB1X2VuZCgpOwp9CgpUaGF0IGFwcGVhcnMgdG8gYWxsb3cg R0NDIHRvIGludGVybGVhdmUgU1NFIGFuZCBub3JtYWwgaW5zdHJ1Y3Rpb25zLApwcmVzdW1hYmx5 IHRoYXQgbWVhbnMgaXQgaXMgdHJ5aW5nIHRvIHV0aWxpemUgdGhlIGV4ZWN1dGlvbiB1bml0cyBi ZXR0ZXI/CgpJIHdvbmRlciBpZiBpdCBtYWtlcyBhIGRpZmZlcmVuY2UgaW4gc3BlZWQ/CgoKT2xk IGNvZGUgbWFpbiBsb29wIGFzc2VtYmx5IGxvb2tzIGxpa2U6CgogIDU4OiAgIDY2IDBmIDM4IDJh IDAwICAgICAgICAgIG1vdm50ZHFhICglcmF4KSwleG1tMAogIDVkOiAgIDY2IDBmIDM4IDJhIDQ4 IDEwICAgICAgIG1vdm50ZHFhIDB4MTAoJXJheCksJXhtbTEKICA2MzogICA2NiAwZiAzOCAyYSA1 MCAyMCAgICAgICBtb3ZudGRxYSAweDIwKCVyYXgpLCV4bW0yCiAgNjk6ICAgNjYgMGYgMzggMmEg NTggMzAgICAgICAgbW92bnRkcWEgMHgzMCglcmF4KSwleG1tMwogIDZmOiAgIDBmIDI5IDAxICAg ICAgICAgICAgICAgIG1vdmFwcyAleG1tMCwoJXJjeCkKICA3MjogICAwZiAyOSA0OSAxMCAgICAg ICAgICAgICBtb3ZhcHMgJXhtbTEsMHgxMCglcmN4KQogIDc2OiAgIDBmIDI5IDUxIDIwICAgICAg ICAgICAgIG1vdmFwcyAleG1tMiwweDIwKCVyY3gpCiAgN2E6ICAgMGYgMjkgNTkgMzAgICAgICAg ICAgICAgbW92YXBzICV4bW0zLDB4MzAoJXJjeCkKICA3ZTogICA0OSA4MyBlOCAwNCAgICAgICAg ICAgICBzdWIgICAgJDB4NCwlcjgKICA4MjogICA0OCA4MyBjMCA0MCAgICAgICAgICAgICBhZGQg ICAgJDB4NDAsJXJheAogIDg2OiAgIDQ4IDgzIGMxIDQwICAgICAgICAgICAgIGFkZCAgICAkMHg0 MCwlcmN4CiAgOGE6ICAgNDkgODMgZjggMDMgICAgICAgICAgICAgY21wICAgICQweDMsJXI4CiAg OGU6ICAgNzcgYzggICAgICAgICAgICAgICAgICAgamEgICAgIDU4IDxpOTE1X21lbWNweV9mcm9t X3djKzB4NTg+CgpXaGlsZSB0aGUgYWJvdmUgdmVyc2lvbiBnZW5lcmF0ZXM6CgogIDU4OiAgIDY2 IDBmIDM4IDJhIDAwICAgICAgICAgIG1vdm50ZHFhICglcmF4KSwleG1tMAogIDVkOiAgIDY2IDBm IDM4IDJhIDQ4IDEwICAgICAgIG1vdm50ZHFhIDB4MTAoJXJheCksJXhtbTEKICA2MzogICA2NiAw ZiAzOCAyYSA1MCAyMCAgICAgICBtb3ZudGRxYSAweDIwKCVyYXgpLCV4bW0yCiAgNjk6ICAgNjYg MGYgMzggMmEgNTggMzAgICAgICAgbW92bnRkcWEgMHgzMCglcmF4KSwleG1tMwogIDZmOiAgIDQ5 IDgzIGU4IDA0ICAgICAgICAgICAgIHN1YiAgICAkMHg0LCVyOAogIDczOiAgIDQ4IDgzIGMwIDQw ICAgICAgICAgICAgIGFkZCAgICAkMHg0MCwlcmF4CiAgNzc6ICAgMGYgMjkgMDEgICAgICAgICAg ICAgICAgbW92YXBzICV4bW0wLCglcmN4KQogIDdhOiAgIDBmIDI5IDQ5IDEwICAgICAgICAgICAg IG1vdmFwcyAleG1tMSwweDEwKCVyY3gpCiAgN2U6ICAgMGYgMjkgNTEgMjAgICAgICAgICAgICAg bW92YXBzICV4bW0yLDB4MjAoJXJjeCkKICA4MjogICAwZiAyOSA1OSAzMCAgICAgICAgICAgICBt b3ZhcHMgJXhtbTMsMHgzMCglcmN4KQogIDg2OiAgIDQ4IDgzIGMxIDQwICAgICAgICAgICAgIGFk ZCAgICAkMHg0MCwlcmN4CiAgOGE6ICAgNDkgODMgZjggMDMgICAgICAgICAgICAgY21wICAgICQw eDMsJXI4CiAgOGU6ICAgNzcgYzggICAgICAgICAgICAgICAgICAgamEgICAgIDU4IDxpOTE1X21l bWNweV9mcm9tX3djKzB4NTg+CgpJbnRlcmVzdGluZ2x5LCBpbiBib3RoIGNhc2VzIEdDQyBkb2Vz IHNvbWUgaW4gbXkgbWluZCBmdXRpbGUKc2h1ZmZsaW5nIGFyb3VuZyBiZXR3ZWVuIHRoZSB0d28g bG9vcHMuIEluc3RlYWQgb2YganVzdApjYXJyeWluZyBvbiB3aXRoIHNyYyBhbmQgZHN0IGFuZCBs ZW4gaG93IHRoZXkgYXJlLCBpdCBnb2VzCnRvIHVzZSBhIGRpZmZlcmVudCByZWdpc3RlciBzZXQg Zm9yIHRoZSBzZWNvbmQgbG9vcDoKClNvIHRoaXMgcmVzaHVmZmxpbmc6CgogIDkwOiAgIDQ4IDhk IDQyIGZjICAgICAgICAgICAgIGxlYSAgICAtMHg0KCVyZHgpLCVyYXgKICA5NDogICA4MyBlMiAw MyAgICAgICAgICAgICAgICBhbmQgICAgJDB4MywlZWR4CiAgOTc6ICAgNDggYzEgZTggMDIgICAg ICAgICAgICAgc2hyICAgICQweDIsJXJheAogIDliOiAgIDQ4IDgzIGMwIDAxICAgICAgICAgICAg IGFkZCAgICAkMHgxLCVyYXgKICA5ZjogICA0OCBjMSBlMCAwNiAgICAgICAgICAgICBzaGwgICAg JDB4NiwlcmF4CiAgYTM6ICAgNDggMDEgYzYgICAgICAgICAgICAgICAgYWRkICAgICVyYXgsJXJz aQogIGE2OiAgIDQ4IDAxIGM3ICAgICAgICAgICAgICAgIGFkZCAgICAlcmF4LCVyZGkKICBhOTog ICA0OCA4ZCA0MiBmZiAgICAgICAgICAgICBsZWEgICAgLTB4MSglcmR4KSwlcmF4CiAgYWQ6ICAg NDggODUgZDIgICAgICAgICAgICAgICAgdGVzdCAgICVyZHgsJXJkeAogIGIwOiAgIDc0IDFhICAg ICAgICAgICAgICAgICAgIGplICAgICBjYyA8aTkxNV9tZW1jcHlfZnJvbV93YysweGNjPgoKQW5k IHRoZW4gdGhlIHNlY29uZCBsb29wOgoKICBiMjogICA2NiAwZiAzOCAyYSAwNiAgICAgICAgICBt b3ZudGRxYSAoJXJzaSksJXhtbTAKICBiNzogICA0OCA4MyBlOCAwMSAgICAgICAgICAgICBzdWIg ICAgJDB4MSwlcmF4CiAgYmI6ICAgNDggODMgYzYgMTAgICAgICAgICAgICAgYWRkICAgICQweDEw LCVyc2kKICBiZjogICAwZiAyOSAwNyAgICAgICAgICAgICAgICBtb3ZhcHMgJXhtbTAsKCVyZGkp CiAgYzI6ICAgNDggODMgYzcgMTAgICAgICAgICAgICAgYWRkICAgICQweDEwLCVyZGkKICBjNjog ICA0OCA4MyBmOCBmZiAgICAgICAgICAgICBjbXAgICAgJDB4ZmZmZmZmZmZmZmZmZmZmZiwlcmF4 CiAgY2E6ICAgNzUgZTYgICAgICAgICAgICAgICAgICAgam5lICAgIGIyIDxpOTE1X21lbWNweV9m cm9tX3djKzB4YjI+CgpBbnkgdGhvdWdodHMgb24gdGhpcz8KClJlZ2FyZHMsCgpUdnJ0a28KX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18KSW50ZWwtZ2Z4IG1h aWxpbmcgbGlzdApJbnRlbC1nZnhAbGlzdHMuZnJlZWRlc2t0b3Aub3JnCmh0dHBzOi8vbGlzdHMu ZnJlZWRlc2t0b3Aub3JnL21haWxtYW4vbGlzdGluZm8vaW50ZWwtZ2Z4Cg==