From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tvrtko Ursulin Subject: Re: [PATCH 18/20] drm/i915: Use SSE4.1 movntdqa to accelerate reads from WC memory Date: Fri, 12 Aug 2016 11:54:04 +0100 Message-ID: <57ADAACC.2030405@linux.intel.com> References: <1470983123-22127-1-git-send-email-akash.goel@intel.com> <1470983123-22127-19-git-send-email-akash.goel@intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8"; Format="flowed" Content-Transfer-Encoding: base64 Return-path: Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTP id 12CBC6EB18 for ; Fri, 12 Aug 2016 10:54:33 +0000 (UTC) In-Reply-To: <1470983123-22127-19-git-send-email-akash.goel@intel.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" To: akash.goel@intel.com, intel-gfx@lists.freedesktop.org Cc: Mika Kuoppala List-Id: intel-gfx@lists.freedesktop.org Ck9uIDEyLzA4LzE2IDA3OjI1LCBha2FzaC5nb2VsQGludGVsLmNvbSB3cm90ZToKPiBGcm9tOiBD aHJpcyBXaWxzb24gPGNocmlzQGNocmlzLXdpbHNvbi5jby51az4KPgo+IFRoaXMgcGF0Y2ggcHJv dmlkZXMgdGhlIGluZnJhc3RydWN0dXJlIGZvciBwZXJmb3JtaW5nIGEgMTYtYnl0ZSBhbGlnbmVk Cj4gcmVhZCBmcm9tIFdDIG1lbW9yeSB1c2luZyBub24tdGVtcG9yYWwgaW5zdHJ1Y3Rpb25zIGlu dHJvZHVjZWQgd2l0aCBzc2U0LjEuCj4gVXNpbmcgbW92bnRkcWEgd2UgY2FuIGJ5cGFzcyB0aGUg Q1BVIGNhY2hlcyBhbmQgcmVhZCBkaXJlY3RseSBmcm9tIG1lbW9yeQo+IGFuZCBpZ25vcmluZyB0 aGUgcGFnZSBhdHRyaWJ1dGVzIHNldCBvbiB0aGUgQ1BVIFBURSBpLmUuIG5lZ2F0aW5nIHRoZQo+ IGltcGFjdCBvZiBhbiBvdGhlcndpc2UgVUMgYWNjZXNzLiBDb3B5aW5nIHVzaW5nIG1vdm50cWRh IGZyb20gV0MgaXMgYWxtb3N0Cj4gYXMgZmFzdCBhcyByZWFkaW5nIGZyb20gV0IgbWVtb3J5LCBt b2R1bG8gdGhlIHBvc3NpYmlsaXR5IG9mIGJvdGggaGl0dGluZwo+IHRoZSBDUFUgY2FjaGUgb3Ig bGVhdmluZyB0aGUgZGF0YSBpbiB0aGUgQ1BVIGNhY2hlIGZvciB0aGUgbmV4dCBjb25zdW1lci4K PiAoVGhlIENQVSBjYWNoZSBpdHNlbGYgbXkgYmUgZmx1c2hlZCBmb3IgdGhlIHJlZ2lvbiBvZiB0 aGUgbW92bnRkcWEgYW5kIG9uCj4gbGF0ZXIgYWNjZXNzIHRoZSBtb3ZudGRxYSByZWFkcyBmcm9t IGEgc2VwYXJhdGUgaW50ZXJuYWwgYnVmZmVyIGZvciB0aGUKPiBjYWNoZWxpbmUuKSBUaGUgd3Jp dGUgYmFjayB0byB0aGUgbWVtb3J5IGlzIGhvd2V2ZXIgY2FjaGVkLgo+Cj4gVGhpcyB3aWxsIGJl IHVzZWQgaW4gbGF0ZXIgcGF0Y2hlcyB0byBhY2NlbGVyYXRlIGFjY2Vzc2luZyBXQyBtZW1vcnku Cj4KPiB2MjogUmVwb3J0IHdoZXRoZXIgdGhlIGFjY2VsZXJhdGVkIGNvcHkgaXMgc3VjY2Vzc2Z1 bC9wb3NzaWJsZS4KPiB2MzogRnVuY3Rpb24gYWxpZ25tZW50IG92ZXJyaWRlIHdhcyBvbmx5IG5l Y2Vzc2FyeSB3aGVuIHVzaW5nIHRoZQo+IGZ1bmN0aW9uIHRhcmdldCgic3NlNC4xIikgLSB3aGlj aCBpcyBub3QgbmVjZXNzYXJ5IGZvciBlbWl0dGluZyBtb3ZudGRxYQo+IGZyb20gX19hc21fXy4K PiB2NDogSW1wcm92ZSBub3RlcyBvbiBDUFUgY2FjaGUgYmVoYXZpb3VyIHZzIG5vbi10ZW1wb3Jh bCBzdG9yZXMuCj4gdjU6IEZpeCBieXRlIG9mZnNldHMgZm9yIHVucm9sbGVkIG1vdmVzLgo+IHY2 OiBGaW5kIGFsbCByZW1haW5pbmcgdHlwb3Mgb2YgbW92bnRxZGEsIHVzZSBrZXJuZWxfZnB1X2Jl Z2luLgo+Cj4gU2lnbmVkLW9mZi1ieTogQ2hyaXMgV2lsc29uIDxjaHJpc0BjaHJpcy13aWxzb24u Y28udWs+Cj4gQ2M6IEFrYXNoIEdvZWwgPGFrYXNoLmdvZWxAaW50ZWwuY29tPgo+IENjOiBEYW1p ZW4gTGVzcGlhdSA8ZGFtaWVuLmxlc3BpYXVAaW50ZWwuY29tPgo+IENjOiBNaWthIEt1b3BwYWxh IDxtaWthLmt1b3BwYWxhQGludGVsLmNvbT4KPiBDYzogVHZydGtvIFVyc3VsaW4gPHR2cnRrby51 cnN1bGluQGludGVsLmNvbT4KPiAtLS0KPiAgIGRyaXZlcnMvZ3B1L2RybS9pOTE1L01ha2VmaWxl ICAgICAgfCAgIDMgKysKPiAgIGRyaXZlcnMvZ3B1L2RybS9pOTE1L2k5MTVfZHJ2LmMgICAgfCAg IDIgKwo+ICAgZHJpdmVycy9ncHUvZHJtL2k5MTUvaTkxNV9kcnYuaCAgICB8ICAgMyArKwo+ICAg ZHJpdmVycy9ncHUvZHJtL2k5MTUvaTkxNV9tZW1jcHkuYyB8IDEwMSArKysrKysrKysrKysrKysr KysrKysrKysrKysrKysrKysrKysrCj4gICA0IGZpbGVzIGNoYW5nZWQsIDEwOSBpbnNlcnRpb25z KCspCj4gICBjcmVhdGUgbW9kZSAxMDA2NDQgZHJpdmVycy9ncHUvZHJtL2k5MTUvaTkxNV9tZW1j cHkuYwo+Cj4gZGlmZiAtLWdpdCBhL2RyaXZlcnMvZ3B1L2RybS9pOTE1L01ha2VmaWxlIGIvZHJp dmVycy9ncHUvZHJtL2k5MTUvTWFrZWZpbGUKPiBpbmRleCBkZGE3MjRmLi4zNDEyNDEzIDEwMDY0 NAo+IC0tLSBhL2RyaXZlcnMvZ3B1L2RybS9pOTE1L01ha2VmaWxlCj4gKysrIGIvZHJpdmVycy9n cHUvZHJtL2k5MTUvTWFrZWZpbGUKPiBAQCAtMywxMiArMywxNSBAQAo+ICAgIyBEaXJlY3QgUmVu ZGVyaW5nIEluZnJhc3RydWN0dXJlIChEUkkpIGluIFhGcmVlODYgNC4xLjAgYW5kIGhpZ2hlci4K Pgo+ICAgc3ViZGlyLWNjZmxhZ3MtJChDT05GSUdfRFJNX0k5MTVfV0VSUk9SKSA6PSAtV2Vycm9y Cj4gK3N1YmRpci1jY2ZsYWdzLXkgKz0gXAo+ICsJJChjYWxsIGFzLWluc3RyLG1vdm50ZHFhICgl ZWF4KSQoY29tbWEpJXhtbTAsLURDT05GSUdfQVNfTU9WTlREUUEpCj4KPiAgICMgUGxlYXNlIGtl ZXAgdGhlc2UgYnVpbGQgbGlzdHMgc29ydGVkIQo+Cj4gICAjIGNvcmUgZHJpdmVyIGNvZGUKPiAg IGk5MTUteSA6PSBpOTE1X2Rydi5vIFwKPiAgIAkgIGk5MTVfaXJxLm8gXAo+ICsJICBpOTE1X21l bWNweS5vIFwKPiAgIAkgIGk5MTVfcGFyYW1zLm8gXAo+ICAgCSAgaTkxNV9wY2kubyBcCj4gICAg ICAgICAgICAgaTkxNV9zdXNwZW5kLm8gXAo+IGRpZmYgLS1naXQgYS9kcml2ZXJzL2dwdS9kcm0v aTkxNS9pOTE1X2Rydi5jIGIvZHJpdmVycy9ncHUvZHJtL2k5MTUvaTkxNV9kcnYuYwo+IGluZGV4 IGNiOGM5NDMuLjRiYmYwYWYgMTAwNjQ0Cj4gLS0tIGEvZHJpdmVycy9ncHUvZHJtL2k5MTUvaTkx NV9kcnYuYwo+ICsrKyBiL2RyaXZlcnMvZ3B1L2RybS9pOTE1L2k5MTVfZHJ2LmMKPiBAQCAtODQx LDYgKzg0MSw4IEBAIHN0YXRpYyBpbnQgaTkxNV9kcml2ZXJfaW5pdF9lYXJseShzdHJ1Y3QgZHJt X2k5MTVfcHJpdmF0ZSAqZGV2X3ByaXYsCj4gICAJbXV0ZXhfaW5pdCgmZGV2X3ByaXYtPndtLndt X211dGV4KTsKPiAgIAltdXRleF9pbml0KCZkZXZfcHJpdi0+cHBzX211dGV4KTsKPgo+ICsJaTkx NV9tZW1jcHlfaW5pdF9lYXJseShkZXZfcHJpdik7Cj4gKwo+ICAgCXJldCA9IGk5MTVfd29ya3F1 ZXVlc19pbml0KGRldl9wcml2KTsKPiAgIAlpZiAocmV0IDwgMCkKPiAgIAkJcmV0dXJuIHJldDsK PiBkaWZmIC0tZ2l0IGEvZHJpdmVycy9ncHUvZHJtL2k5MTUvaTkxNV9kcnYuaCBiL2RyaXZlcnMv Z3B1L2RybS9pOTE1L2k5MTVfZHJ2LmgKPiBpbmRleCA2NjAzODEyLi5mY2EwOWVhIDEwMDY0NAo+ IC0tLSBhL2RyaXZlcnMvZ3B1L2RybS9pOTE1L2k5MTVfZHJ2LmgKPiArKysgYi9kcml2ZXJzL2dw dS9kcm0vaTkxNS9pOTE1X2Rydi5oCj4gQEAgLTM5MDksNCArMzkwOSw3IEBAIHN0YXRpYyBpbmxp bmUgYm9vbCBfX2k5MTVfcmVxdWVzdF9pcnFfY29tcGxldGUoc3RydWN0IGRybV9pOTE1X2dlbV9y ZXF1ZXN0ICpyZXEpCj4gICAJcmV0dXJuIGZhbHNlOwo+ICAgfQo+Cj4gK3ZvaWQgaTkxNV9tZW1j cHlfaW5pdF9lYXJseShzdHJ1Y3QgZHJtX2k5MTVfcHJpdmF0ZSAqZGV2X3ByaXYpOwo+ICtib29s IGk5MTVfbWVtY3B5X2Zyb21fd2Modm9pZCAqZHN0LCBjb25zdCB2b2lkICpzcmMsIHVuc2lnbmVk IGxvbmcgbGVuKTsKPiArCj4gICAjZW5kaWYKPiBkaWZmIC0tZ2l0IGEvZHJpdmVycy9ncHUvZHJt L2k5MTUvaTkxNV9tZW1jcHkuYyBiL2RyaXZlcnMvZ3B1L2RybS9pOTE1L2k5MTVfbWVtY3B5LmMK PiBuZXcgZmlsZSBtb2RlIDEwMDY0NAo+IGluZGV4IDAwMDAwMDAuLjUwZmM1NzkKPiAtLS0gL2Rl di9udWxsCj4gKysrIGIvZHJpdmVycy9ncHUvZHJtL2k5MTUvaTkxNV9tZW1jcHkuYwo+IEBAIC0w LDAgKzEsMTAxIEBACj4gKy8qCj4gKyAqIENvcHlyaWdodCDCqSAyMDE2IEludGVsIENvcnBvcmF0 aW9uCj4gKyAqCj4gKyAqIFBlcm1pc3Npb24gaXMgaGVyZWJ5IGdyYW50ZWQsIGZyZWUgb2YgY2hh cmdlLCB0byBhbnkgcGVyc29uIG9idGFpbmluZyBhCj4gKyAqIGNvcHkgb2YgdGhpcyBzb2Z0d2Fy ZSBhbmQgYXNzb2NpYXRlZCBkb2N1bWVudGF0aW9uIGZpbGVzICh0aGUgIlNvZnR3YXJlIiksCj4g KyAqIHRvIGRlYWwgaW4gdGhlIFNvZnR3YXJlIHdpdGhvdXQgcmVzdHJpY3Rpb24sIGluY2x1ZGlu ZyB3aXRob3V0IGxpbWl0YXRpb24KPiArICogdGhlIHJpZ2h0cyB0byB1c2UsIGNvcHksIG1vZGlm eSwgbWVyZ2UsIHB1Ymxpc2gsIGRpc3RyaWJ1dGUsIHN1YmxpY2Vuc2UsCj4gKyAqIGFuZC9vciBz ZWxsIGNvcGllcyBvZiB0aGUgU29mdHdhcmUsIGFuZCB0byBwZXJtaXQgcGVyc29ucyB0byB3aG9t IHRoZQo+ICsgKiBTb2Z0d2FyZSBpcyBmdXJuaXNoZWQgdG8gZG8gc28sIHN1YmplY3QgdG8gdGhl IGZvbGxvd2luZyBjb25kaXRpb25zOgo+ICsgKgo+ICsgKiBUaGUgYWJvdmUgY29weXJpZ2h0IG5v dGljZSBhbmQgdGhpcyBwZXJtaXNzaW9uIG5vdGljZSAoaW5jbHVkaW5nIHRoZSBuZXh0Cj4gKyAq IHBhcmFncmFwaCkgc2hhbGwgYmUgaW5jbHVkZWQgaW4gYWxsIGNvcGllcyBvciBzdWJzdGFudGlh bCBwb3J0aW9ucyBvZiB0aGUKPiArICogU29mdHdhcmUuCj4gKyAqCj4gKyAqIFRIRSBTT0ZUV0FS RSBJUyBQUk9WSURFRCAiQVMgSVMiLCBXSVRIT1VUIFdBUlJBTlRZIE9GIEFOWSBLSU5ELCBFWFBS RVNTIE9SCj4gKyAqIElNUExJRUQsIElOQ0xVRElORyBCVVQgTk9UIExJTUlURUQgVE8gVEhFIFdB UlJBTlRJRVMgT0YgTUVSQ0hBTlRBQklMSVRZLAo+ICsgKiBGSVRORVNTIEZPUiBBIFBBUlRJQ1VM QVIgUFVSUE9TRSBBTkQgTk9OSU5GUklOR0VNRU5ULiAgSU4gTk8gRVZFTlQgU0hBTEwKPiArICog VEhFIEFVVEhPUlMgT1IgQ09QWVJJR0hUIEhPTERFUlMgQkUgTElBQkxFIEZPUiBBTlkgQ0xBSU0s IERBTUFHRVMgT1IgT1RIRVIKPiArICogTElBQklMSVRZLCBXSEVUSEVSIElOIEFOIEFDVElPTiBP RiBDT05UUkFDVCwgVE9SVCBPUiBPVEhFUldJU0UsIEFSSVNJTkcKPiArICogRlJPTSwgT1VUIE9G IE9SIElOIENPTk5FQ1RJT04gV0lUSCBUSEUgU09GVFdBUkUgT1IgVEhFIFVTRSBPUiBPVEhFUiBE RUFMSU5HUwo+ICsgKiBJTiBUSEUgU09GVFdBUkUuCj4gKyAqCj4gKyAqLwo+ICsKPiArI2luY2x1 ZGUgPGxpbnV4L2tlcm5lbC5oPgo+ICsjaW5jbHVkZSA8YXNtL2ZwdS9hcGkuaD4KPiArCj4gKyNp bmNsdWRlICJpOTE1X2Rydi5oIgo+ICsKPiArREVGSU5FX1NUQVRJQ19LRVlfRkFMU0UoaGFzX21v dm50ZHFhKTsKPiArCj4gKyNpZmRlZiBDT05GSUdfQVNfTU9WTlREUUEKPiArc3RhdGljIHZvaWQg X19tZW1jcHlfbnRkcWEodm9pZCAqZHN0LCBjb25zdCB2b2lkICpzcmMsIHVuc2lnbmVkIGxvbmcg bGVuKQo+ICt7Cj4gKwlrZXJuZWxfZnB1X2JlZ2luKCk7Cj4gKwo+ICsJbGVuID4+PSA0Owo+ICsJ d2hpbGUgKGxlbiA+PSA0KSB7Cj4gKwkJYXNtKCJtb3ZudGRxYSAgICglMCksICUleG1tMFxuIgo+ ICsJCSAgICAibW92bnRkcWEgMTYoJTApLCAlJXhtbTFcbiIKPiArCQkgICAgIm1vdm50ZHFhIDMy KCUwKSwgJSV4bW0yXG4iCj4gKwkJICAgICJtb3ZudGRxYSA0OCglMCksICUleG1tM1xuIgo+ICsJ CSAgICAibW92YXBzICUleG1tMCwgICAoJTEpXG4iCj4gKwkJICAgICJtb3ZhcHMgJSV4bW0xLCAx NiglMSlcbiIKPiArCQkgICAgIm1vdmFwcyAlJXhtbTIsIDMyKCUxKVxuIgo+ICsJCSAgICAibW92 YXBzICUleG1tMywgNDgoJTEpXG4iCj4gKwkJICAgIDo6ICJyIiAoc3JjKSwgInIiIChkc3QpIDog Im1lbW9yeSIpOwo+ICsJCXNyYyArPSA2NDsKPiArCQlkc3QgKz0gNjQ7Cj4gKwkJbGVuIC09IDQ7 Cj4gKwl9Cj4gKwl3aGlsZSAobGVuLS0pIHsKPiArCQlhc20oIm1vdm50ZHFhICglMCksICUleG1t MFxuIgo+ICsJCSAgICAibW92YXBzICUleG1tMCwgKCUxKVxuIgo+ICsJCSAgICA6OiAiciIgKHNy YyksICJyIiAoZHN0KSA6ICJtZW1vcnkiKTsKPiArCQlzcmMgKz0gMTY7Cj4gKwkJZHN0ICs9IDE2 Owo+ICsJfQo+ICsKPiArCWtlcm5lbF9mcHVfZW5kKCk7Cj4gK30KPiArI2VuZGlmCj4gKwo+ICsv KioKPiArICogaTkxNV9tZW1jcHlfZnJvbV93YzogcGVyZm9ybSBhbiBhY2NlbGVyYXRlZCAqYWxp Z25lZCogcmVhZCBmcm9tIFdDCj4gKyAqIEBkc3Q6IGRlc3RpbmF0aW9uIHBvaW50ZXIKPiArICog QHNyYzogc291cmNlIHBvaW50ZXIKPiArICogQGxlbjogaG93IG1hbnkgYnl0ZXMgdG8gY29weQo+ ICsgKgo+ICsgKiBpOTE1X21lbWNweV9mcm9tX3djIGNvcGllcyBAbGVuIGJ5dGVzIGZyb20gQHNy YyB0byBAZHN0IHVzaW5nCj4gKyAqIG5vbi10ZW1wb3JhbCBpbnN0cnVjdGlvbnMgd2hlcmUgYXZh aWxhYmxlLiBOb3RlIHRoYXQgYWxsIGFyZ3VtZW50cwo+ICsgKiAoQHNyYywgQGRzdCkgbXVzdCBi ZSBhbGlnbmVkIHRvIDE2IGJ5dGVzIGFuZCBAbGVuIG11c3QgYmUgYSBtdWx0aXBsZQo+ICsgKiBv ZiAxNi4KPiArICoKPiArICogVG8gdGVzdCB3aGV0aGVyIGFjY2VsZXJhdGVkIHJlYWRzIGZyb20g V0MgYXJlIHN1cHBvcnRlZCwgdXNlCj4gKyAqIGk5MTVfbWVtY3B5X2Zyb21fd2MoTlVMTCwgTlVM TCwgMCk7Cj4gKyAqCj4gKyAqIFJldHVybnMgdHJ1ZSBpZiB0aGUgY29weSB3YXMgc3VjY2Vzc2Z1 bCwgZmFsc2UgaWYgdGhlIHByZWNvbmRpdGlvbnMKPiArICogYXJlIG5vdCBtZXQuCj4gKyAqLwo+ ICtib29sIGk5MTVfbWVtY3B5X2Zyb21fd2Modm9pZCAqZHN0LCBjb25zdCB2b2lkICpzcmMsIHVu c2lnbmVkIGxvbmcgbGVuKQo+ICt7Cj4gKwlpZiAodW5saWtlbHkoKCh1bnNpZ25lZCBsb25nKWRz dCB8ICh1bnNpZ25lZCBsb25nKXNyYyB8IGxlbikgJiAxNSkpCj4gKwkJcmV0dXJuIGZhbHNlOwo+ ICsKPiArI2lmZGVmIENPTkZJR19BU19NT1ZOVERRQQo+ICsJaWYgKHN0YXRpY19icmFuY2hfbGlr ZWx5KCZoYXNfbW92bnRkcWEpKSB7Cj4gKwkJaWYgKGxlbikKClBvdGVudGlhbGx5IGNvdWxkIGFu bm90YXRlIHRoaXMgd2l0aCBhbm90aGVyIGxpa2VseS4KCj4gKwkJCV9fbWVtY3B5X250ZHFhKGRz dCwgc3JjLCBsZW4pOwo+ICsJCXJldHVybiB0cnVlOwo+ICsJfQo+ICsjZW5kaWYKPiArCj4gKwly ZXR1cm4gZmFsc2U7Cj4gK30KPiArCj4gK3ZvaWQgaTkxNV9tZW1jcHlfaW5pdF9lYXJseShzdHJ1 Y3QgZHJtX2k5MTVfcHJpdmF0ZSAqZGV2X3ByaXYpCj4gK3sKPiArCWlmIChzdGF0aWNfY3B1X2hh cyhYODZfRkVBVFVSRV9YTU00XzEpKQo+ICsJCXN0YXRpY19icmFuY2hfZW5hYmxlKCZoYXNfbW92 bnRkcWEpOwo+ICt9Cj4KClJldmlld2VkLWJ5OiBUdnJ0a28gVXJzdWxpbiA8dHZydGtvLnVyc3Vs aW5AaW50ZWwuY29tPgoKUmVnYXJkcywKClR2cnRrbwpfX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fXwpJbnRlbC1nZnggbWFpbGluZyBsaXN0CkludGVsLWdmeEBs aXN0cy5mcmVlZGVza3RvcC5vcmcKaHR0cHM6Ly9saXN0cy5mcmVlZGVza3RvcC5vcmcvbWFpbG1h bi9saXN0aW5mby9pbnRlbC1nZngK