From mboxrd@z Thu Jan 1 00:00:00 1970 From: "H. Peter Anvin" Subject: Re: [PATCH] x86: Add an explicit barrier() to clflushopt() Date: Thu, 7 Jan 2016 14:29:58 -0800 Message-ID: <568EE6E6.4000904@zytor.com> References: <1445248735-11915-1-git-send-email-chris@chris-wilson.co.uk> <20160107101652.GF652@nuc-i3427.alporthouse.com> <20160107194413.GA25144@nuc-i3427.alporthouse.com> <568ED31F.1090004@zytor.com> <20160107215401.GB25144@nuc-i3427.alporthouse.com> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Return-path: Received: from mail.zytor.com (terminus.zytor.com [198.137.202.10]) by gabe.freedesktop.org (Postfix) with ESMTPS id CEF576E36A for ; Thu, 7 Jan 2016 14:30:41 -0800 (PST) In-Reply-To: <20160107215401.GB25144@nuc-i3427.alporthouse.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Chris Wilson , Andy Lutomirski , "linux-kernel@vger.kernel.org" , Ross Zwisler , "H . Peter Anvin" , Borislav Petkov , Brian Gerst , Denys Vlasenko , Linus Torvalds , Thomas Gleixner , Imre Deak , Daniel Vetter , DRI List-Id: dri-devel@lists.freedesktop.org T24gMDEvMDcvMTYgMTM6NTQsIENocmlzIFdpbHNvbiB3cm90ZToKPiAKPiBXaGlsc3QgeW91IGFy ZSBsb29raW5nIGF0IHRoaXMgYXNtLCBub3RlIHRoYXQgd2UgcmVsb2FkCj4gYm9vdF9jcHVfZGF0 YS54ODZfY2ZsdXNoX3NpemUgZXZlcnl0aW1lIGFyb3VuZCB0aGUgbG9vcC4gVGhhdCdzIGEgc21h bGwKPiBidXQgbm90aWNlYWJsZSBleHRyYSBjb3N0IChlc3BlY2lhbGx5IHdoZW4gd2UgYXJlIG9u bHkgZmx1c2hpbmcgYSBzaW5nbGUKPiBjYWNoZWxpbmUpLgo+IAoKSSBkaWQgbm90aWNlIHRoYXQ7 IEkgZG9uJ3Qga25vdyBpZiB0aGlzIGlzIGEgY29tcGlsZXIgZmFpbHVyZSB0byBkbyBhbgpvYnZp b3VzIGhvaXN0aW5nICh3aGljaCBzaG91bGQgdGhlbiBiZSBtZXJnZWQgd2l0aCB0aGUgb3RoZXIg bG9hZCkgb3IgYQpjb25zZXF1ZW5jZSBvZiB0aGUgdm9sYXRpbGUuICBJdCBtaWdodCBiZSB1c2Vm dWwgdG8gcmVwb3J0IHRoaXMgdG8gdGhlCmdjYyBidWd6aWxsYSB0byBsZXQgdGhlbSBsb29rIGF0 IGl0LgoKRWl0aGVyIHdheSwgdGhlIGRpZmYgbG9va3MgZ29vZCBhbmQgeW91IGNhbiBhZGQgbXk6 CgpBY2tlZC1ieTogSC4gUGV0ZXIgQW52aW4gPGhwYUBsaW51eC5pbnRlbC5jb20+CgpIb3dldmVy LCBJIHNlZSBhYnNvbHV0ZWx5IG5vdGhpbmcgd3Jvbmcgd2l0aCB0aGUgYXNzZW1ibHkgb3RoZXIg dGhhbgptaW5vciBvcHRpbWl6YXRpb24gcG9zc2liaWxpdGllczogc2luY2UgZ2NjIGdlbmVyYXRl cyBhbiBlYXJseS1vdXQgdGVzdAphcyB3ZWxsIGFzIGEgbGF0ZS1vdXQgdGVzdCBhbnl3YXksIHdl IGNvdWxkIGFkZCBhbiBleHBsaWNpdDoKCglpZiAocCA8IHZlbmQpCgkJcmV0dXJuOwoKYmVmb3Jl IHRoZSBmaXJzdCBtYigpIGF0IG5vIGFkZGl0aW9uYWwgY29zdCAoYXNzdW1pbmcgZ2NjIGlzIHNt YXJ0CmVub3VnaCB0byBza2lwIHRoZSBzZWNvbmQgdGVzdDsgb3RoZXJ3aXNlIHRoYXQgY2FuIGVh c2lseSBiZSBkb25lCm1hbnVhbGx5IGJ5IHJlcGxhY2luZyB0aGUgZm9yIGxvb3Agd2l0aCBhIGRv IHsgfSB3aGlsZSBsb29wLgoKSSB3b3VsZCBiZSB2ZXJ5IGludGVyZXN0ZWQgaW4ga25vd2luZyBp ZiByZXBsYWNpbmcgdGhlIGZpbmFsIGNsZmx1c2hvcHQKd2l0aCBhIGNsZmx1c2ggd291bGQgcmVz b2x2ZSB5b3VyIHByb2JsZW1zIChpbiB3aGljaCBjYXNlIHRoZSBsYXN0IG1iKCkKc2hvdWxkbid0 IGJlIG5lY2Vzc2FyeSBlaXRoZXIuKQoKU29tZXRoaW5nIGxpa2U6Cgp2b2lkIGNsZmx1c2hfY2Fj aGVfcmFuZ2Uodm9pZCAqdmFkZHIsIHVuc2lnbmVkIGludCBzaXplKQp7Cgl1bnNpZ25lZCBsb25n IGNsZmx1c2hfc2l6ZSA9IGJvb3RfY3B1X2RhdGEueDg2X2NsZmx1c2hfc2l6ZTsKCWNoYXIgKnZl bmQgPSAoY2hhciAqKXZhZGRyICsgc2l6ZTsKCWNoYXIgKnAgPSAoY2hhciAqKSgodW5zaWduZWQg bG9uZyl2YWRkciAmIH4oY2xmbHVzaF9zaXplIC0gMSk7CgoJaWYgKHAgPj0gdmVuZCkKCQlyZXR1 cm47CgoJbWIoKTsKCglmb3IgKDsgcCA8IHZlbmQgLSBjbGZsdXNoX3NpemU7IHAgKz0gY2xmbHVz aF9zaXplKQoJCWNsZmx1c2hvcHQocCk7CgoJY2xmbHVzaChwKTsJCS8qIFNlcmlhbGl6aW5nIGFu ZCB0aHVzIGEgYmFycmllciAqLwp9CgpfX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19f X19fX19fX19fX19fXwpkcmktZGV2ZWwgbWFpbGluZyBsaXN0CmRyaS1kZXZlbEBsaXN0cy5mcmVl ZGVza3RvcC5vcmcKaHR0cDovL2xpc3RzLmZyZWVkZXNrdG9wLm9yZy9tYWlsbWFuL2xpc3RpbmZv L2RyaS1kZXZlbAo= From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753796AbcAGWap (ORCPT ); Thu, 7 Jan 2016 17:30:45 -0500 Received: from terminus.zytor.com ([198.137.202.10]:44731 "EHLO mail.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753199AbcAGWan (ORCPT ); Thu, 7 Jan 2016 17:30:43 -0500 Subject: Re: [PATCH] x86: Add an explicit barrier() to clflushopt() To: Chris Wilson , Andy Lutomirski , "linux-kernel@vger.kernel.org" , Ross Zwisler , "H . Peter Anvin" , Borislav Petkov , Brian Gerst , Denys Vlasenko , Linus Torvalds , Thomas Gleixner , Imre Deak , Daniel Vetter , DRI References: <1445248735-11915-1-git-send-email-chris@chris-wilson.co.uk> <20160107101652.GF652@nuc-i3427.alporthouse.com> <20160107194413.GA25144@nuc-i3427.alporthouse.com> <568ED31F.1090004@zytor.com> <20160107215401.GB25144@nuc-i3427.alporthouse.com> From: "H. Peter Anvin" X-Enigmail-Draft-Status: N1110 Message-ID: <568EE6E6.4000904@zytor.com> Date: Thu, 7 Jan 2016 14:29:58 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0 MIME-Version: 1.0 In-Reply-To: <20160107215401.GB25144@nuc-i3427.alporthouse.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 01/07/16 13:54, Chris Wilson wrote: > > Whilst you are looking at this asm, note that we reload > boot_cpu_data.x86_cflush_size everytime around the loop. That's a small > but noticeable extra cost (especially when we are only flushing a single > cacheline). > I did notice that; I don't know if this is a compiler failure to do an obvious hoisting (which should then be merged with the other load) or a consequence of the volatile. It might be useful to report this to the gcc bugzilla to let them look at it. Either way, the diff looks good and you can add my: Acked-by: H. Peter Anvin However, I see absolutely nothing wrong with the assembly other than minor optimization possibilities: since gcc generates an early-out test as well as a late-out test anyway, we could add an explicit: if (p < vend) return; before the first mb() at no additional cost (assuming gcc is smart enough to skip the second test; otherwise that can easily be done manually by replacing the for loop with a do { } while loop. I would be very interested in knowing if replacing the final clflushopt with a clflush would resolve your problems (in which case the last mb() shouldn't be necessary either.) Something like: void clflush_cache_range(void *vaddr, unsigned int size) { unsigned long clflush_size = boot_cpu_data.x86_clflush_size; char *vend = (char *)vaddr + size; char *p = (char *)((unsigned long)vaddr & ~(clflush_size - 1); if (p >= vend) return; mb(); for (; p < vend - clflush_size; p += clflush_size) clflushopt(p); clflush(p); /* Serializing and thus a barrier */ }