From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [PATCH 2/3] x86: use CLFLUSHOPT when available
Date: Wed, 10 Feb 2016 16:27:06 +0000
Message-ID: <56BB64DA.4040409@citrix.com>
References: <56BB3DE902000078000D087A@prv-mh.provo.novell.com>
	<56BB41BC02000078000D08AD@prv-mh.provo.novell.com>
	<56BB5158.5050507@citrix.com>
	<56BB67C602000078000D0A83@prv-mh.provo.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <56BB67C602000078000D0A83@prv-mh.provo.novell.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>, Keir Fraser <keir@xen.org>
List-Id: xen-devel@lists.xenproject.org

On 10/02/16 15:39, Jan Beulich wrote:
>>>> On 10.02.16 at 16:03, <andrew.cooper3@citrix.com> wrote:
>> On 10/02/16 12:57, Jan Beulich wrote:
>>> Also drop an unnecessary va adjustment in the code being touched.
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>>
>>> --- a/xen/arch/x86/flushtlb.c
>>> +++ b/xen/arch/x86/flushtlb.c
>>> @@ -139,10 +139,12 @@ unsigned int flush_area_local(const void
>>>               c->x86_clflush_size && c->x86_cache_size && sz &&
>>>               ((sz >> 10) < c->x86_cache_size) )
>>>          {
>>> -            va = (const void *)((unsigned long)va & ~(sz - 1));
>>> +            alternative(ASM_NOP3, "sfence", X86_FEATURE_CLFLUSHOPT);
>> Why separate?  This would be better in the lower alternative(), with one
>> single nop making up the difference in length.  That way, processors
>> without CLFLUSHOPT don't suffer the 1 cycle instruction decode stall
>> from the redundant rex prefix.
> Why would we want the fence inside the loop - a single fence is
> sufficient for the entire flush.

Ah yes - of course.
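
For the archives, here is a minimal user-space sketch of the point above:
the fence is issued once per flush request, and only the per-line flush sits
inside the loop. This is not the Xen code. flush_range_sketch() and
struct flush_stats are hypothetical names, the plain CLFLUSH intrinsic stands
in for CLFLUSHOPT (which would need compiler and CPU support), and the
placement of the fence ahead of the loop simply mirrors the quoted hunk. The
counters exist only so the behaviour can be checked.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_clflush(); also provides _mm_sfence() */
#endif

/* Hypothetical bookkeeping, present only to make the sketch checkable. */
struct flush_stats {
    unsigned int fences;  /* fences issued for the whole request */
    unsigned int lines;   /* cache lines visited by the loop */
};

/*
 * Flush [va, va + sz) one cache line at a time.
 * Assumption: line_size is the CLFLUSH line size (c->x86_clflush_size in
 * the quoted code); here the caller passes it in.
 */
static struct flush_stats flush_range_sketch(const void *va, size_t sz,
                                             size_t line_size)
{
    struct flush_stats st = { 0, 0 };
    const char *p = va;
    size_t i;

    if ( sz == 0 || line_size == 0 )
        return st;

    /* One fence for the entire flush, outside the loop. */
#if defined(__SSE2__)
    _mm_sfence();
#endif
    st.fences++;

    for ( i = 0; i < sz; i += line_size )
    {
        /* Per-line flush; CLFLUSH here as a stand-in for CLFLUSHOPT. */
#if defined(__SSE2__)
        _mm_clflush(p + i);
#endif
        st.lines++;
    }

    return st;
}
```

With a 256-byte buffer and 64-byte lines, this visits four lines and issues
a single fence, regardless of how many lines the range covers.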

>
> Also if we're worried about the REX decode, this could easily be a
> NOP instead, just that I'm not certain which one in the end is less
> decode overhead.

A redundant prefix will generally have lower decode overhead than a
separate full instruction.

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>