From: Alex Shi <alex.shi@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: jeremy@goop.org,
	"asit.k.mallick@intel.com" <asit.k.mallick@intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	tglx@linutronix.de, Andi Kleen <ak@linux.intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>
Subject: Re: [RFC patch] cmpxchg_double: remove local variables to get better performance
Date: Fri, 02 Mar 2012 17:00:43 +0800
Message-ID: <1330678843.21053.1553.camel@debian>
In-Reply-To: <4F5098B80200007800075F33@nat28.tlf.novell.com>

On Fri, 2012-03-02 at 08:54 +0000, Jan Beulich wrote:
> >>> On 02.03.12 at 09:31, Alex Shi <alex.shi@intel.com> wrote:
> > There are some local variables in the cmpxchg_double macro; it seems they are
> > used to force-cast the input variables to the type of '*p1'. Maybe there is
> > some reason I don't know, but I see 2 problems here:
> > 
> > 1, the user may misuse the macro, e.g. pass a 'long' type o1 but use
> > an 'int*' or 'char*' p1.
> 
> No - see the BUILD_BUG_ON()s right after the lines you suggest to
> remove.
> 
> Further, it seems to be intentional to allow _compatible_ types for
> o1 and o2 - you could pass in a literal number without L suffix here,
> which I don't think you can anymore with the intermediate variable
> removed.

Yes, we can use a cast for the intermediate data. Actually, the current kernel
has a live case of cmpxchg() being misused, which I planned to point out too:

--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -203,12 +203,12 @@ static bool make_all_cpus_request(struct kvm *kvm, unsigned int req)

 void kvm_flush_remote_tlbs(struct kvm *kvm)
 {
-       int dirty_count = kvm->tlbs_dirty;
+       long dirty_count = kvm->tlbs_dirty;

        smp_mb();
        if (make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
                ++kvm->stat.remote_tlb_flush;
-       cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
+       cmpxchg(&kvm->tlbs_dirty, dirty_count, 0L);
 }
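
A minimal user-space sketch of why the int snapshot matters. It assumes, as the patch above implies, that tlbs_dirty is a long, and that long is wider than int (LP64). The names tlbs_dirty, flush_reset_int and flush_reset_long are hypothetical stand-ins, and C11 atomics are used in place of the kernel's cmpxchg():

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption of this sketch: long is wider than int (e.g. LP64). */
static_assert(LONG_MAX > INT_MAX, "sketch assumes long is wider than int");

/* Hypothetical stand-in for kvm->tlbs_dirty; not the kernel's field. */
_Atomic long tlbs_dirty;

/* Pre-patch shape: the snapshot is held in an int and widened again
 * for the compare. Above INT_MAX the narrowing conversion is
 * implementation-defined (GCC reduces modulo 2^32), so the widened
 * value no longer matches and the exchange fails. */
bool flush_reset_int(void)
{
	int dirty_count = (int)atomic_load(&tlbs_dirty);
	long expected = dirty_count;

	return atomic_compare_exchange_strong(&tlbs_dirty, &expected, 0L);
}

/* Post-patch shape: the snapshot keeps the full width of the counter. */
bool flush_reset_long(void)
{
	long expected = atomic_load(&tlbs_dirty);

	return atomic_compare_exchange_strong(&tlbs_dirty, &expected, 0L);
}
```

For small counter values both variants behave the same; they only diverge once the counter no longer fits in an int.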

> 
> > If we remove the forced cast here, gcc will check for such misuse at
> > compile time, and the user gets an error report when compiling.
> > 
> > 2, local variable increased the data section, and bring extra memory bus
> 
> These aren't static, so the data section can't possibly increase.

Sorry, it is the text section that increases.
> 
> > accesses, that hurt performance in this critical macro.
> 
> With optimization enabled, the compiler should eliminate all unnecessary
> intermediate variables.

Oh, I am not sure about that. I will recheck this point.
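
One way to recheck this is to build two otherwise identical wrappers, one with typed temporaries and one without, and compare the output of `gcc -O2 -S` for each. The sketch below is a user-space approximation only: CAS_TYPED, CAS_DIRECT, OLD_WIDTH_TYPED, OLD_WIDTH_DIRECT, reset_typed and reset_direct are hypothetical names, and GCC's __sync_bool_compare_and_swap builtin stands in for the kernel's inline asm. It also shows the operand-width point: a literal without an L suffix is int-sized unless a typed temporary (or a cast) widens it.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption of this sketch: long is wider than int (e.g. LP64). */
static_assert(sizeof(long) > sizeof(int), "sketch assumes long is wider than int");

/* Variant with typed temporaries, mirroring the __old1/__new1 pattern. */
#define CAS_TYPED(p, o, n)						\
({									\
	__typeof__(*(p)) old_ = (o), new_ = (n);			\
	__sync_bool_compare_and_swap((p), old_, new_);			\
})

/* Variant that passes the caller's expressions straight through. */
#define CAS_DIRECT(p, o, n)						\
	__sync_bool_compare_and_swap((p), (o), (n))

/* Width of the "old" operand as each variant would see it. */
#define OLD_WIDTH_TYPED(p, o)	sizeof(__typeof__(*(p)))
#define OLD_WIDTH_DIRECT(p, o)	sizeof(o)

/* Two wrappers whose generated code can be compared side by side. */
bool reset_typed(long *p, long old)
{
	return CAS_TYPED(p, old, 0);
}

bool reset_direct(long *p, long old)
{
	return CAS_DIRECT(p, old, 0);
}
```

The two wrappers return the same results; whether they also compile to the same instructions is exactly what the comparison of the assembly output would have to confirm.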
> 
> > I did a little experiment on my nhm i7 desktop, running the macro a
> > fixed number of times; here is the data:
> > 			 using local vars         no local variable
> > with lock prefix,         267700578ns             232079696ns
> > without lock prefix,      34715666ns              34687566ns
> > 
> > So, we may need to rethink the local variable usage here.
> > 
> > Signed-off-by: Alex Shi <alex.shi@intel.com>
> 
> Sorry, but if this counts, this is a nack from me.
> 
> Jan
> 
> > diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
> > index b3b7332..8bf9127 100644
> > --- a/arch/x86/include/asm/cmpxchg.h
> > +++ b/arch/x86/include/asm/cmpxchg.h
> > @@ -210,17 +210,15 @@ extern void __add_wrong_size(void)
> >  #define __cmpxchg_double(pfx, p1, p2, o1, o2, n1, n2)			\
> >  ({									\
> >  	bool __ret;							\
> > -	__typeof__(*(p1)) __old1 = (o1), __new1 = (n1);			\
> > -	__typeof__(*(p2)) __old2 = (o2), __new2 = (n2);			\
> >  	BUILD_BUG_ON(sizeof(*(p1)) != sizeof(long));			\
> >  	BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long));			\
> >  	VM_BUG_ON((unsigned long)(p1) % (2 * sizeof(long)));		\
> >  	VM_BUG_ON((unsigned long)((p1) + 1) != (unsigned long)(p2));	\
> >  	asm volatile(pfx "cmpxchg%c4b %2; sete %0"			\
> > -		     : "=a" (__ret), "+d" (__old2),			\
> > +		     : "=a" (__ret), "+d" (o2),				\
> >  		       "+m" (*(p1)), "+m" (*(p2))			\
> > -		     : "i" (2 * sizeof(long)), "a" (__old1),		\
> > -		       "b" (__new1), "c" (__new2));			\
> > +		     : "i" (2 * sizeof(long)), "a" (o1),		\
> > +		       "b" (n1), "c" (n2));				\
> >  	__ret;								\
> >  })
> >  
> 
> 
> 


