public inbox for linux-kernel@vger.kernel.org
From: Alex Shi <alex.shi@intel.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: jeremy@goop.org,
	"asit.k.mallick@intel.com" <asit.k.mallick@intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	tglx@linutronix.de, Andi Kleen <ak@linux.intel.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>
Subject: Re: [RFC patch] cmpxchg_double: remove local variables to get better performance
Date: Fri, 02 Mar 2012 17:00:43 +0800
Message-ID: <1330678843.21053.1553.camel@debian>
In-Reply-To: <4F5098B80200007800075F33@nat28.tlf.novell.com>

On Fri, 2012-03-02 at 08:54 +0000, Jan Beulich wrote:
> >>> On 02.03.12 at 09:31, Alex Shi <alex.shi@intel.com> wrote:
> > There are some local variables in the cmpxchg_double macro; it seems they
> > are used to force-cast the input variables to the type of '*p1'.
> > Maybe there is some reason I don't know of. But I see 2 problems
> > here:
> > 
> > 1, the user may misuse the macro, e.g. pass a 'long' o1 but use
> > an 'int*' or 'char*' p1.
> 
> No - see the BUILD_BUG_ON()s right after the lines you suggest to
> remove.
> 
> Further, it seems to be intentional to allow _compatible_ types for
> o1 and o2 - you could pass in a literal number without L suffix here,
> which I don't think you can anymore with the intermediate variable
> removed.
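
Jan's point about compatible types can be illustrated outside the kernel. The following is only a user-space sketch: `CAS_LONG` is a hypothetical macro, not the kernel's `__cmpxchg_double`, and it assumes GCC or Clang (statement expressions, `__typeof__`, and the `__sync_bool_compare_and_swap` builtin). The typed locals convert whatever the caller passes to the type of `*p`, while the static assertion plays the role of BUILD_BUG_ON() and rejects wrongly sized targets at compile time.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdbool.h>

/*
 * Hypothetical single-word analogue of the macro under discussion.
 * The typed locals convert o and n to the type of *p using the normal
 * C conversion rules, so a plain int literal such as -1 is accepted.
 */
#define CAS_LONG(p, o, n)						\
({									\
	__typeof__(*(p)) old_ = (o), new_ = (n);			\
	static_assert(sizeof(*(p)) == sizeof(long),			\
		      "target must be long-sized");			\
	__sync_bool_compare_and_swap((p), old_, new_);			\
})

/* Returns true when the swap happened; *v is then 0. */
static bool clear_if_minus_one(long *v)
{
	return CAS_LONG(v, -1, 0);	/* int literals, no L suffix */
}
```

On an LP64 target, passing an `int *` or `char *` as `p` would fail the static assertion at build time, which is the protection the BUILD_BUG_ON() lines already give.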

Yes, we can use a cast for the intermediate data. And actually, the current
kernel has a live misuse of cmpxchg() that I plan to point out too:

--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -203,12 +203,12 @@ static bool make_all_cpus_request(struct kvm *kvm, unsigned int req)

 void kvm_flush_remote_tlbs(struct kvm *kvm)
 {
-       int dirty_count = kvm->tlbs_dirty;
+       long dirty_count = kvm->tlbs_dirty;

        smp_mb();
        if (make_all_cpus_request(kvm, KVM_REQ_TLB_FLUSH))
                ++kvm->stat.remote_tlb_flush;
-       cmpxchg(&kvm->tlbs_dirty, dirty_count, 0);
+       cmpxchg(&kvm->tlbs_dirty, dirty_count, 0L);
 }
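
Why the snapshot type matters can be shown with a small user-space sketch. It is not kernel code: `tlbs_dirty` here is a hypothetical stand-in for the `kvm->tlbs_dirty` field (assumed to be a long, as the patch implies), the GCC/Clang `__sync_bool_compare_and_swap` builtin replaces cmpxchg(), and an LP64 target (64-bit long, 32-bit int) is assumed. Once the counter no longer fits in an int, the narrowed snapshot does not match the stored value and the reset is skipped.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for kvm->tlbs_dirty (assumed to be a long). */
static long tlbs_dirty;

/* Pre-patch shape: the snapshot is narrowed to int before the compare. */
static bool reset_with_int_snapshot(void)
{
	int dirty_count = (int)tlbs_dirty;	/* may truncate on LP64 */

	return __sync_bool_compare_and_swap(&tlbs_dirty, (long)dirty_count, 0L);
}

/* Post-patch shape: the snapshot has the same type as the counter. */
static bool reset_with_long_snapshot(void)
{
	long dirty_count = tlbs_dirty;

	return __sync_bool_compare_and_swap(&tlbs_dirty, dirty_count, 0L);
}

/* Self-check; the large-value case only runs when long is wider than int. */
static void snapshot_self_check(void)
{
	tlbs_dirty = 5;
	assert(reset_with_int_snapshot() && tlbs_dirty == 0);

	if (sizeof(long) > sizeof(int)) {
		tlbs_dirty = INT_MAX;
		tlbs_dirty += 1;	/* no longer representable as int */
		assert(!reset_with_int_snapshot());
		assert(reset_with_long_snapshot() && tlbs_dirty == 0);
	}
}
```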

> 
> > If we remove the forced cast here, gcc will check for such misuse at
> > compile time, and the user will get an error report when building.
> > 
> > 2, the local variables increase the data section and bring extra memory bus
> 
> These aren't static, so the data section can't possibly increase.

Sorry, I meant that the text section increases.
> 
> > accesses, which hurt performance in this critical macro.
> 
> With optimization enabled, the compiler should eliminate all unnecessary
> intermediate variables.

Oh, I am not sure now. I will recheck this point.
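
One possible way to recheck this point is to compare the generated code for the two shapes. The functions below are hypothetical user-space stand-ins (the GCC/Clang `__sync_bool_compare_and_swap` builtin instead of the kernel's inline asm); compiling them with `gcc -O2 -S` and comparing the two function bodies would show whether the intermediate copies survive optimization on a given compiler.

```c
#include <assert.h>
#include <stdbool.h>

/* Shape of the current macro: operands are first copied into typed locals. */
bool cas_with_locals(long *p, long o, long n)
{
	long old_copy = o, new_copy = n;

	return __sync_bool_compare_and_swap(p, old_copy, new_copy);
}

/* Shape of the proposed macro: operands are used directly. */
bool cas_direct(long *p, long o, long n)
{
	return __sync_bool_compare_and_swap(p, o, n);
}

/* Both shapes must at least behave identically, whatever code is emitted. */
void cas_shapes_self_check(void)
{
	long a = 1, b = 1;

	assert(cas_with_locals(&a, 1, 2) == cas_direct(&b, 1, 2));
	assert(a == 2 && b == 2);
}
```

This only checks the plain C copies; the real macro feeds its operands to inline asm register constraints, so the kernel's own object code would still need to be inspected.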
> 
> > I did a little experiment on my Nehalem i7 desktop, running the macro a
> > fixed number of times; here is the data:
> >                         using local vars    no local variables
> > with lock prefix        267700578 ns        232079696 ns
> > without lock prefix      34715666 ns         34687566 ns
> > 
> > So we may need to rethink the local variable usage here.
> > 
> > Signed-off-by: Alex Shi <alex.shi@intel.com>
> 
> Sorry, but if this counts, this is a nack from me.
> 
> Jan
> 
> > diff --git a/arch/x86/include/asm/cmpxchg.h b/arch/x86/include/asm/cmpxchg.h
> > index b3b7332..8bf9127 100644
> > --- a/arch/x86/include/asm/cmpxchg.h
> > +++ b/arch/x86/include/asm/cmpxchg.h
> > @@ -210,17 +210,15 @@ extern void __add_wrong_size(void)
> >  #define __cmpxchg_double(pfx, p1, p2, o1, o2, n1, n2)			\
> >  ({									\
> >  	bool __ret;							\
> > -	__typeof__(*(p1)) __old1 = (o1), __new1 = (n1);			\
> > -	__typeof__(*(p2)) __old2 = (o2), __new2 = (n2);			\
> >  	BUILD_BUG_ON(sizeof(*(p1)) != sizeof(long));			\
> >  	BUILD_BUG_ON(sizeof(*(p2)) != sizeof(long));			\
> >  	VM_BUG_ON((unsigned long)(p1) % (2 * sizeof(long)));		\
> >  	VM_BUG_ON((unsigned long)((p1) + 1) != (unsigned long)(p2));	\
> >  	asm volatile(pfx "cmpxchg%c4b %2; sete %0"			\
> > -		     : "=a" (__ret), "+d" (__old2),			\
> > +		     : "=a" (__ret), "+d" (o2),				\
> >  		       "+m" (*(p1)), "+m" (*(p2))			\
> > -		     : "i" (2 * sizeof(long)), "a" (__old1),		\
> > -		       "b" (__new1), "c" (__new2));			\
> > +		     : "i" (2 * sizeof(long)), "a" (o1),		\
> > +		       "b" (n1), "c" (n2));				\
> >  	__ret;								\
> >  })
> >  
> 
> 
> 



