From mboxrd@z Thu Jan 1 00:00:00 1970
From: Julien Grall
Subject: Re: [PATCH 2/2] xen: arm: update arm32 assembly primitives to Linux v3.16-rc6
Date: Fri, 25 Jul 2014 17:20:56 +0100
Message-ID: <53D283E8.4000709@linaro.org>
References: <80a33cc325055bc9d63e4ef272c5b7f68f8fa812.1406301772.git.ian.campbell@citrix.com>
 <2c06427f1180cf408a3e9750de3040dde0afe2ea.1406301772.git.ian.campbell@citrix.com>
 <53D27AF3.5070706@linaro.org>
 <1406303283.24842.41.camel@kazak.uk.xensource.com>
 <53D27C6B.6070907@linaro.org>
 <1406304202.24842.50.camel@kazak.uk.xensource.com>
 <1406304831.24842.54.camel@kazak.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1406304831.24842.54.camel@kazak.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Ian Campbell
Cc: xen-devel@lists.xen.org, tim@xen.org, stefano.stabellini@eu.citrix.com
List-Id: xen-devel@lists.xenproject.org

On 07/25/2014 05:13 PM, Ian Campbell wrote:
> On Fri, 2014-07-25 at 17:03 +0100, Ian Campbell wrote:
>> On Fri, 2014-07-25 at 16:48 +0100, Julien Grall wrote:
>>> On 07/25/2014 04:48 PM, Ian Campbell wrote:
>>>> On Fri, 2014-07-25 at 16:42 +0100, Julien Grall wrote:
>>>>> Hi Ian,
>>>>>
>>>>> On 07/25/2014 04:22 PM, Ian Campbell wrote:
>>>>>> bitops, cmpxchg, atomics: Import:
>>>>>>   c32ffce ARM: 7984/1: prefetch: add prefetchw invocations for barriered atomics
>>>>>
>>>>> Compare to Linux we don't have specific prefetch* helpers. We directly
>>>>> use the compiler builtin ones. Shouldn't we import the ARM specific
>>>>> helpers to gain in performance?
>>>>
>>>> My binaries are full of pld instructions where I think I would expect
>>>> them, so it seems like the compiler builtin ones are sufficient.
>>>>
>>>> I suspect the Linux define is there to cope with older compilers or
>>>> something.
>>>
>>> If so:
>>
>> The compiled output is very different if I use the arch specific
>> explicit variants. The explicit variant generates (lots) more pldw and
>> (somewhat) fewer pld. I've no idea what this means...
>
> It's a bit more obvious for aarch64 where gcc 4.8 doesn't generate any
> prefetches at all via the builtins...
>
> Here's what I've got in my tree. I've no idea if we should take some or
> all of it...

I don't think it will be harmful for ARMv7 to use specific prefetch*
helpers.

[..]

> +/*
> + * Prefetching support
> + */
> +#define ARCH_HAS_PREFETCH
> +static inline void prefetch(const void *ptr)
> +{
> +    asm volatile("prfm pldl1keep, %a0\n" : : "p" (ptr));
> +}
> +
> +#define ARCH_HAS_PREFETCHW
> +static inline void prefetchw(const void *ptr)
> +{
> +    asm volatile("prfm pstl1keep, %a0\n" : : "p" (ptr));
> +}
> +
> +#define ARCH_HAS_SPINLOCK_PREFETCH
> +static inline void spin_lock_prefetch(const void *x)
> +{
> +    prefetchw(x);
> +}

Looking to the code. spin_lock_prefetch is called in the tree. I'm not
sure we should keep this helper.

Regards,

-- 
Julien Grall