From: Joakim Tjernlund <joakim.tjernlund@transmode.se>
To: "scottwood@freescale.com" <scottwood@freescale.com>
Cc: "christophe.leroy@c-s.fr" <christophe.leroy@c-s.fr>,
"paulus@samba.org" <paulus@samba.org>,
"mpe@ellerman.id.au" <mpe@ellerman.id.au>,
"benh@kernel.crashing.org" <benh@kernel.crashing.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v2 22/25] powerpc32: move xxxxx_dcache_range() functions inline
Date: Tue, 22 Sep 2015 20:32:51 +0000 [thread overview]
Message-ID: <1442953971.29498.76.camel@transmode.se> (raw)
In-Reply-To: <1442952852.19102.281.camel@freescale.com>
On Tue, 2015-09-22 at 15:14 -0500, Scott Wood wrote:
> On Tue, 2015-09-22 at 19:55 +0000, Joakim Tjernlund wrote:
> > On Tue, 2015-09-22 at 14:42 -0500, Scott Wood wrote:
> > > On Tue, 2015-09-22 at 19:34 +0000, Joakim Tjernlund wrote:
> > > > On Tue, 2015-09-22 at 13:58 -0500, Scott Wood wrote:
> > > > > On Tue, 2015-09-22 at 18:12 +0000, Joakim Tjernlund wrote:
> > > > > > On Tue, 2015-09-22 at 18:51 +0200, Christophe Leroy wrote:
> > > > > > > flush/clean/invalidate _dcache_range() functions are all very
> > > > > > > similar and are quite short. They are mainly used in __dma_sy=
nc()
> > > > > > > perf_event locate them in the top 3 consumming functions duri=
ng
> > > > > > > heavy ethernet activity
> > > > > > >=20
> > > > > > > They are good candidate for inlining, as __dma_sync() does
> > > > > > > almost nothing but calling them
> > > > > > >=20
> > > > > > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > > > > > > ---
> > > > > > > New in v2
> > > > > > >=20
> > > > > > > arch/powerpc/include/asm/cacheflush.h | 55=20
> > > > > > > +++++++++++++++++++++++++++--
> > > > > > > arch/powerpc/kernel/misc_32.S | 65 -----------------=
-----
> > > > > > > ----
> > > > > > > ----
> > > > > > > -----
> > > > > > > arch/powerpc/kernel/ppc_ksyms.c | 2 ++
> > > > > > > 3 files changed, 54 insertions(+), 68 deletions(-)
> > > > > > >=20
> > > > > > > diff --git a/arch/powerpc/include/asm/cacheflush.h=20
> > > > > > > b/arch/powerpc/include/asm/cacheflush.h
> > > > > > > index 6229e6b..6169604 100644
> > > > > > > --- a/arch/powerpc/include/asm/cacheflush.h
> > > > > > > +++ b/arch/powerpc/include/asm/cacheflush.h
> > > > > > > @@ -47,12 +47,61 @@ static inline void=20
> > > > > > > __flush_dcache_icache_phys(unsigned long physaddr)
> > > > > > > }
> > > > > > > #endif
> > > > > > > =20
> > > > > > > -extern void flush_dcache_range(unsigned long start, unsigned=
=20
> > > > > > > long=20
> > > > > > > stop);
> > > > > > > #ifdef CONFIG_PPC32
> > > > > > > -extern void clean_dcache_range(unsigned long start, unsigned=
=20
> > > > > > > long=20
> > > > > > > stop);
> > > > > > > -extern void invalidate_dcache_range(unsigned long start,=20
> > > > > > > unsigned=20
> > > > > > > long=20
> > > > > > > stop);
> > > > > > > +/*
> > > > > > > + * Write any modified data cache blocks out to memory and=20
> > > > > > > invalidate=20
> > > > > > > them.
> > > > > > > + * Does not invalidate the corresponding instruction cache=20
> > > > > > > blocks.
> > > > > > > + */
> > > > > > > +static inline void flush_dcache_range(unsigned long start,=20
> > > > > > > unsigned=20
> > > > > > > long=20
> > > > > > > stop)
> > > > > > > +{
> > > > > > > + void *addr =3D (void *)(start & ~(L1_CACHE_BYTES - 1));
> > > > > > > + unsigned int size =3D stop - (unsigned long)addr +=20
> > > > > > > (L1_CACHE_BYTES -
> > > > > > > 1);
> > > > > > > + unsigned int i;
> > > > > > > +
> > > > > > > + for (i =3D 0; i < size >> L1_CACHE_SHIFT; i++, addr +=3D=
=20
> > > > > > > L1_CACHE_BYTES)
> > > > > > > + dcbf(addr);
> > > > > > > + if (i)
> > > > > > > + mb(); /* sync */
> > > > > > > +}
> > > > > >=20
> > > > > > This feels optimized for the uncommon case when there is no=20
> > > > > > invalidation.
> > > > >=20
> > > > > If you mean the "if (i)", yes, that looks odd.
> > > >=20
> > > > Yes.
> > > >=20
> > > > >=20
> > > > > > I THINK it would be better to bail early=20
> > > > >=20
> > > > > Bail under what conditions?
> > > >=20
> > > > test for "i =3D 0" and return.=20
> > >=20
> > > Why bother?
> >=20
> > I usally find it better to dela with special cases upfront s=E5 the res=
t=20
> > doesn't need to
> > bother. i=3D0 is a NOP and it is clearer to show that upfront.
>=20
> No, I mean why bother special casing this at all?
I just said why?=20
You to found the if(i) mb() a bit odd and it took a little time to see why =
it was there.
Ahh, you mean just skip the if(i) and have mb() unconditionally?
That changes the semantics a little from the ASM version but perhaps that d=
oesn't matter?=
WARNING: multiple messages have this Message-ID (diff)
From: Joakim Tjernlund <joakim.tjernlund@transmode.se>
To: "scottwood@freescale.com" <scottwood@freescale.com>
Cc: "christophe.leroy@c-s.fr" <christophe.leroy@c-s.fr>,
"paulus@samba.org" <paulus@samba.org>,
"mpe@ellerman.id.au" <mpe@ellerman.id.au>,
"benh@kernel.crashing.org" <benh@kernel.crashing.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v2 22/25] powerpc32: move xxxxx_dcache_range() functions inline
Date: Tue, 22 Sep 2015 20:32:51 +0000 [thread overview]
Message-ID: <1442953971.29498.76.camel@transmode.se> (raw)
In-Reply-To: <1442952852.19102.281.camel@freescale.com>
On Tue, 2015-09-22 at 15:14 -0500, Scott Wood wrote:
> On Tue, 2015-09-22 at 19:55 +0000, Joakim Tjernlund wrote:
> > On Tue, 2015-09-22 at 14:42 -0500, Scott Wood wrote:
> > > On Tue, 2015-09-22 at 19:34 +0000, Joakim Tjernlund wrote:
> > > > On Tue, 2015-09-22 at 13:58 -0500, Scott Wood wrote:
> > > > > On Tue, 2015-09-22 at 18:12 +0000, Joakim Tjernlund wrote:
> > > > > > On Tue, 2015-09-22 at 18:51 +0200, Christophe Leroy wrote:
> > > > > > > flush/clean/invalidate _dcache_range() functions are all very
> > > > > > > similar and are quite short. They are mainly used in __dma_sync()
> > > > > > > perf_event locate them in the top 3 consumming functions during
> > > > > > > heavy ethernet activity
> > > > > > >
> > > > > > > They are good candidate for inlining, as __dma_sync() does
> > > > > > > almost nothing but calling them
> > > > > > >
> > > > > > > Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > > > > > > ---
> > > > > > > New in v2
> > > > > > >
> > > > > > > arch/powerpc/include/asm/cacheflush.h | 55
> > > > > > > +++++++++++++++++++++++++++--
> > > > > > > arch/powerpc/kernel/misc_32.S | 65 ----------------------
> > > > > > > ----
> > > > > > > ----
> > > > > > > -----
> > > > > > > arch/powerpc/kernel/ppc_ksyms.c | 2 ++
> > > > > > > 3 files changed, 54 insertions(+), 68 deletions(-)
> > > > > > >
> > > > > > > diff --git a/arch/powerpc/include/asm/cacheflush.h
> > > > > > > b/arch/powerpc/include/asm/cacheflush.h
> > > > > > > index 6229e6b..6169604 100644
> > > > > > > --- a/arch/powerpc/include/asm/cacheflush.h
> > > > > > > +++ b/arch/powerpc/include/asm/cacheflush.h
> > > > > > > @@ -47,12 +47,61 @@ static inline void
> > > > > > > __flush_dcache_icache_phys(unsigned long physaddr)
> > > > > > > }
> > > > > > > #endif
> > > > > > >
> > > > > > > -extern void flush_dcache_range(unsigned long start, unsigned
> > > > > > > long
> > > > > > > stop);
> > > > > > > #ifdef CONFIG_PPC32
> > > > > > > -extern void clean_dcache_range(unsigned long start, unsigned
> > > > > > > long
> > > > > > > stop);
> > > > > > > -extern void invalidate_dcache_range(unsigned long start,
> > > > > > > unsigned
> > > > > > > long
> > > > > > > stop);
> > > > > > > +/*
> > > > > > > + * Write any modified data cache blocks out to memory and
> > > > > > > invalidate
> > > > > > > them.
> > > > > > > + * Does not invalidate the corresponding instruction cache
> > > > > > > blocks.
> > > > > > > + */
> > > > > > > +static inline void flush_dcache_range(unsigned long start,
> > > > > > > unsigned
> > > > > > > long
> > > > > > > stop)
> > > > > > > +{
> > > > > > > + void *addr = (void *)(start & ~(L1_CACHE_BYTES - 1));
> > > > > > > + unsigned int size = stop - (unsigned long)addr +
> > > > > > > (L1_CACHE_BYTES -
> > > > > > > 1);
> > > > > > > + unsigned int i;
> > > > > > > +
> > > > > > > + for (i = 0; i < size >> L1_CACHE_SHIFT; i++, addr +=
> > > > > > > L1_CACHE_BYTES)
> > > > > > > + dcbf(addr);
> > > > > > > + if (i)
> > > > > > > + mb(); /* sync */
> > > > > > > +}
> > > > > >
> > > > > > This feels optimized for the uncommon case when there is no
> > > > > > invalidation.
> > > > >
> > > > > If you mean the "if (i)", yes, that looks odd.
> > > >
> > > > Yes.
> > > >
> > > > >
> > > > > > I THINK it would be better to bail early
> > > > >
> > > > > Bail under what conditions?
> > > >
> > > > test for "i = 0" and return.
> > >
> > > Why bother?
> >
> > I usally find it better to dela with special cases upfront så the rest
> > doesn't need to
> > bother. i=0 is a NOP and it is clearer to show that upfront.
>
> No, I mean why bother special casing this at all?
I just said why?
You to found the if(i) mb() a bit odd and it took a little time to see why it was there.
Ahh, you mean just skip the if(i) and have mb() unconditionally?
That changes the semantics a little from the ASM version but perhaps that doesn't matter?
next prev parent reply other threads:[~2015-09-22 20:32 UTC|newest]
Thread overview: 85+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-22 16:50 [PATCH v2 00/25] powerpc/8xx: Use large pages for RAM and IMMR and other improvments Christophe Leroy
2015-09-22 16:50 ` [PATCH v2 01/25] powerpc/8xx: Save r3 all the time in DTLB miss handler Christophe Leroy
2015-09-28 22:07 ` Scott Wood
2015-10-06 13:35 ` Christophe Leroy
2015-10-06 16:39 ` Scott Wood
2015-10-06 16:46 ` Scott Wood
2015-10-06 20:30 ` christophe leroy
2015-10-06 20:38 ` Scott Wood
2015-09-22 16:50 ` [PATCH v2 02/25] powerpc/8xx: Map linear kernel RAM with 8M pages Christophe Leroy
2015-09-22 16:50 ` [PATCH v2 03/25] powerpc: Update documentation for noltlbs kernel parameter Christophe Leroy
2015-09-22 16:50 ` [PATCH v2 04/25] powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c Christophe Leroy
2015-09-22 16:50 ` [PATCH v2 05/25] powerpc/8xx: Fix vaddr for IMMR early remap Christophe Leroy
2015-09-28 23:39 ` Scott Wood
2015-10-08 12:34 ` Christophe Leroy
2015-10-08 19:13 ` Scott Wood
2015-09-22 16:50 ` [PATCH v2 06/25] powerpc32: iounmap() cannot vunmap() area mapped by TLBCAMs either Christophe Leroy
2015-09-28 23:41 ` Scott Wood
2015-10-06 13:50 ` Christophe Leroy
2015-09-22 16:50 ` [PATCH v2 07/25] powerpc32: refactor x_mapped_by_bats() and x_mapped_by_tlbcam() together Christophe Leroy
2015-09-28 23:47 ` Scott Wood
2015-10-06 14:02 ` Christophe Leroy
2015-10-06 15:16 ` Scott Wood
2015-09-22 16:50 ` [PATCH v2 08/25] powerpc/8xx: Map IMMR area with 512k page at a fixed address Christophe Leroy
2015-09-24 11:41 ` David Laight
2015-09-24 11:41 ` David Laight
2015-09-24 20:14 ` Scott Wood
2015-09-25 14:46 ` David Laight
2015-09-25 14:46 ` David Laight
2015-09-25 17:09 ` Scott Wood
2015-09-28 23:53 ` Scott Wood
2015-09-22 16:50 ` [PATCH v2 09/25] powerpc/8xx: show IMMR area in startup memory layout Christophe Leroy
2015-09-22 16:50 ` [PATCH v2 10/25] powerpc/8xx: CONFIG_PIN_TLB unneeded for CONFIG_PPC_EARLY_DEBUG_CPM Christophe Leroy
2015-09-22 16:50 ` [PATCH v2 11/25] powerpc/8xx: map 16M RAM at startup Christophe Leroy
2015-09-28 23:58 ` Scott Wood
2015-10-06 14:10 ` Christophe Leroy
2015-10-06 15:17 ` Scott Wood
2015-09-22 16:50 ` [PATCH v2 12/25] powerpc32: Remove useless/wrong MMU:setio progress message Christophe Leroy
2015-09-22 16:50 ` [PATCH v2 13/25] powerpc/8xx: also use r3 in the ITLB miss in all situations Christophe Leroy
2015-09-29 0:00 ` Scott Wood
2015-10-06 14:12 ` Christophe Leroy
2015-10-06 16:48 ` Scott Wood
2015-09-22 16:50 ` [PATCH v2 14/25] powerpc32: remove ioremap_base Christophe Leroy
2015-09-29 0:38 ` Scott Wood
2015-09-22 16:50 ` [PATCH v2 15/25] powerpc/8xx: move 8xx SPRN defines into reg_8xx.h and add some missing ones Christophe Leroy
2015-09-29 0:03 ` Scott Wood
2015-10-06 14:35 ` Christophe Leroy
2015-10-06 16:56 ` Scott Wood
2015-09-22 16:51 ` [PATCH v2 16/25] powerpc/8xx: Handle CPU6 ERRATA directly in mtspr() macro Christophe Leroy
2015-09-22 16:51 ` [PATCH v2 17/25] powerpc/8xx: remove special handling of CPU6 errata in set_dec() Christophe Leroy
2015-09-22 16:51 ` [PATCH v2 18/25] powerpc/8xx: rewrite set_context() in C Christophe Leroy
2015-09-22 16:51 ` [PATCH v2 19/25] powerpc/8xx: rewrite flush_instruction_cache() " Christophe Leroy
2015-09-22 16:51 ` [PATCH v2 20/25] powerpc32: Remove clear_pages() and define clear_page() inline Christophe Leroy
2015-09-22 17:57 ` Joakim Tjernlund
2015-09-22 17:57 ` Joakim Tjernlund
2015-09-29 0:23 ` Scott Wood
2015-09-22 16:51 ` [PATCH v2 21/25] powerpc: add inline functions for cache related instructions Christophe Leroy
2015-09-29 0:25 ` Scott Wood
2015-09-22 16:51 ` [PATCH v2 22/25] powerpc32: move xxxxx_dcache_range() functions inline Christophe Leroy
2015-09-22 18:12 ` Joakim Tjernlund
2015-09-22 18:12 ` Joakim Tjernlund
2015-09-22 18:58 ` Scott Wood
2015-09-22 19:34 ` Joakim Tjernlund
2015-09-22 19:34 ` Joakim Tjernlund
2015-09-22 19:42 ` Scott Wood
2015-09-22 19:55 ` Joakim Tjernlund
2015-09-22 19:55 ` Joakim Tjernlund
2015-09-22 20:07 ` Joakim Tjernlund
2015-09-22 20:07 ` Joakim Tjernlund
2015-09-22 20:14 ` Scott Wood
2015-09-22 20:32 ` Joakim Tjernlund [this message]
2015-09-22 20:32 ` Joakim Tjernlund
2015-09-22 20:35 ` Scott Wood
2015-09-22 20:38 ` Joakim Tjernlund
2015-09-22 20:38 ` Joakim Tjernlund
2015-09-22 20:57 ` Christophe Leroy
2015-09-22 22:34 ` Scott Wood
2015-09-22 22:49 ` Christophe Leroy
2015-09-22 22:52 ` Scott Wood
2015-09-29 0:29 ` Scott Wood
2015-10-07 12:49 ` Christophe Leroy
2015-10-08 19:12 ` Scott Wood
2015-10-12 18:08 ` christophe leroy
2015-09-22 16:51 ` [PATCH v2 23/25] powerpc: Simplify test in __dma_sync() Christophe Leroy
2015-09-22 16:51 ` [PATCH v2 24/25] powerpc32: small optimisation in flush_icache_range() Christophe Leroy
2015-09-22 16:51 ` [PATCH v2 25/25] powerpc32: Remove one insn in mulhdu Christophe Leroy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1442953971.29498.76.camel@transmode.se \
--to=joakim.tjernlund@transmode.se \
--cc=benh@kernel.crashing.org \
--cc=christophe.leroy@c-s.fr \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=paulus@samba.org \
--cc=scottwood@freescale.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.