From: Michael Ellerman <michael@ellerman.id.au>
To: Mark Nelson <markn@au1.ibm.com>
Cc: linuxppc-dev@ozlabs.org, cbe-oss-dev@ozlabs.org
Subject: Re: [RFC 2/2] powerpc: copy_4K_page tweaked for Cell - add CPU feature
Date: Thu, 14 Aug 2008 20:51:35 +1000
Message-ID: <1218711095.10673.4.camel@localhost>
In-Reply-To: <200808141618.23818.markn@au1.ibm.com>
On Thu, 2008-08-14 at 16:18 +1000, Mark Nelson wrote:
> Add a new CPU feature, CPU_FTR_CP_USE_DCBTZ, to be added to the CPUs that benefit
> from having dcbt and dcbz instructions used in copy_4K_page(). So far Cell, PPC970
> and Power4 benefit.
>
> This way all the other 64bit powerpc chips will have the whole prefetching loop
> nop'ed out.
> Index: upstream/arch/powerpc/lib/copypage_64.S
> ===================================================================
> --- upstream.orig/arch/powerpc/lib/copypage_64.S
> +++ upstream/arch/powerpc/lib/copypage_64.S
> @@ -18,6 +18,7 @@ PPC64_CACHES:
>
> _GLOBAL(copy_4K_page)
> li r5,4096 /* 4K page size */
> +BEGIN_FTR_SECTION
> ld r10,PPC64_CACHES@toc(r2)
> lwz r11,DCACHEL1LOGLINESIZE(r10) /* log2 of cache line size */
> lwz r12,DCACHEL1LINESIZE(r10) /* Get cache line size */
> @@ -30,7 +31,7 @@ setup:
> dcbz r9,r3
> add r9,r9,r12
> bdnz setup
> -
> +END_FTR_SECTION_IFSET(CPU_FTR_CP_USE_DCBTZ)
> addi r3,r3,-8
> srdi r8,r5,7 /* page is copied in 128 byte strides */
> addi r8,r8,-1 /* one stride copied outside loop */
Instead of nop'ing it out, we could use an alternative feature section
to either run it or jump over it. It would look something like:

_GLOBAL(copy_4K_page)
	li	r5,4096		/* 4K page size, needed on both paths */
BEGIN_FTR_SECTION
	ld	r10,PPC64_CACHES@toc(r2)
	lwz	r11,DCACHEL1LOGLINESIZE(r10)	/* log2 of cache line size */
	lwz	r12,DCACHEL1LINESIZE(r10)	/* Get cache line size */
	li	r9,0
	srd	r8,r5,r11
	mtctr	r8
setup:
	dcbt	r9,r4
	dcbz	r9,r3
	add	r9,r9,r12
	bdnz	setup
FTR_SECTION_ELSE
	b	1f
ALT_FTR_SECTION_END_IFSET(CPU_FTR_CP_USE_DCBTZ)
1:
	addi	r3,r3,-8

So in the no-dcbtz case you'd get a branch instead of 10 nops.
Of course you'd need to benchmark it to see if skipping the nops is
better than executing them ;P
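
To make the trade-off concrete, here is a minimal C model of the two patching strategies. It is only a sketch and not the kernel's feature-fixup code: patch_section(), executed_slots(), SECTION_LEN and both slots_*() helpers are hypothetical names, and the section length of 10 instructions is an assumption. The only real details are the PowerPC encodings of nop (ori 0,0,0) and of a relative b. The branch offset is written as it would look once the alternative sits at the start of the section.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PPC_NOP     0x60000000u                       /* ori 0,0,0 */
#define PPC_B(off)  (0x48000000u | ((uint32_t)(off) & 0x03fffffcu)) /* b, relative */
#define SECTION_LEN 10  /* assumed number of instructions in the section */

/* Hypothetical model: overwrite a section with an alternative and pad the
 * rest with nops. An empty alternative nops out the whole section. */
static void patch_section(uint32_t *sec, size_t sec_len,
                          const uint32_t *alt, size_t alt_len)
{
    size_t i;

    assert(alt_len <= sec_len); /* the alternative must fit in the section */
    for (i = 0; i < alt_len; i++)
        sec[i] = alt[i];
    for (; i < sec_len; i++)
        sec[i] = PPC_NOP;
}

/* Count the instruction slots executed from the start of the section until
 * control leaves it. Only a relative b (AA=0, LK=0) changes the flow here;
 * every other word is treated as a one-slot instruction. */
static size_t executed_slots(const uint32_t *sec, size_t sec_len)
{
    long pc = 0;
    size_t n = 0;

    while (pc >= 0 && pc < (long)sec_len) {
        uint32_t insn = sec[pc];

        n++;
        if ((insn & 0xfc000003u) == 0x48000000u) {
            int32_t off = (int32_t)(insn & 0x03fffffcu);

            if (off & 0x02000000)   /* sign-extend the 26-bit offset */
                off -= 0x04000000;
            pc += off / 4;
        } else {
            pc++;
        }
    }
    return n;
}

/* Plain feature section on a CPU without the feature: all nops. */
static size_t slots_when_nopped_out(void)
{
    uint32_t sec[SECTION_LEN] = { 0 };

    patch_section(sec, SECTION_LEN, NULL, 0);
    return executed_slots(sec, SECTION_LEN);
}

/* Alternative section: a single branch to the first instruction after it. */
static size_t slots_with_branch_alternative(void)
{
    uint32_t sec[SECTION_LEN] = { 0 };
    const uint32_t alt[] = { PPC_B(SECTION_LEN * 4) };

    patch_section(sec, SECTION_LEN, alt, 1);
    return executed_slots(sec, SECTION_LEN);
}
```

In this model slots_when_nopped_out() counts one slot per nop in the section, while slots_with_branch_alternative() counts only the branch. It says nothing about which is faster on real hardware, since a taken branch and a run of nops cost different amounts on different cores, so the benchmark is still needed.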
cheers
--
Michael Ellerman
OzLabs, IBM Australia Development Lab
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)
We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person