linuxppc-dev.lists.ozlabs.org archive mirror
* Re: Improved copy_page() function, about 30% speed up for mpc860!
@ 2003-03-02 17:50 Joakim Tjernlund
  2003-03-03 21:18 ` Dan Malek
  0 siblings, 1 reply; 27+ messages in thread
From: Joakim Tjernlund @ 2003-03-02 17:50 UTC
  To: linuxppc-dev; +Cc: drow


> > > I can't tell you what revs they were, but all of the MPC860's I could
> > > get my hands on here the last time I tried to use dcbz on them were
> > > faulty. You may just not be triggering the bug.
> >
> > hmm, what boards was this?
> > I am planning to do a larger test here with all our custom mpc860 and mpc862 boards. We have them in
> > 100, 80 and 50 MHz variants.
> >
> > Maybe the bug is related to board design? Is there an official erratum from Motorola
> > regarding this bug? I can't find any.
> >
> > Anyhow, I had a flaw in my test program, so you can throw this version of copy_page() away.
> > But enabling the use of dcbz in the current version still gives me a 30%+ performance increase.
> >
> > See the embedded list for details.
> >
> >  Jocke

I found a link that may be relevant to the dcbz problem. Can anybody confirm this?
http://www.uwsg.iu.edu/hypermail/linux/kernel/0012.0/0529.html

    Jocke

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/

* Improved copy_page() function, about 30% speed up for mpc860!
@ 2003-02-27 13:08 Joakim Tjernlund
  2003-02-27 15:45 ` Joakim Tjernlund
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Joakim Tjernlund @ 2003-02-27 13:08 UTC
  To: Linuxppc-Embedded@Lists. Linuxppc. Org


Hi all

I have been playing with the copy_page() function in arch/ppc/kernel/misc.S
and gained about 30% speed up for my mpc860, rev D4 MHz.

This is what I did:
- Use dcbz on 8xx, but clear one cache line ahead (performance is really crappy
  if I don't clear ahead). This is the biggest improvement.
- Use prefetch for 8xx as well.

I know that dcbz is buggy on some 8xx CPUs, but I don't know which ones.
For me it works just fine, except in copy_tofrom_user (I don't know why).

I would like to get some feedback & test results both for 8xx and non 8xx.
Please include exact CPU and revision.

 Thanks
         Jocke

_GLOBAL(copy_page)
	addi	r3,r3,-4
	addi	r4,r4,-4
	li	r5,4
#if MAX_COPY_PREFETCH > 1
	/* This will prefetch past end of page, does not seem to be a problem? */
	li	r0,MAX_COPY_PREFETCH
	li	r11,4
	mtctr	r0
11:	dcbt	r11,r4
	addi	r11,r11,L1_CACHE_LINE_SIZE
	bdnz	11b
#else /* MAX_COPY_PREFETCH == 1 */
	dcbt	r5,r4
	li	r11,L1_CACHE_LINE_SIZE+4
#endif /* MAX_COPY_PREFETCH */
	dcbz	r5,r3 /* older 8xx CPUs may have buggy dcbz instructions, if so try "dcbt r5,r3" instead */
	addi	r5,r5,L1_CACHE_LINE_SIZE
	li	r0,4096/L1_CACHE_LINE_SIZE-1 /* all but the last cache line, since the dcbz below runs one line ahead */
	mtctr	r0
1:
	dcbt	r11,r4
	dcbz	r5,r3 /* zero the cache line after the one that is being copied;
		       * older 8xx CPUs may have buggy dcbz instructions, if so try "dcbt r5,r3" instead */
	COPY_16_BYTES
#if L1_CACHE_LINE_SIZE >= 32
	COPY_16_BYTES
#if L1_CACHE_LINE_SIZE >= 64
	COPY_16_BYTES
	COPY_16_BYTES
#if L1_CACHE_LINE_SIZE >= 128
	COPY_16_BYTES
	COPY_16_BYTES
	COPY_16_BYTES
	COPY_16_BYTES
#endif
#endif
#endif
	bdnz	1b
/* Copy the last cache line of data */
	COPY_16_BYTES
#if L1_CACHE_LINE_SIZE >= 32
	COPY_16_BYTES
#if L1_CACHE_LINE_SIZE >= 64
	COPY_16_BYTES
	COPY_16_BYTES
#if L1_CACHE_LINE_SIZE >= 128
	COPY_16_BYTES
	COPY_16_BYTES
	COPY_16_BYTES
	COPY_16_BYTES
#endif
#endif
#endif
	blr
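
For readers who want to check the loop bookkeeping without PowerPC hardware, here is a minimal portable C model of the clear-ahead structure above. It is only a sketch, not the kernel code: zero_line() is a hypothetical stand-in for dcbz (a plain memset, so it shows none of the cache behaviour or the speed-up), copy_page_model() is an illustrative name, and the 16-byte line size is assumed here as the 8xx value of L1_CACHE_LINE_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed values for illustration only. */
#define PAGE_SIZE_BYTES 4096
#define LINE_BYTES      16      /* assumed L1_CACHE_LINE_SIZE on 8xx */

/* Hypothetical stand-in for dcbz: zero one destination line. */
static void zero_line(unsigned char *line)
{
    memset(line, 0, LINE_BYTES);
}

/*
 * Copy one page while zeroing the destination one line ahead of the
 * copy, mirroring the structure of the assembly loop. Returns the
 * number of lines that were zeroed.
 */
static size_t copy_page_model(unsigned char *dst, const unsigned char *src)
{
    const size_t lines = PAGE_SIZE_BYTES / LINE_BYTES;
    size_t zeroed = 0;
    size_t i;

    assert(dst != src);

    /* The dcbz issued before the loop: covers line 0. */
    zero_line(dst);
    zeroed++;

    /* Main loop: lines - 1 iterations, like the ctr value in r0. */
    for (i = 0; i + 1 < lines; i++) {
        zero_line(dst + (i + 1) * LINE_BYTES);  /* clear one line ahead */
        zeroed++;
        memcpy(dst + i * LINE_BYTES, src + i * LINE_BYTES, LINE_BYTES);
    }

    /* Last line: already zeroed by the final iteration, so only copy. */
    memcpy(dst + i * LINE_BYTES, src + i * LINE_BYTES, LINE_BYTES);
    return zeroed;
}
```

In this model every destination line is zeroed exactly once and nothing beyond the page is zeroed, which is why the main loop stops one line short and the last line is copied separately.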


** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/


Thread overview: 27+ messages
2003-03-02 17:50 Improved copy_page() function, about 30% speed up for mpc860! Joakim Tjernlund
2003-03-03 21:18 ` Dan Malek
2003-03-03 23:16   ` Joakim Tjernlund
2003-03-04  0:43     ` Dan Malek
2003-03-04  0:54       ` Daniel Jacobowitz
2003-03-04  3:38         ` Dan Malek
2003-03-04  8:29           ` Joakim Tjernlund
2003-03-04 13:33             ` Dan Malek
2003-03-04 15:24               ` Joakim Tjernlund
2003-03-04 17:00                 ` Dan Malek
2003-03-04 22:01                   ` Joakim Tjernlund
2003-03-04 22:41                     ` Dan Malek
2003-03-04 23:20                       ` Joakim Tjernlund
2003-03-04 23:35                     ` Tom Rini
2003-03-04 23:45                       ` Joakim Tjernlund
2003-03-05  0:05                         ` Tom Rini
2003-03-05  0:19                           ` Joakim Tjernlund
2003-03-05 17:12                             ` Tom Rini
2003-03-05 17:50                               ` Joakim Tjernlund
2003-03-05 17:15                       ` Dan Malek
     [not found]     ` <1046737789.885.15.camel@zion.wanadoo.fr>
2003-03-04  0:51       ` Dan Malek
  -- strict thread matches above, loose matches on Subject: below --
2003-02-27 13:08 Joakim Tjernlund
2003-02-27 15:45 ` Joakim Tjernlund
2003-02-28 17:31 ` Joakim Tjernlund
2003-03-03 21:28 ` Dan Malek
2003-03-04  0:09   ` Joakim Tjernlund
2003-03-04  0:19   ` Paul Mackerras
