From: Paul Mackerras <paulus@cs.anu.edu.au>
To: rth@cygnus.com
Cc: Jes.Sorensen@cern.ch, Geert.Uytterhoeven@cs.kuleuven.ac.be,
	linuxppc-dev@lists.linuxppc.org, linux-fbdev@vuser.vu.union.edu
Subject: Re: [linux-fbdev] Re: readl() and friends and eieio on PPC
Date: Thu, 12 Aug 1999 17:07:02 +1000
Message-ID: <199908120707.RAA30438@tango.anu.edu.au>
In-Reply-To: <19990811224344.A14713@cygnus.com> (message from Richard Henderson on Wed, 11 Aug 1999 22:43:44 -0700)


Richard Henderson <rth@cygnus.com> wrote:

> As I see it, testing against main memory should be the lower
> bound of the numbers, since it's the quickest to respond.  A
> real device will take longer to respond, so any enforced delays
> (or failures to write-combine) will only exaggerate the difference.

Hmmm, no, doesn't it go the other way around?

Going to L1 cache will mean that we can isolate the overhead of the
wmb, and will exaggerate the ratio between the two cases.

A real device that takes longer to respond will make the overhead of
the wmb a smaller fraction of the total time.  And you would hope that
the cpu could overlap the wmb, or at least the time to decode and
issue it, with the time waiting for the device to respond.

> Anyway, the results (in cycles) from my 533MHz sx164 are:
> 
> 10

One-cycle access to L1 cache, I guess?

> 10
> 10
> 10
> 10
> 223

Because of i-cache misses, presumably.

> 94
> 94
> 94
> 94
> 
> So the cost of wmb for 8 store+wmb, versus 8 stores with one wmb,
> is over 9:1.

Interesting.  Sounds like each wmb takes about 12 cycles ((94-10)/7),
which sounds a bit like it is going all the way out to the memory bus
and back before the cpu does the next instruction.

(Ob. nitpicking: if a wmb takes 12 cycles, how come we can do a wmb
and 8 stores in 10 cycles? :-)

> For grins, will you try the same test on your ppc?

Sure, happy to.

I think I have correctly understood the alpha assembly syntax.  My PPC
version is below.  I've added a couple of things.  First, PPC has a
`timebase' register which counts at 1/4 of the bus clock, which means
it ticks once every 16 cpu cycles on my G3 desktop at work.  For this
reason I have put a loop around the sets of stores to do them 16 times.
The overhead of the loop should be zero (the branch is pretty easily
predictable :-).  The numbers should thus be cpu cycles per iteration
of 8 stores.

Secondly, I added stuff to mmap a framebuffer and do the stores to a
word in it, just for grins.

The results tended to vary quite a lot from run to run, but here's a
typical set:

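main memory, one eieio per 8 stores:     17  10   9   9   9
main memory, eieio after every store:    24  17  16  16  16
framebuffer, one eieio per 8 stores:    732 731 736 786 727
framebuffer, eieio after every store:   666 755 840 774 801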

So the eieio doesn't look to be nearly as expensive on PPC as wmb is
on alpha.  (16 - 9) / 7 = 1 cycle for the eieio, which is going to be
insignificant in the context of an access to a device register, which
can easily take ~ 50 to 100 cycles.

The average of the 3rd line is 742, and of the 4th line is 767.  But
given the spread of the numbers, I don't think that the difference is
statistically significant.  This is going to the framebuffer on an ATI
Rage chip.  About 760 cycles per iteration of 8 stores is 95 cpu cycles
per access, or about 350ns.  I guess ATI chips expect you to use the
drawing engine if you are doing any significant amount of stuff. :-)

What numbers do you get on alpha if you point it at a framebuffer,
just for interest?

Paul.

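#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>	/* open(), O_RDWR */
#include <sys/mman.h>

/*
 * Time 16 iterations of 8 stores to *ptr, five runs per variant.
 * The count register is loaded with 16 and bdnz loops on it.
 * "stw 16,%2" stores whatever is in r16; the value is irrelevant.
 * The printed timebase delta is cpu cycles per iteration only on a
 * machine where the timebase ticks once per 16 cpu cycles.
 */
void test(unsigned long *ptr)
{
  int i;
  unsigned s, e;

  /* Variant 1: 8 stores with a single eieio before the last one. */
  for (i = 0; i < 5; ++i)
    {
      asm volatile("mftb %0\n\t"
		   "mtctr %3\n"
		   "1:\tstw 16,%2\n\t"
		   "stw 16,%2\n\t"
		   "stw 16,%2\n\t"
		   "stw 16,%2\n\t"
		   "stw 16,%2\n\t"
		   "stw 16,%2\n\t"
		   "stw 16,%2\n\t"
		   "eieio\n\t"
		   "stw 16,%2\n\t"
		   "bdnz 1b\n\t"
		   "mftb %1"
	: "=&r"(s), "=r"(e), "=m"(*ptr)	/* s is written before %3 is read */
	: "r"(16)
	: "ctr");
      printf("%u ", e-s);
    }
  printf("\n");

  /* Variant 2: an eieio after every one of the 8 stores. */
  for (i = 0; i < 5; ++i)
    {
      asm volatile("mftb %0\n\t"
		   "mtctr %3\n"
		   "1:\tstw 16,%2\n\t"
		   "eieio\n\t"
		   "stw 16,%2\n\t"
		   "eieio\n\t"
		   "stw 16,%2\n\t"
		   "eieio\n\t"
		   "stw 16,%2\n\t"
		   "eieio\n\t"
		   "stw 16,%2\n\t"
		   "eieio\n\t"
		   "stw 16,%2\n\t"
		   "eieio\n\t"
		   "stw 16,%2\n\t"
		   "eieio\n\t"
		   "stw 16,%2\n\t"
		   "eieio\n\t"
		   "bdnz 1b\n\t"
		   "mftb %1"
	: "=&r"(s), "=r"(e), "=m"(*ptr)
	: "r"(16)
	: "ctr");
      printf("%u ", e-s);
    }
  printf("\n");
}

#define PAGESIZE	0x1000UL

int main(int ac, char **av)
{
	unsigned long base, offset;
	int fd;
	unsigned long mem;
	unsigned long *ptr;

	test(&mem);		/* ordinary cacheable memory */
	if (ac > 1) {
		/* av[1]: physical address in hex, e.g. of a framebuffer */
		base = strtoul(av[1], 0, 16);
		offset = (base & (PAGESIZE - 1)) / sizeof(unsigned long);
		base &= ~(PAGESIZE - 1);
		if ((fd = open("/dev/mem", O_RDWR)) < 0) {
			perror("/dev/mem");
			exit(1);
		}
		ptr = mmap(0, PAGESIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, base);
		if (ptr == MAP_FAILED) {
			perror("mmap");
			exit(1);
		}
		test(ptr + offset);
	}
	return 0;
}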
