Re: [Linux-ia64] Strange performance monitoring results

Linux IA64 platform development

From: Stephane Eranian <eranian@frankl.hpl.hp.com>
To: linux-ia64@vger.kernel.org
Subject: Re: [Linux-ia64] Strange performance monitoring results
Date: Thu, 16 May 2002 16:31:34 +0000
Message-ID: <marc-linux-ia64-105590701905582@msgid-missing>
In-Reply-To: <marc-linux-ia64-105590701905566@msgid-missing>

[-- Attachment #1: Type: text/plain, Size: 2006 bytes --]

Matt,

On Fri, May 10, 2002 at 07:20:06PM +1000, Matt Chapman wrote:
> * Linux 2.4.18-ia64-020508 (CONFIG_PERFMON, !CONFIG_DISABLE_VHPT)
> * pfmon 1.0
> * Uniprocessor Itanium C1-step 
> * lat_ctx from LMbench 2.0p2 (ftp://ftp.bitmover.com/lmbench/)
> 
> (Though I get the same results with 2.4.16 and pfmon 0.06a.)
> 
> % pfmon -e ITLB_MISSES_FETCH,ITLB_INSERTS_HPW ./lat_ctx 5
> 
> "size=0k ovr=2.65
> 5 2.52
>                221400 ITLB_MISSES_FETCH
>                   133 ITLB_INSERTS_HPW
> 
> The ITLB misses figure seems much too big, especially given the number
> of hardware pagetable walker inserts is low.  Every few times I also get
> very big figures for DTLB_MISSES, although not DTC_MISSES (I would have
> thought DTLB_MISSES should be less than DTC_MISSES?).
> 
> Am I doing something wrong?
> 

I was able to reproduce what you are seeing. In fact, I tried measuring
the same event using a different program. The LMbench test involves
several processes competing for the CPU. I used a single process instead.
I verified with the knowledgeable people that the counter is not bogus.
However, it does count more than what you'd expect. It counts all the
detected ITLB misses (including demand fetches), but not all of them
end up with a translation being inserted, because some get cancelled. This
can happen because of prefetching and branch prediction. So if a branch is
mispredicted, the ITLB misses generated by the (wrong) prediction will get
cancelled, but they are still counted.

The attached test program stresses the TLB by having one function
per page. It involves an indirect branch which (most likely) is always
mispredicted. Now if you increase the number of iterations with a constant
(and small) number of functions called, you see the ITLB_MISSES_FETCH count
increase linearly. If you modify the assembly code and try to avoid
the misprediction with a hinted mov to br (mov.sptk.imp), then you suddenly
see the ITLB_MISSES_FETCH count remain constant.
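
To compare runs with different arguments, it may help to normalize the
counter by the number of indirect calls the test performs. The following
is only a sketch derived from the loop in doit() of the attached program,
assuming the table holds 64 functions as in the attachment. The names
NFUNCS, indirect_calls and misses_per_call are hypothetical helpers and
are not part of the attachment:

```c
#define NFUNCS 64UL /* assumed: number of non-NULL entries in tab[] */

/*
 * Number of indirect calls made by doit(iter, max): the inner loop
 * stops at max or at the NULL terminator, whichever comes first.
 */
unsigned long
indirect_calls(unsigned long iter, unsigned int max)
{
	unsigned long per_iter = max < NFUNCS ? max : NFUNCS;

	return iter * per_iter;
}

/* Counter value divided by the number of indirect calls (0 if none). */
double
misses_per_call(unsigned long misses, unsigned long iter, unsigned int max)
{
	unsigned long calls = indirect_calls(iter, max);

	return calls ? (double)misses / (double)calls : 0.0;
}
```

For instance, indirect_calls(10000, 8) is 80000. If the counter grows
linearly with the iteration count, the ratio from misses_per_call()
should stay roughly the same from one run to the next.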

Hope this helps.

-- 
-Stephane

[-- Attachment #2: itlb_test.c --]
[-- Type: text/plain, Size: 2180 bytes --]

#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>	/* getpagesize(), _exit() */

typedef struct {
	unsigned long addr;
	unsigned long gp;
} func_desc_t;

#define FUNC(n) int func_##n(void) { return n; }

FUNC(1) FUNC(2) FUNC(3) FUNC(4) FUNC(5) FUNC(6) FUNC(7) FUNC(8) FUNC(9)
FUNC(10) FUNC(11) FUNC(12) FUNC(13) FUNC(14) FUNC(15) FUNC(16) FUNC(17) FUNC(18) FUNC(19)
FUNC(20) FUNC(21) FUNC(22) FUNC(23) FUNC(24) FUNC(25) FUNC(26) FUNC(27) FUNC(28) FUNC(29)
FUNC(30) FUNC(31) FUNC(32) FUNC(33) FUNC(34) FUNC(35) FUNC(36) FUNC(37) FUNC(38) FUNC(39)
FUNC(40) FUNC(41) FUNC(42) FUNC(43) FUNC(44) FUNC(45) FUNC(46) FUNC(47) FUNC(48) FUNC(49)
FUNC(50) FUNC(51) FUNC(52) FUNC(53) FUNC(54) FUNC(55) FUNC(56) FUNC(57) FUNC(58) FUNC(59)
FUNC(60) FUNC(61) FUNC(62) FUNC(63) FUNC(64)

static int (*tab[])(void)={
	func_1, func_2, func_3, func_4, func_5, func_6, func_7, func_8, func_9,
	func_10, func_11, func_12, func_13, func_14, func_15, func_16, func_17, func_18, func_19,
	func_20, func_21, func_22, func_23, func_24, func_25, func_26, func_27, func_28, func_29,
	func_30, func_31, func_32, func_33, func_34, func_35, func_36, func_37, func_38, func_39,
	func_40, func_41, func_42, func_43, func_44, func_45, func_46, func_47, func_48, func_49,
	func_50, func_51, func_52, func_53, func_54, func_55, func_56, func_57, func_58, func_59,
	func_60, func_61, func_62, func_63, func_64,
	NULL
};

int 
doit(unsigned long iter, unsigned int max)
{
	unsigned int sum = 0, i, j;
	int (**pf)(void);

	for(j=0; j < iter; j++) {
		for(i=0, pf = tab; i < max && *pf; i++, pf++) {
			sum += (**pf)();
		}
	}
	return sum; /* ensures the compiler does not get rid of everything */
}


int 
main(int argc, char **argv)
{
	func_desc_t *fd1, *fd2;
	int pgsz;
	unsigned long iter;
	unsigned int max;

	pgsz = getpagesize();

	/* on ia64, a function pointer points to a descriptor (entry address, gp) */
	fd1 = (func_desc_t *)func_1;
	fd2 = (func_desc_t *)func_2;

	if ((fd2->addr - fd1->addr) != (unsigned long)pgsz) {
		printf("the program was not compiled with -falign-functions=%d\n", pgsz);
		exit(1);
	}

	iter = argc > 1 ? strtoul(argv[1], NULL, 10) : 10000;
	max  = argc > 2 ? atoi(argv[2]) : -1;	/* -1 wraps to UINT_MAX: call every function */

	doit(iter, max);
	_exit(0); /* short circuit libc exit() */
}

