From: bert hubert <ahu@ds9a.nl>
To: Mikael Pettersson <mikpe@csd.uu.se>
Cc: linux-kernel@vger.kernel.org
Subject: small perfctr bug or misunderstanding
Date: Sat, 3 Jul 2004 16:08:29 +0200
Message-ID: <20040703140829.GA13241@outpost.ds9a.nl>
In-Reply-To: <200407031028.i63AS9W3018392@harpo.it.uu.se>

On Sat, Jul 03, 2004 at 12:28:09PM +0200, Mikael Pettersson wrote:

> There would be a /proc/<pid>/<tid>/perfctr/ directory
> with files representing the control data, counter
> state, general info, and auxiliary control ops.

Mikael, thanks for the low-level-api.txt documentation. Will the vperfctr_*
functions get documentation too? Want me to whip up manpages?

Perfctr has already been very useful to me - I now know that parts of
PowerDNS are completely memory bound, which I had only suspected so far.
Are the global counters available? A note in the perfctr distribution says
they aren't.

One thing - on my Pentium M I'm unable to get more than one counter going
simultaneously; I get 'Operation not permitted'. Perfex reports that two
should be possible:

PerfCtr Info:
abi_version		0x06000500
driver_version		2.7.3
cpu_type		14 (Intel Pentium M)
cpu_features		0x3 (rdpmc,rdtsc)
cpu_khz			1399252
tsc_to_cpu_mult		1
cpu_nrctrs		2
cpus			[0], total: 1
cpus_forbidden		[], total: 0

PERFCTR INIT: vendor 0, family 6, model 9, stepping 5, clock 1399252 kHz
PERFCTR INIT: NITER == 64
PERFCTR INIT: loop overhead is 118 cycles
PERFCTR INIT: rdtsc cost is 48.5 cycles (3223 total)
PERFCTR INIT: rdpmc cost is 45.4 cycles (3027 total)
PERFCTR INIT: rdmsr (counter) cost is 95.4 cycles (6229 total)
PERFCTR INIT: rdmsr (evntsel) cost is 81.3 cycles (5322 total)
PERFCTR INIT: wrmsr (counter) cost is 143.7 cycles (9318 total)
PERFCTR INIT: wrmsr (evntsel) cost is 132.3 cycles (8591 total)
PERFCTR INIT: read cr4 cost is 3.0 cycles (311 total)
PERFCTR INIT: write cr4 cost is 49.8 cycles (3308 total)
perfctr: driver 2.7.3, cpu type Intel P6 at 1399252 kHz
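
The per-operation costs in the PERFCTR INIT lines look consistent with
(total - loop overhead) / NITER. That relation is inferred from the numbers
above, not taken from the driver source, and the helper name costPerOp is
hypothetical. A minimal sketch:

```cpp
#include <cassert>

// Hypothetical helper: average cost of one operation, assuming the driver
// times NITER operations in a loop and subtracts the empty-loop overhead.
// The formula is an assumption inferred from the log, not the driver's code.
constexpr double costPerOp(unsigned total, unsigned loopOverhead, unsigned niter)
{
  return static_cast<double>(total - loopOverhead) / niter;
}

// Compare against two lines of the log (NITER == 64, loop overhead 118).
inline void checkAgainstLog()
{
  // rdtsc: 3223 total, log prints 48.5 cycles
  assert(costPerOp(3223, 118, 64) > 48.5 && costPerOp(3223, 118, 64) < 48.6);
  // read cr4: 311 total, log prints 3.0 cycles
  assert(costPerOp(311, 118, 64) > 3.0 && costPerOp(311, 118, 64) < 3.1);
}
```

With the values from the log this gives about 48.5 cycles for rdtsc and
about 3.0 cycles for the cr4 read, matching the printed figures.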

On my Athlon, 4 are reported possible and 4 work just fine. But I might be
misunderstanding the Intel docs.

The code below works fine when the second counter is commented out:

#include <iostream>
using namespace std;
extern "C" {
#include "libperfctr.h"
}
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "arch.h"

class PerfCtr
{
public:
  PerfCtr()
  {
    d_self = vperfctr_open();
    if( !d_self ) {
      perror("vperfctr_open");
      exit(1);
    }

    // Zero the whole control block, not just cpu_control, so that no
    // field is passed to the driver uninitialized.
    memset(&d_control, 0, sizeof(d_control));
    d_control.cpu_control.tsc_on=1;
  }

  // v is the event code, unit the unit mask (evntsel bits 8-15).
  void addCounter(unsigned int v, unsigned int unit=0)
  {
    int count=d_control.cpu_control.nractrs;

    // bit 16: count in user mode (USR), bit 22: enable (EN)
    d_control.cpu_control.evntsel[count] = v | (1 << 16) | (1 << 22) | (unit << 8);
    d_control.cpu_control.pmc_map[count] = count;
    d_control.cpu_control.nractrs++; // no support for .nrictrs
  }

  void go()
  {
    if(vperfctr_control(d_self, &d_control) < 0) {
      perror("vperfctr_control");
      exit(1);
    }
    zero();
  }

  void zero()
  {
    memset(&d_baseline,0,sizeof(d_baseline));
    vperfctr_read_ctrs(d_self, &d_baseline);
  }

  ~PerfCtr()
  {
    vperfctr_close(d_self);
  }

  // Stores the counter and TSC deltas since the last zero().
  // counters must have room for nractrs entries.
  void get(long long* counters, long long& tsc)
  {
    struct perfctr_sum_ctrs now;
    memset(&now,0,sizeof(now));
    if(vperfctr_read_ctrs(d_self, &now) < 0) {
      perror("read counters");
      exit(1);
    }

    for(unsigned int n=0;n<d_control.cpu_control.nractrs;++n)
      counters[n]=now.pmc[n] - d_baseline.pmc[n];

    tsc=now.tsc - d_baseline.tsc;
  }

private:
  struct vperfctr *d_self;
  struct vperfctr_control d_control;
  struct perfctr_sum_ctrs d_baseline;
};


int main()
{
  PerfCtr pc;
  pc.addCounter(0x48); // DCU MISS OUTSTANDING
  pc.addCounter(0x43); // DATA_MEM_REFS

  pc.go();

  long long results[2], tsc;
  pc.get(results,tsc);

  cout<<"Cycles waiting on DCU miss:  "<<results[0]<<endl;
  cout<<"Number of memory references: "<<results[1]<<endl;
  cout<<"Cycles spent:                "<<tsc<<endl;
}
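
For reference, the value built in addCounter() can be spelled out with named
bits. This is a minimal sketch assuming the P6-family event-select layout
(event code in bits 0-7, unit mask in bits 8-15, USR at bit 16, EN at bit 22);
the constant and function names are hypothetical and not part of libperfctr.
One reading of the Intel documentation is that the EN flag is only defined in
the first event-select register on P6-family CPUs, so the setEnable parameter
lets the caller leave it out for the second counter. Whether that is what the
driver objects to here is an unverified assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; bit positions assume the P6-family event-select layout.
constexpr std::uint32_t kEvntselUsr    = 1u << 16; // count while in user mode
constexpr std::uint32_t kEvntselEnable = 1u << 22; // enable counting

// Compose an event-select value from an event code and a unit mask.
constexpr std::uint32_t makeEvntsel(std::uint32_t event, std::uint32_t unitMask,
                                    bool setEnable)
{
  return (event & 0xffu) | ((unitMask & 0xffu) << 8) | kEvntselUsr
         | (setEnable ? kEvntselEnable : 0u);
}

inline void checkEvntselExamples()
{
  // Same value that addCounter(0x48) produces above.
  assert(makeEvntsel(0x48, 0, true) == 0x410048u);
  // Variant for a second counter without the EN bit (assumption, see above).
  assert(makeEvntsel(0x43, 0, false) == 0x010043u);
}
```

Trying makeEvntsel(0x43, 0, false) for the second counter would be one way to
test the assumption about the EN bit.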



-- 
http://www.PowerDNS.com      Open source, database driven DNS Software 
http://lartc.org           Linux Advanced Routing & Traffic Control HOWTO
