All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@elte.hu>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vince Weaver <vweaver1@eecs.utk.edu>,
	William Cohen <wcohen@redhat.com>,
	Stephane Eranian <eranian@google.com>,
	Arun Sharma <asharma@fb.com>, Vince Weaver <vince@deater.net>,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH 0/6] perf: x86 RDPMC and RDTSC support
Date: Wed, 21 Dec 2011 14:15:11 +0100	[thread overview]
Message-ID: <20111221131511.GA31186@elte.hu> (raw)
In-Reply-To: <1324472337.10752.11.camel@twins>


* Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:

> > I used the mmap_read_self() routine from your example as the 
> > "read" performance that I measured.
> 
> Yeah that's about it, if you want to discard the overload 
> scenario, eg you use pinned counters or so, you can optimize 
> it further by stripping out the tsc and scaling muck.

With the TSC scaling muck it a bit above 50 cycles here.

Without that, by optimizing it further for pinned counters, the 
overhead of mmap_read_self() gets down to 36 cycles on a Nehalem 
box.

A PEBS profile run of it shows that 90% of the overhead is in 
the RDPMC instruction:

    2.92 :          42a919:       lea    -0x1(%rax),%ecx                                          ▒
         :                                                                                        ▒
         :        static u64 rdpmc(unsigned int counter)                                          ▒
         :        {                                                                               ▒
         :                unsigned int low, high;                                                 ▒
         :                                                                                        ▒
         :                asm volatile("rdpmc" : "=a" (low), "=d" (high) : "c" (counter));        ▒
   89.20 :          42a91c:       rdpmc                                                           ▒
         :                        count = pc->offset;                                             ▒
         :                        if (idx)                                                        ▒
         :                                count += rdpmc(idx - 1);                                ▒
         :                                                                                        ▒

So the RDPMC instruction alone is 32 cycles. The perf way of 
reading it adds another 4 cycles to it.

So the measured 'perf overhead' is so ridiculously low that it's 
virtually non-existent.

Here's "pinned events" variant i've measured:

static u64 mmap_read_self(void *addr)
{
        struct perf_event_mmap_page *pc = addr;
        u32 seq, idx;
        u64 count;

        do {
                seq = pc->lock;
                barrier();

                idx = pc->index;
                count = pc->offset;
                if (idx)
                        count += rdpmc(idx - 1);

                barrier();
        } while (pc->lock != seq);

        return count;
}

Thanks,

	Ingo

  reply	other threads:[~2011-12-21 14:33 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-21 14:51 [RFC][PATCH 0/6] perf: x86 RDPMC and RDTSC support Peter Zijlstra
2011-11-21 14:51 ` [RFC][PATCH 1/6] perf: Update the mmap control page on mmap() Peter Zijlstra
2011-11-21 14:51 ` [RFC][PATCH 2/6] perf, arch: Rework perf_event_index() Peter Zijlstra
2011-11-21 16:22   ` Eric B Munson
2011-11-21 17:23   ` Will Deacon
2011-11-21 19:18     ` Peter Zijlstra
2011-11-21 20:31       ` Will Deacon
2011-11-21 20:35         ` Peter Zijlstra
2011-11-21 22:43           ` Will Deacon
2011-11-22 11:26             ` Peter Zijlstra
2011-11-22 11:47               ` Will Deacon
2011-11-22 11:49                 ` Oleg Strikov
2011-11-22 11:52                   ` Will Deacon
2011-11-22 11:56                     ` Oleg Strikov
2011-11-22 12:00                     ` Oleg Strikov
2011-11-22 12:14                       ` Will Deacon
2011-11-22 12:25                         ` Oleg Strikov
2011-11-22 11:51                 ` Peter Zijlstra
2011-11-22 11:54                   ` Will Deacon
2011-11-22 11:48               ` Oleg Strikov
2011-11-21 14:51 ` [RFC][PATCH 3/6] perf, x86: Implement userspace RDPMC Peter Zijlstra
2011-11-21 14:51 ` [RFC][PATCH 4/6] perf, x86: Provide means of disabling " Peter Zijlstra
2011-11-21 14:51 ` [RFC][PATCH 5/6] perf: Extend the mmap control page with time (TSC) fields Peter Zijlstra
2011-12-28 17:55   ` Stephane Eranian
2011-11-21 14:51 ` [RFC][PATCH 6/6] perf, tools: X86 RDPMC, RDTSC test Peter Zijlstra
2011-11-21 15:29   ` Stephane Eranian
2011-11-21 15:37     ` Peter Zijlstra
2011-11-21 16:59       ` Peter Zijlstra
2011-11-21 17:42         ` Stephane Eranian
2011-11-21 15:02 ` [RFC][PATCH 0/6] perf: x86 RDPMC and RDTSC support Vince Weaver
2011-11-21 16:05   ` William Cohen
2011-11-21 16:08   ` William Cohen
2011-12-02 19:26 ` Arun Sharma
2011-12-02 22:22   ` Stephane Eranian
2011-12-05 20:16     ` Arun Sharma
2011-12-05 23:17       ` Arun Sharma
2011-12-06  1:38         ` Stephane Eranian
2011-12-06  9:42         ` Peter Zijlstra
2011-12-06 21:53           ` Arun Sharma
2011-12-16 22:36 ` Vince Weaver
2011-12-21 12:58   ` Peter Zijlstra
2011-12-21 13:15     ` Ingo Molnar [this message]
2011-12-23 20:12       ` Vince Weaver
2011-12-21 15:04     ` Vince Weaver
2011-12-21 21:32       ` Vince Weaver
2011-12-21 21:41         ` Peter Zijlstra
2011-12-21 22:19           ` Vince Weaver
2011-12-21 22:32             ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111221131511.GA31186@elte.hu \
    --to=mingo@elte.hu \
    --cc=a.p.zijlstra@chello.nl \
    --cc=asharma@fb.com \
    --cc=eranian@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=vince@deater.net \
    --cc=vweaver1@eecs.utk.edu \
    --cc=wcohen@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.