From: Ingo Molnar <mingo@elte.hu>
To: Zachary Amsden <zach@vmware.com>
Cc: Nick Piggin <npiggin@suse.de>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
"hpa@zytor.com" <hpa@zytor.com>,
"jeremy@xensource.com" <jeremy@xensource.com>,
"chrisw@sous-sol.org" <chrisw@sous-sol.org>,
"rusty@rustcorp.com.au" <rusty@rustcorp.com.au>
Subject: Re: lmbench lat_mmap slowdown with CONFIG_PARAVIRT
Date: Tue, 20 Jan 2009 20:31:28 +0100
Message-ID: <20090120193128.GA21481@elte.hu>
In-Reply-To: <1232478318.16317.160.camel@bodhitayantram.eng.vmware.com>
* Zachary Amsden <zach@vmware.com> wrote:
> On Tue, 2009-01-20 at 03:26 -0800, Ingo Molnar wrote:
>
> > Jeremy, any ideas where this slowdown comes from and how it could be
> > fixed?
>
> Well I'm early responding to this thread before reading on, but I looked
> at the generated assembly for some common mm paths and it looked awful.
> The biggest loser was probably having functions to convert pte_t back
> and forth to pteval_t, which makes most potential mask / shift
> optimizations impossible - indeed, because the compiler doesn't even
> understand pte_val(X) = Y is static over the lifetime of the function,
> it often calls these same conversions back and forth several times, and
> because this is often done inside hidden macros, it's not even possible
> to save a cached value in most places.
>
> The bulk of state required to keep this extra conversion around ties up
> a lot of registers and as a result heavily limits potential further
> optimizations.
>
> The code did not look more branchy to me, however, and gcc seemed to do
> a good job with lining up a nice branch structure in the few paths I
> looked at.
i've extended my mmap test with branch execution hw-perfcounter stats:
 -----------------------------------------------
 | Performance counter stats for './mmap-perf' |
 -----------------------------------------------
 |                |
 |  x86-defconfig |  PARAVIRT=y
 |------------------------------------------------------------------
 |
 |    1311.554526 |  1360.624932  task clock ticks (msecs)   +3.74%
 |                |
 |              1 |            1  CPU migrations
 |             91 |           79  context switches
 |          55945 |        55943  pagefaults
 |  ............................................
 |     3781392474 |   3918777174  CPU cycles                 +3.63%
 |     1957153827 |   2161280486  instructions              +10.43%
 |       50234816 |     51303520  cache references           +2.12%
 |        5428258 |      5583728  cache misses               +2.86%
 |
 |      437983499 |    478967061  branches                   +9.36%
 |       32486067 |     32336874  branch-misses              -0.46%
 |                |
 |    1314.782469 |  1363.694447  time elapsed (msecs)       +3.72%
 |                |
 -----------------------------------
So we execute 9.36% more branches - i.e. the branch count is very
noticeably higher as well. The CPU predicts them slightly more
effectively though: the -0.46% for branch-misses is well above the
measurement noise (~0.02% for the branch metric), so it's a systematic
effect.
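[ Editorial sketch, not part of the original measurement: the branch figures above can be re-derived from the raw counters in the table. The helper names pct_delta and miss_rate are made up for this illustration; only the four counter values come from the table. It shows that while the branch count grows, the share of mispredicted branches shrinks. ]

```python
# Counter values copied from the table: (x86-defconfig, PARAVIRT=y).
branches = (437983499, 478967061)
branch_misses = (32486067, 32336874)


def pct_delta(base, new):
    """Relative change of `new` versus `base`, in percent (hypothetical helper)."""
    return (new - base) / base * 100.0


def miss_rate(misses, total):
    """Share of executed branches that were mispredicted, in percent."""
    return misses / total * 100.0


branch_delta = pct_delta(*branches)
miss_delta = pct_delta(*branch_misses)
rate_base = miss_rate(branch_misses[0], branches[0])
rate_pv = miss_rate(branch_misses[1], branches[1])

print(f"branches:      {branch_delta:+.2f}%")
print(f"branch-misses: {miss_delta:+.2f}%")
print(f"miss rate:     {rate_base:.2f}% -> {rate_pv:.2f}%")
```

The miss rate (misses divided by branches) is lower with PARAVIRT=y, which is consistent with the extra branches being mostly easy to predict.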
Non-functional 'boring' bloat tends to be easier to predict so it's not
necessarily a real surprise. That also explains why despite +10.43% more
instructions the total cycle count went up by a comparatively smaller
+3.63%.
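[ Editorial sketch, not part of the original measurement: the instructions-versus-cycles argument can be checked by computing instructions per cycle (IPC) from the table. The "marginal" figure below simply divides the extra cycles by the extra instructions; attributing all extra cycles to the extra instructions is a simplifying assumption made here, not something the counters prove. ]

```python
# Counter values copied from the table: (x86-defconfig, PARAVIRT=y).
cycles = (3781392474, 3918777174)
instructions = (1957153827, 2161280486)

# Instructions retired per CPU cycle for each kernel configuration.
ipc_base = instructions[0] / cycles[0]
ipc_pv = instructions[1] / cycles[1]

# Average cost of one instruction on the baseline kernel, in cycles.
cpi_base = cycles[0] / instructions[0]

# Simplifying assumption: charge every extra cycle to the extra instructions.
extra_insns = instructions[1] - instructions[0]
extra_cycles = cycles[1] - cycles[0]
marginal_cpi = extra_cycles / extra_insns

print(f"IPC:          {ipc_base:.3f} -> {ipc_pv:.3f}")
print(f"baseline CPI: {cpi_base:.2f} cycles/instruction")
print(f"marginal CPI: {marginal_cpi:.2f} cycles per extra instruction")
```

Under that assumption the added instructions cost well under half as many cycles each as the baseline average, so IPC goes up even though the total run gets slower.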
[ that's 64-bit x86 btw. ]
Ingo