From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:46477)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alex.bennee@linaro.org>) id 1XEGZ2-00015k-Ca
	for qemu-devel@nongnu.org; Mon, 04 Aug 2014 07:36:33 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <alex.bennee@linaro.org>) id 1XEGYx-0003Bo-8h
	for qemu-devel@nongnu.org; Mon, 04 Aug 2014 07:36:28 -0400
Received: from static.88-198-71-155.clients.your-server.de
	([88.198.71.155]:36725 helo=socrates.bennee.com)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <alex.bennee@linaro.org>) id 1XEGYx-0003Bj-1o
	for qemu-devel@nongnu.org; Mon, 04 Aug 2014 07:36:23 -0400
References: <1406733627-24255-1-git-send-email-alex.bennee@linaro.org>
	<53DBEC17.9040009@redhat.com> <87a97klf7p.fsf@linaro.org>
From: Alex =?utf-8?Q?Benn=C3=A9e?= <alex.bennee@linaro.org>
Date: Mon, 04 Aug 2014 12:34:22 +0100
In-reply-to: <87a97klf7p.fsf@linaro.org>
Message-ID: <878un4lc5d.fsf@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH v2 0/5] AArch64 TLB performance improvements
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: peter.maydell@linaro.org, Xin Tong <trent.tong@gmail.com>, qemu-devel@nongnu.org


Alex Bennée writes:

> Paolo Bonzini writes:
>
>> On 30/07/2014 17:20, Alex Bennée wrote:
>>> Hi,
>>> 
> <snip>
>>> The most important thing is I've measured a 25-30% improvement in
>>> kernel and android boot time.
>>> 
> <snip>
>> Hi Alex, have you seen this patch?  Perhaps you're interested in
>> reviving it.
>>
>> http://article.gmane.org/gmane.comp.emulators.qemu/253864
>
> I saw it when it first came out but I didn't quite follow what it was
> doing as I hadn't looked at the TLB code. I'll have another look and see
> what difference it can make.

A quick and dirty benchmark:

**** Comparing 10 bit/12 bit tables with and without [[http://article.gmane.org/gmane.comp.emulators.qemu/253864][victim cache]]

#+BEGIN_NOTES
Times are in seconds; smaller is better.
%prev is the average time as a percentage of the average in the column to its left.
#+END_NOTES

| Code  |   10 bit | 10 bit + victim |    12 bit | 12 bit + victim |
|-------+----------+-----------------+-----------+-----------------|
|       |   12.783 |          11.664 |    10.348 |           9.527 |
| Runs  |   13.046 |          11.971 |    10.123 |           9.326 |
|       |   12.929 |          11.673 |    11.130 |           9.858 |
|       |   12.981 |          11.941 |    10.223 |           9.673 |
|-------+----------+-----------------+-----------+-----------------|
| Avgs  | 12.93475 |        11.81225 |    10.456 |           9.596 |
|-------+----------+-----------------+-----------+-----------------|
| %prev |     100% |          91.32% |    88.52% |          91.78% |
#+TBLFM: $2=vmean(@I..II)::$3=(@II$3/@II$2)*100::$4=vmean(@I..II)::$5=vmean(@I..II)
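
For anyone who wants to re-check the arithmetic outside org-mode, here is a minimal sketch that recomputes the averages and the %prev row from the run times above. The names (runs, avgs, percent_of) are only illustrative. It also compares against the plain 10 bit baseline, which the table itself does not show.

```python
# Run times in seconds, copied from the table above.
runs = {
    "10 bit": [12.783, 13.046, 12.929, 12.981],
    "10 bit + victim": [11.664, 11.971, 11.673, 11.941],
    "12 bit": [10.348, 10.123, 11.130, 10.223],
    "12 bit + victim": [9.527, 9.326, 9.858, 9.673],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Average run time per configuration (the "Avgs" row).
avgs = {name: mean(times) for name, times in runs.items()}


def percent_of(name, baseline):
    """Average time of `name` as a percentage of the average of `baseline`."""
    return 100.0 * avgs[name] / avgs[baseline]


if __name__ == "__main__":
    names = list(runs)
    # The "%prev" row: each column relative to the column on its left.
    for prev, cur in zip(names, names[1:]):
        print(f"{cur} vs {prev}: {percent_of(cur, prev):.2f}%")
    # Relative to the plain 10 bit baseline instead.
    for name in names[1:]:
        print(f"{name} vs 10 bit: {percent_of(name, '10 bit'):.2f}%")
```

Measured against the 10 bit baseline rather than the neighbouring column, the 12 bit table alone comes out at roughly 81% of the run time, and 12 bit plus the victim cache at roughly 74%.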

As you would expect, this shows that the larger TLB table gives the
greater improvement in performance, but the victim cache also improves
the run time on top of that.

I say "as you would expect" because any time you need to exit translated
code there is a fair amount of overhead in doing so.
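
To make the two effects concrete, here is a toy model of a direct-mapped table with an optional small victim cache. It is purely a hypothetical sketch: ToyTLB and everything in it are invented for illustration, it is neither QEMU's softmmu code nor the linked patch, and the hit/miss counts say nothing about real timings. A full miss stands in for the expensive path out of translated code.

```python
from collections import OrderedDict


class ToyTLB:
    """Hypothetical direct-mapped table of 2**bits entries with an
    optional victim cache holding recently evicted pages."""

    def __init__(self, bits, victim_entries=0):
        self.mask = (1 << bits) - 1
        self.table = {}               # index -> page currently resident
        self.victim = OrderedDict()   # evicted pages, oldest first
        self.victim_entries = victim_entries
        self.fast_hits = 0
        self.victim_hits = 0
        self.misses = 0

    def lookup(self, page):
        index = page & self.mask
        resident = self.table.get(index)
        if resident == page:
            # Fast path: the entry is in the main table.
            self.fast_hits += 1
            return "fast"
        if page in self.victim:
            # Cheaper than a full miss: recover the entry from the victim cache.
            del self.victim[page]
            self.victim_hits += 1
            outcome = "victim"
        else:
            # Full miss: stands in for the expensive slow path.
            self.misses += 1
            outcome = "miss"
        if resident is not None and self.victim_entries:
            # Keep the evicted entry around in case it is needed again soon.
            self.victim[resident] = None
            if len(self.victim) > self.victim_entries:
                self.victim.popitem(last=False)
        self.table[index] = page
        return outcome


def run(tlb, pages):
    """Feed a sequence of page numbers through the toy TLB."""
    for page in pages:
        tlb.lookup(page)
    return tlb


if __name__ == "__main__":
    # Pages 0 and 4 share an index in a 2 bit table but not in a 3 bit one.
    pattern = [0, 4, 0, 4]
    for bits, victims in [(2, 0), (2, 2), (3, 0)]:
        t = run(ToyTLB(bits, victims), pattern)
        print(bits, victims, t.fast_hits, t.victim_hits, t.misses)
```

In this toy, two pages that alias in the small table miss on every access. The victim cache turns the repeat accesses into cheaper victim hits, while the larger table avoids the conflict altogether so they become fast-path hits.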

-- 
Alex Bennée