From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <3EF1E5EB.4060204@free.fr>
Date: Thu, 19 Jun 2003 18:33:47 +0200
From: Fabrice Bellard <fabrice.bellard@free.fr>
MIME-Version: 1.0
Subject: Re: [Qemu-devel] tests with simulated memory
References: <20030619170256.6ee32716.jrydberg@night.trouble.net>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Reply-To: qemu-devel@nongnu.org
List-Id: <qemu-devel.nongnu.org>
To: qemu-devel@nongnu.org

A few days ago I was looking for a benchmark so that I could publish 
objective QEMU results (gzip is not enough!). The BYTEmark you mention 
seems like a good start. SPEC benchmarks are less interesting since their 
source code cannot be easily distributed.

I would have expected the slowdown to be larger. In the code you 
submitted you did not include alignment tests. Hopefully, just by ANDing 
the address with '0xffff0003', you can do both address translation _and_ 
unaligned access handling...
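
To illustrate the masking idea, here is a minimal sketch in Python (not 
QEMU's actual code). It assumes that a translation cache entry stores a 
tag whose low bits are zero and that the mask is the '0xffff0003' value 
mentioned above; the names MASK, fast_path_ok and tag are hypothetical. 
An access takes the fast path only if the masked address equals the tag, 
so a misaligned address (non-zero low two bits) fails the same comparison 
as an address that belongs to a different block.

```python
# Hypothetical sketch: one AND plus one compare covers both checks
# for a 32-bit (4-byte) access.
MASK = 0xFFFF0003  # value quoted in the message: the upper bits select
                   # the cached block, the low 2 bits flag misalignment


def fast_path_ok(addr: int, tag: int) -> bool:
    """True if addr is in the block identified by tag AND is 4-byte aligned."""
    return (addr & MASK) == tag


tag = 0x12340000                       # assumed tag of a cached block
print(fast_path_ok(0x12340010, tag))   # aligned, same block
print(fast_path_ok(0x12340011, tag))   # unaligned, same block
print(fast_path_ok(0x12350010, tag))   # aligned, different block
```

Only the first call returns True; the other two would fall through to a 
slower path that handles the cache miss or the unaligned access.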

It seems that Valgrind is twice as fast on most tests. Some 
optimisations will be needed in QEMU to correct that :-)
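
As a rough check of the "twice as fast" remark, the ratio of the 
Valgrind Index to the original QEMU Index can be computed from the 
figures quoted below. This is only a sketch of the arithmetic; the 
dictionary names are made up for illustration.

```python
# Index values copied from the quoted BYTEmark results.
qemu_index = {
    "NUMERIC SORT": 0.525327, "STRING SORT": 1.283266,
    "BITFIELD": 0.529458, "FP EMULATION": 0.534783,
    "FOURIER": 0.812754, "ASSIGNMENT": 0.796063,
    "IDEA": 0.606657, "HUFFMAN": 0.530740,
}
valgrind_index = {
    "NUMERIC SORT": 0.944346, "STRING SORT": 0.958983,
    "BITFIELD": 1.079789, "FP EMULATION": 1.073148,
    "FOURIER": 0.844757, "ASSIGNMENT": 1.473388,
    "IDEA": 1.231086, "HUFFMAN": 1.211396,
}

# A ratio above 1 means Valgrind scored higher than original QEMU.
ratio = {name: valgrind_index[name] / qemu_index[name] for name in qemu_index}

for name, value in ratio.items():
    print(f"{name + ':':<14} {value:.2f}")
```

With these figures the ratio is close to 2 for most tests, about 1 for 
FOURIER, and below 1 for STRING SORT.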

Fabrice.

Johan Rydberg wrote:
> Hi,
> 
> I've hacked a bit on QEMU and added simulated memory using a 
> translation cache as we discussed earlier.  These tests are 
> mostly for my own interest, but you might find them interesting
> as well.
> 
> Below is the result of the BYTEmark [1] benchmark (I do not have 
> access to any of the SPEC benchmarks) using simulated memory:
> 
> NUMERIC SORT:  Iterations/sec.: 10.385957       Index: 0.268406 
> STRING SORT:   Iterations/sec.: 1.807970        Index: 0.794712
> BITFIELD:      Iterations/sec.: 1913364.293577  Index: 0.328203
> FP EMULATION:  Iterations/sec.: 0.647669        Index: 0.311379
> FOURIER:       Iterations/sec.: 619.696756      Index: 0.701681
> ASSIGNMENT:    Iterations/sec.: 0.151103        Index: 0.575697
> IDEA:          Iterations/sec.: 21.734410       Index: 0.332534
> HUFFMAN:       Iterations/sec.: 13.068599       Index: 0.363168
> 
> Same test without the simulated memory (original QEMU):
> 
> NUMERIC SORT:  Iterations/sec.: 20.327522       Index: 0.525327
> STRING SORT:   Iterations/sec.: 2.919430        Index: 1.283266
> BITFIELD:      Iterations/sec.: 3086647.786244  Index: 0.529458
> FP EMULATION:  Iterations/sec.: 1.112348        Index: 0.534783
> FOURIER:       Iterations/sec.: 717.791439      Index: 0.812754
> ASSIGNMENT:    Iterations/sec.: 0.208943        Index: 0.796063
> IDEA:          Iterations/sec.: 39.651108       Index: 0.606657
> HUFFMAN:       Iterations/sec.: 19.098677       Index: 0.530740
> 
> Slowdown rates (calculated as the Index from original QEMU 
> divided by the Index from QEMU w/ simulated memory):
> 
> NUMERIC SORT:  1.96
> STRING SORT:   1.61
> BITFIELD:      1.61
> FP EMULATION:  1.72
> FOURIER:       1.16
> ASSIGNMENT:    1.38
> IDEA:          1.82
> HUFFMAN:       1.46
> 
> The slowdown would be greater if any processing had to be
> done on every cache miss.  The current hack just adds the page
> to the cache, does the memory transaction, and returns.
> 
> A slowdown between 1.16x and ~2x is pretty good I think.
> 
> For reference, below are the results for Valgrind (CVS version,
> running the none skin):
> 
> NUMERIC SORT:  Iterations/sec.: 36.541455       Index: 0.944346
> STRING SORT:   Iterations/sec.: 2.181686        Index: 0.958983
> BITFIELD:      Iterations/sec.: 6294984.678336  Index: 1.079789
> FP EMULATION:  Iterations/sec.: 2.232148        Index: 1.073148
> FOURIER:       Iterations/sec.: 746.055296      Index: 0.844757
> ASSIGNMENT:    Iterations/sec.: 0.386720        Index: 1.473388
> IDEA:          Iterations/sec.: 80.463770       Index: 1.231086
> HUFFMAN:       Iterations/sec.: 43.592067       Index: 1.211396
> 
>  [1] http://www.byte.com/bmark/bmark.htm
> 
> best regards,
> johan
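
The slowdown rates in the quoted message can be reproduced from the 
Index columns. The short Python sketch below only checks that 
arithmetic; the dictionary names are made up for illustration, and the 
figures are copied from the quoted tables.

```python
# Index values copied from the quoted BYTEmark results.
original = {
    "NUMERIC SORT": 0.525327, "STRING SORT": 1.283266,
    "BITFIELD": 0.529458, "FP EMULATION": 0.534783,
    "FOURIER": 0.812754, "ASSIGNMENT": 0.796063,
    "IDEA": 0.606657, "HUFFMAN": 0.530740,
}
simulated = {
    "NUMERIC SORT": 0.268406, "STRING SORT": 0.794712,
    "BITFIELD": 0.328203, "FP EMULATION": 0.311379,
    "FOURIER": 0.701681, "ASSIGNMENT": 0.575697,
    "IDEA": 0.332534, "HUFFMAN": 0.363168,
}

# Slowdown = Index(original QEMU) / Index(QEMU with simulated memory).
slowdown = {name: original[name] / simulated[name] for name in original}

for name, rate in slowdown.items():
    print(f"{name + ':':<14} {rate:.2f}")
```

Rounded to two decimals, the computed values match the slowdown rates 
listed in the quoted message, from 1.16 (FOURIER) to 1.96 (NUMERIC SORT).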