* [Qemu-devel] tests with simulated memory
From: Johan Rydberg @ 2003-06-19 15:02 UTC (permalink / raw)
To: qemu-devel
Hi,
I've hacked a bit on QEMU and added simulated memory using a
translation cache as we discussed earlier. These tests are
mostly for my own interest, but you might find them interesting
as well.
Below is the result of the BYTEmark [1] benchmark (I do not have
access to any of the SPEC benchmarks) using simulated memory:
NUMERIC SORT:  Iterations/sec.:      10.385957  Index: 0.268406
STRING SORT:   Iterations/sec.:       1.807970  Index: 0.794712
BITFIELD:      Iterations/sec.: 1913364.293577  Index: 0.328203
FP EMULATION:  Iterations/sec.:       0.647669  Index: 0.311379
FOURIER:       Iterations/sec.:     619.696756  Index: 0.701681
ASSIGNMENT:    Iterations/sec.:       0.151103  Index: 0.575697
IDEA:          Iterations/sec.:      21.734410  Index: 0.332534
HUFFMAN:       Iterations/sec.:      13.068599  Index: 0.363168
Same test without the simulated memory (original QEMU):
NUMERIC SORT:  Iterations/sec.:      20.327522  Index: 0.525327
STRING SORT:   Iterations/sec.:       2.919430  Index: 1.283266
BITFIELD:      Iterations/sec.: 3086647.786244  Index: 0.529458
FP EMULATION:  Iterations/sec.:       1.112348  Index: 0.534783
FOURIER:       Iterations/sec.:     717.791439  Index: 0.812754
ASSIGNMENT:    Iterations/sec.:       0.208943  Index: 0.796063
IDEA:          Iterations/sec.:      39.651108  Index: 0.606657
HUFFMAN:       Iterations/sec.:      19.098677  Index: 0.530740
Slowdown rates (calculated as the Index field from original QEMU
divided by the Index from QEMU w/ simulated memory):
NUMERIC SORT:  1.96
STRING SORT:   1.61
BITFIELD:      1.61
FP EMULATION:  1.72
FOURIER:       1.16
ASSIGNMENT:    1.38
IDEA:          1.82
HUFFMAN:       1.46
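The slowdown figures can be reproduced from the two Index columns. The following is a small sketch, not part of the original test setup: the dictionary and function names are made up for illustration, and the Index values are copied from the two runs above.

```python
# Index values copied from the two BYTEmark runs above.
ORIGINAL_QEMU = {
    "NUMERIC SORT": 0.525327,
    "STRING SORT": 1.283266,
    "BITFIELD": 0.529458,
    "FP EMULATION": 0.534783,
    "FOURIER": 0.812754,
    "ASSIGNMENT": 0.796063,
    "IDEA": 0.606657,
    "HUFFMAN": 0.530740,
}
SIMULATED_MEMORY = {
    "NUMERIC SORT": 0.268406,
    "STRING SORT": 0.794712,
    "BITFIELD": 0.328203,
    "FP EMULATION": 0.311379,
    "FOURIER": 0.701681,
    "ASSIGNMENT": 0.575697,
    "IDEA": 0.332534,
    "HUFFMAN": 0.363168,
}


def slowdown(original, simulated):
    """Return original Index / simulated Index per test, rounded to 2 places."""
    return {name: round(original[name] / simulated[name], 2) for name in original}


if __name__ == "__main__":
    for name, ratio in slowdown(ORIGINAL_QEMU, SIMULATED_MEMORY).items():
        print(f"{name + ':':<15}{ratio:.2f}")
```

A ratio above 1 means the run with simulated memory scored lower than original QEMU on that test.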
The slowdown would be greater if any processing had to be done on
every cache miss. The current hack just adds the page to the cache,
does the memory transaction, and returns.
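To make the miss behaviour concrete, here is a rough sketch of a translation cache of this kind. It is not the code from the patch: the 4 KiB page size, the 256-entry direct-mapped layout, the dictionary used as backing store, and all names are assumptions chosen for illustration.

```python
PAGE_BITS = 12                 # assumption: 4 KiB guest pages
PAGE_SIZE = 1 << PAGE_BITS
CACHE_ENTRIES = 256            # assumption: small direct-mapped cache


class TranslationCache:
    """Direct-mapped cache from guest page number to a host-side page."""

    def __init__(self, backing):
        # backing: dict mapping guest page number -> bytearray (hypothetical store)
        self.backing = backing
        self.tags = [None] * CACHE_ENTRIES
        self.pages = [None] * CACHE_ENTRIES
        self.misses = 0

    def _lookup(self, addr):
        page_number = addr >> PAGE_BITS
        slot = page_number % CACHE_ENTRIES
        if self.tags[slot] != page_number:
            # Miss: only install the page in the cache; no further processing.
            self.misses += 1
            self.tags[slot] = page_number
            self.pages[slot] = self.backing.setdefault(
                page_number, bytearray(PAGE_SIZE))
        return self.pages[slot], addr & (PAGE_SIZE - 1)

    def load8(self, addr):
        page, offset = self._lookup(addr)
        return page[offset]

    def store8(self, addr, value):
        page, offset = self._lookup(addr)
        page[offset] = value & 0xFF
```

Every access pays for the tag compare, which is where the measured slowdown comes from; the miss path here is deliberately as cheap as the one described above.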
A slowdown between 1.16x and ~2x is pretty good I think.
For reference, below are the results for Valgrind (CVS version,
running the none skin):
NUMERIC SORT:  Iterations/sec.:      36.541455  Index: 0.944346
STRING SORT:   Iterations/sec.:       2.181686  Index: 0.958983
BITFIELD:      Iterations/sec.: 6294984.678336  Index: 1.079789
FP EMULATION:  Iterations/sec.:       2.232148  Index: 1.073148
FOURIER:       Iterations/sec.:     746.055296  Index: 0.844757
ASSIGNMENT:    Iterations/sec.:       0.386720  Index: 1.473388
IDEA:          Iterations/sec.:      80.463770  Index: 1.231086
HUFFMAN:       Iterations/sec.:      43.592067  Index: 1.211396
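One way to read the Valgrind figures against the original QEMU run is to take the ratio of the Index values per test. The sketch below does only that; the names are made up for illustration and the numbers are copied from the tables in this message.

```python
# Index values copied from the Valgrind run and the original QEMU run above.
VALGRIND = {
    "NUMERIC SORT": 0.944346,
    "STRING SORT": 0.958983,
    "BITFIELD": 1.079789,
    "FP EMULATION": 1.073148,
    "FOURIER": 0.844757,
    "ASSIGNMENT": 1.473388,
    "IDEA": 1.231086,
    "HUFFMAN": 1.211396,
}
ORIGINAL_QEMU = {
    "NUMERIC SORT": 0.525327,
    "STRING SORT": 1.283266,
    "BITFIELD": 0.529458,
    "FP EMULATION": 0.534783,
    "FOURIER": 0.812754,
    "ASSIGNMENT": 0.796063,
    "IDEA": 0.606657,
    "HUFFMAN": 0.530740,
}


def index_ratio(numerator, denominator):
    """Return numerator Index / denominator Index per test, rounded to 2 places."""
    return {name: round(numerator[name] / denominator[name], 2)
            for name in numerator}


if __name__ == "__main__":
    for name, ratio in index_ratio(VALGRIND, ORIGINAL_QEMU).items():
        print(f"{name + ':':<15}{ratio:.2f}")
```

A ratio above 1 means Valgrind scored higher than original QEMU on that test; a ratio below 1 means QEMU scored higher.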
[1] http://www.byte.com/bmark/bmark.htm
best regards,
johan
--
Johan Rydberg, Free Software Developer, Sweden
http://rtmk.sf.net | http://www.nongnu.org/guss/
Playing Track No09
* Re: [Qemu-devel] tests with simulated memory
From: Fabrice Bellard @ 2003-06-19 16:33 UTC (permalink / raw)
To: qemu-devel
Some days ago I looked for a benchmark with which to publish objective
results for QEMU (gzip is not enough!). The BYTEmark you mention seems
like a good start. SPEC benchmarks are less interesting since their
source code cannot be easily distributed.
I would have expected the slowdown to be larger. In the code you
submitted you did not include alignment tests. Hopefully by just ANDing
the address with '0xffff0003' you do both address translation _and_
unaligned access handling...
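A sketch of how a single AND can cover both checks follows. The mask value is taken verbatim from the sentence above; reading its upper 16 bits as the cached-region tag and its low 2 bits as the alignment bits of a 4-byte access is an interpretation, and the function names are hypothetical.

```python
MASK = 0xFFFF0003        # literal from the message; its interpretation is assumed
ALIGN_BITS = 0x00000003  # low two bits: non-zero for a misaligned 4-byte access


def make_tag(addr):
    """Tag stored when a region is cached: region bits only, alignment bits zero."""
    return addr & MASK & ~ALIGN_BITS


def fast_path_hit(addr, cached_tag):
    """True only if addr is in the cached region AND is 4-byte aligned.

    A misaligned address leaves a non-zero low bit after the AND, so the
    compare fails exactly as it would for a different region, and the access
    falls through to the slow path.
    """
    return (addr & MASK) == cached_tag
```

The point of the trick is that the unaligned case costs nothing extra on the fast path: it shows up as an ordinary tag mismatch, and the slow path can then decide whether it was a real miss or an unaligned access.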
It seems that Valgrind is twice as fast on most tests. Some
optimisations will be needed in qemu to correct that :-)
Fabrice.
* Re: [Qemu-devel] tests with simulated memory
From: Johan Rydberg @ 2003-06-19 17:31 UTC (permalink / raw)
To: qemu-devel
Fabrice Bellard <fabrice.bellard@free.fr> wrote:
: Some days ago I looked for a benchmark to publish QEMU objective results
: (gzip is not enough!). The BYTEmark you mention seems a good start. SPEC
: benchmarks are less interesting since their source code cannot be easily
: distributed.
Yes. And the Linux nbench test suite is based on BYTEmark. I used the
sources from BYTE's website. Maybe you should try nbench instead.
: I would have expected the slowdown to be more important. In the code you
: submitted you did not include alignment tests. Hopefully by just anding
: the address with '0xffff0003' you do both address translation _and_
: unaligned access handling...
I was a bit surprised as well. I thought you would get a slowdown of
something like 2x to 8x, especially since x86 programs do a lot of memory
accesses.
Regarding alignment checks: I'm not sure they are needed for the x86
platform, since it supports unaligned memory accesses (but there is some
bit that enables alignment checks, isn't there?).
: It seems that Valgrind is twice as fast on most tests. Some
: optimisations will be needed in qemu to correct that :-)
Hehe. I was a bit surprised by the result. I thought QEMU would perform
better, especially since it must spend less time decoding and more time
executing than Valgrind (doing the register allocation and improving the
micro operations isn't cheap).
--
Johan Rydberg, Free Software Developer, Sweden
http://rtmk.sf.net | http://www.nongnu.org/guss/
Playing Track No09
* Re: [Qemu-devel] tests with simulated memory
From: Gwenole Beauchesne @ 2003-06-20 19:16 UTC (permalink / raw)
To: qemu-devel
Hi,
> Yes. And the Linux nbench testsuite is based on BYTEmark. I used the
> sources from byte's website. Maybe you should try nbench instead.
FWIW, I am using SSBENCH for my PPC emulator.
Bye,
Gwenole