From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54063)
by lists.gnu.org with esmtp (Exim 4.71)
(envelope-from
) id 1ZSfSb-0006tC-0z
for qemu-devel@nongnu.org; Fri, 21 Aug 2015 02:05:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
(envelope-from ) id 1ZSfSW-0006Nm-1W
for qemu-devel@nongnu.org; Fri, 21 Aug 2015 02:05:52 -0400
Received: from mout.gmx.net ([212.227.15.18]:49300)
by eggs.gnu.org with esmtp (Exim 4.71)
(envelope-from ) id 1ZSfSV-0006NV-Mn
for qemu-devel@nongnu.org; Fri, 21 Aug 2015 02:05:47 -0400
References: <55B9DD60.8020801@gmx.net> <20150730085500.GV11361@aurel32.net>
<20150730155003.GE30591@aurel32.net>
<20150731154323.GD23508@aurel32.net>
<20150803091716.GF30591@aurel32.net>
<55D37189.3010809@twiddle.net>
<20150819110010.GJ23508@aurel32.net>
<55D60C1B.9010502@twiddle.net> <55D6A9DF.5070506@gmx.net>
<55D6BC00.50200@twiddle.net>
From: Dennis Luehring
Message-ID: <55D6BFAD.4080501@gmx.net>
Date: Fri, 21 Aug 2015 08:05:33 +0200
MIME-Version: 1.0
In-Reply-To: <55D6BC00.50200@twiddle.net>
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Debian 7.8.0 SPARC64 on qemu - anything i can do
to speedup the emulation?
To: Richard Henderson , Artyom Tarasenko , Aurelien Jarno
Cc: qemu-devel
On 21.08.2015 at 07:49, Richard Henderson wrote:
> On 08/20/2015 09:32 PM, Dennis Luehring wrote:
> > gcc prime.c -o prime.out -lm
> >
> > prime.out runtime
> >
> > tcg-indirect: ~9.3 sec (best result)
> > qemu.org-git: ~11 sec
> > without-optimization: ~9.9 sec (worst result)
>
> I presume this is integer prime factoring?
Yes, it is a simple trial-division prime test. Aurelien Jarno extracted
this code from sysbench (just for my qemu sparc64 tests):
#include <math.h>

unsigned long long max_prime = 2000;

void prime_test()
{
    unsigned long long c;
    unsigned long long l, t;
    unsigned long long n = 0;

    /* So far we're using very simple test prime number tests in 64bit */
    for (c = 3; c < max_prime; c++)
    {
        t = sqrt(c);
        for (l = 2; l <= t; l++)
            if (c % l == 0)
                break;
        if (l > t)
            n++;
    }
}

int main()
{
    int i;

    for (i = 0; i < 10000; i++)
    {
        prime_test();
    }
    return 0;
}
>
> > g++ src/pugixml.cpp -g -Wall -Wextra -Werror -pedantic -std=c++0x -c -MMD -MP
> >
> > tcg-indirect: ~2:46.5
> > qemu.org-git: ~2:51.2 (worst result)
> > without-optimization: ~2:14.1 (best result)
>
> No compiler optimization? I wouldn't expect there to be much for tcg to
> optimize there -- dropping values to memory all the time doesn't leave much.
without-optimization means the qemu.org-git release build with
USE_TCG_OPTIMIZATIONS undefined in tcg/tcg.c.
Or which compiler do you mean?
>
> >
> > stream results (STREAM version $Revision: 5.10 $)
> >
> > tcg-indirect: (worst result)
> >
> > Your clock granularity/precision appears to be 41 microseconds.
> > Each test below will take on the order of 632527 microseconds.
> > (= 15427 clock ticks)
> > Function Best Rate MB/s Avg time Min time Max time
> > Copy: 320.8 0.511297 0.498785 0.590214
> > Scale: 187.0 0.858693 0.855465 0.863527
> > Add: 218.2 1.104654 1.099698 1.110341
> > Triad: 169.5 1.433273 1.416321 1.502248
> >
> > qemu.org-git: (best result)
> >
> > Your clock granularity/precision appears to be 42 microseconds.
> > Each test below will take on the order of 330428 microseconds.
> > (= 7867 clock ticks)
> > Function Best Rate MB/s Avg time Min time Max time
> > Copy: 771.5 0.214717 0.207377 0.244214
> > Scale: 288.1 0.573320 0.555401 0.660161
> > Add: 423.5 0.633523 0.566661 1.092067
> > Triad: 242.9 1.053032 0.987970 1.499563
> >
> > without-optimization:
> >
> > Your clock granularity/precision appears to be 41 microseconds.
> > Each test below will take on the order of 745254 microseconds.
> > (= 18176 clock ticks)
> > Function Best Rate MB/s Avg time Min time Max time
> > Copy: 316.6 0.524065 0.505313 0.580103
> > Scale: 200.5 0.813356 0.798024 0.840986
> > Add: 243.9 1.010247 0.984025 1.119149
> > Triad: 182.9 1.345601 1.312236 1.427459
>
> These results are weird. Unoptimized less than half the speed of mainline?
> Improving optimization (with no extra work, mind) brings the results back down?
Yep, they are. It seems that the involved developers' assumptions about
where speed can be improved, or where the slowness comes from, are not correct.
How are SPARC64 benchmarks usually done?
>
>
> r~