From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54063)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dl.soluz@gmx.net>) id 1ZSfSb-0006tC-0z
	for qemu-devel@nongnu.org; Fri, 21 Aug 2015 02:05:54 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <dl.soluz@gmx.net>) id 1ZSfSW-0006Nm-1W
	for qemu-devel@nongnu.org; Fri, 21 Aug 2015 02:05:52 -0400
Received: from mout.gmx.net ([212.227.15.18]:49300)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dl.soluz@gmx.net>) id 1ZSfSV-0006NV-Mn
	for qemu-devel@nongnu.org; Fri, 21 Aug 2015 02:05:47 -0400
References: <55B9DD60.8020801@gmx.net> <20150730085500.GV11361@aurel32.net>
	<20150730155003.GE30591@aurel32.net>
	<CACXAS8ABo2yCVL9Hk9yoyY1wvF4X1Oq4vFs87gQyRjH59Yi1Lg@mail.gmail.com>
	<20150731154323.GD23508@aurel32.net>
	<CACXAS8CLVt_0rQCLWmYp=YhL1PJyVLGAxyvQB27Cm6rM+MTGWg@mail.gmail.com>
	<20150803091716.GF30591@aurel32.net>
	<CACXAS8Dv410OJn4tQ14hLOsR-TjHCMQXmTMU8Dk_XT1saPQz9g@mail.gmail.com>
	<55D37189.3010809@twiddle.net>
	<CACXAS8CcN9bz8GBtKyoK+FH29qpRm1XeiGf7apYTBfWYV4evKw@mail.gmail.com>
	<20150819110010.GJ23508@aurel32.net>
	<CACXAS8C82pZws_LYanyEa4YVZdkk717imyRcmCNfvv3uDivr0Q@mail.gmail.com>
	<55D60C1B.9010502@twiddle.net> <55D6A9DF.5070506@gmx.net>
	<55D6BC00.50200@twiddle.net>
From: Dennis Luehring <dl.soluz@gmx.net>
Message-ID: <55D6BFAD.4080501@gmx.net>
Date: Fri, 21 Aug 2015 08:05:33 +0200
MIME-Version: 1.0
In-Reply-To: <55D6BC00.50200@twiddle.net>
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Debian 7.8.0 SPARC64 on qemu - anything i can do
 to speedup the emulation?
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Richard Henderson <rth@twiddle.net>, Artyom Tarasenko <atar4qemu@gmail.com>, Aurelien Jarno <aurelien@aurel32.net>
Cc: qemu-devel <qemu-devel@nongnu.org>

Am 21.08.2015 um 07:49 schrieb Richard Henderson:
> On 08/20/2015 09:32 PM, Dennis Luehring wrote:
> > gcc prime.c -o prime.out -lm
> >
> > prime.out runtime
> >
> > tcg-indirect: ~9.3 sec (best result)
> > qemu.org-git: ~11 sec
> > without-optimization: ~9.9 sec (worst result)
>
> I presume this is integer prime factoring?


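Aurelien Jarno extracted this code from sysbench (just for my qemu
sparc64 tests). it counts primes by integer trial division, with one
sqrt() per candidate as the loop bound - no factoring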

#include <math.h>
unsigned long long max_prime = 2000;
void prime_test()
{
   unsigned long long c;
   unsigned long long l,t;
   unsigned long long n=0;
   /* So far we're using very simple test prime number tests in 64bit */
   for(c=3; c < max_prime; c++)
   {
     t = sqrt(c);
     for(l = 2; l <= t; l++)
       if (c % l == 0)
         break;
     if (l > t )
       n++;
   }
}
int main()
{
   int i;
   for (i = 0 ; i < 10000 ; i++)
   {
     prime_test();
   }
   return 0;
}
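
A variant of the loop above that reports its result and times itself might look like the sketch below. This is not the code that was measured: count_primes() and time_prime_runs() are made-up names, the sqrt() bound is replaced by the equivalent integer test l <= c / l (so no -lm is needed), and clock() reports CPU time rather than wall-clock time. Returning the count keeps a compiler from discarding the loop as dead code, although an optimizing build could still hoist the repeated calls.

```c
#include <time.h>

/* Count primes in [3, max_prime) by trial division.
 * "l <= c / l" is the integer equivalent of "l <= (unsigned long long)sqrt(c)". */
unsigned long long count_primes(unsigned long long max_prime)
{
    unsigned long long c, l, n = 0;

    for (c = 3; c < max_prime; c++) {
        for (l = 2; l <= c / l; l++)
            if (c % l == 0)
                break;
        if (l > c / l)      /* loop ran to the end: no divisor found */
            n++;
    }
    return n;
}

/* Hypothetical helper: run count_primes() `iterations` times and return
 * the CPU seconds used; the last count is stored in *last_count if non-NULL. */
double time_prime_runs(int iterations, unsigned long long max_prime,
                       unsigned long long *last_count)
{
    unsigned long long n = 0;
    clock_t start, end;
    int i;

    start = clock();
    for (i = 0; i < iterations; i++)
        n = count_primes(max_prime);
    end = clock();

    if (last_count)
        *last_count = n;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_prime_runs(10000, 2000, &n) from main() and printing both values would mirror the iteration count and max_prime used in the test above.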


>
> > g++ src/pugixml.cpp -g -Wall -Wextra -Werror -pedantic -std=c++0x -c -MMD -MP
> >
> > tcg-indirect: ~2:46.5
> > qemu.org-git: ~2:51.2 (worst result)
> > without-optimization: ~2:14.1 (best result)
>
> No compiler optimization?  I wouldn't expect there to be much for tcg to
> optimize there -- dropping values to memory all the time doesn't leave much.


without-optimization means the qemu.org-git release build with
USE_TCG_OPTIMIZATIONS undefined in tcg/tcg.c.
or which compiler do you mean?
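
To illustrate what undefining such a macro does, here is a toy sketch, assuming the macro simply guards a call to an optimizer pass with a plain #ifdef. Only the name USE_TCG_OPTIMIZATIONS comes from the build described above; struct toy_op, toy_optimize() and toy_gen_code() are hypothetical stand-ins and not QEMU's real data structures or functions.

```c
#include <stddef.h>

/* Comment this line out to mimic the "without-optimization" build. */
#define USE_TCG_OPTIMIZATIONS

/* Toy stand-in for an intermediate move op (not QEMU's real op type). */
struct toy_op {
    int dst;
    int src;
};

/* Hypothetical optimizer pass: drop moves whose source and destination
 * are the same register. Compacts the array in place, returns new length. */
static size_t toy_optimize(struct toy_op *ops, size_t n)
{
    size_t in, out = 0;

    for (in = 0; in < n; in++)
        if (ops[in].dst != ops[in].src)
            ops[out++] = ops[in];
    return out;
}

/* Hypothetical code generation entry point: returns how many ops
 * would be handed to the backend. */
size_t toy_gen_code(struct toy_op *ops, size_t n)
{
#ifdef USE_TCG_OPTIMIZATIONS
    n = toy_optimize(ops, n);   /* compiled out when the macro is undefined */
#endif
    return n;
}
```

With the macro undefined, the pass is never compiled in, so every op reaches the backend unchanged and the translation itself is cheaper but the generated code is not cleaned up.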


>
> >
> > stream results (STREAM version $Revision: 5.10 $)
> >
> > tcg-indirect: (worst result)
> >
> > Your clock granularity/precision appears to be 41 microseconds.
> > Each test below will take on the order of 632527 microseconds.
> >    (= 15427 clock ticks)
> > Function    Best Rate MB/s  Avg time     Min time     Max time
> > Copy:             320.8     0.511297     0.498785     0.590214
> > Scale:            187.0     0.858693     0.855465     0.863527
> > Add:              218.2     1.104654     1.099698     1.110341
> > Triad:            169.5     1.433273     1.416321     1.502248
> >
> > qemu.org-git: (best result)
> >
> > Your clock granularity/precision appears to be 42 microseconds.
> > Each test below will take on the order of 330428 microseconds.
> >     (= 7867 clock ticks)
> > Function    Best Rate MB/s  Avg time     Min time     Max time
> > Copy:             771.5     0.214717     0.207377     0.244214
> > Scale:            288.1     0.573320     0.555401     0.660161
> > Add:              423.5     0.633523     0.566661     1.092067
> > Triad:            242.9     1.053032     0.987970     1.499563
> >
> > without-optimization:
> >
> > Your clock granularity/precision appears to be 41 microseconds.
> > Each test below will take on the order of 745254 microseconds.
> >    (= 18176 clock ticks)
> > Function    Best Rate MB/s  Avg time     Min time     Max time
> > Copy:             316.6     0.524065     0.505313     0.580103
> > Scale:            200.5     0.813356     0.798024     0.840986
> > Add:              243.9     1.010247     0.984025     1.119149
> > Triad:            182.9     1.345601     1.312236     1.427459
>
> These results are weird.  Unoptimized less than half the speed of mainline?
> Improving optimization (with no extra work, mind) brings the results back down?


yep, they are - it seems the involved developers' assumptions about where
speed can be improved, or where the slowness comes from, are not correct.
how are SPARC64 benchmarks usually done?
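
To put numbers on the comparison, the "Best Rate MB/s" columns from the three STREAM runs can be divided against each other. A minimal sketch, where the arrays are copied from the tables above and relative_rate() is a made-up helper:

```c
#include <stddef.h>

/* STREAM "Best Rate MB/s" values from the three runs quoted above,
 * in the order Copy, Scale, Add, Triad. */
static const double tcg_indirect[4] = {320.8, 187.0, 218.2, 169.5};
static const double mainline[4]     = {771.5, 288.1, 423.5, 242.9};
static const double no_opt[4]       = {316.6, 200.5, 243.9, 182.9};

/* Hypothetical helper: rate of `build` relative to `baseline`
 * for kernel index k (0 = Copy ... 3 = Triad). */
double relative_rate(const double *build, const double *baseline, size_t k)
{
    return build[k] / baseline[k];
}
```

For the without-optimization build against qemu.org-git this gives roughly 0.41 for Copy and between about 0.58 and 0.75 for the other three kernels, so only Copy drops below half of mainline.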

>
>
> r~