From mboxrd@z Thu Jan 1 00:00:00 1970
References: <55B9DD60.8020801@gmx.net> <20150730085500.GV11361@aurel32.net> <20150730155003.GE30591@aurel32.net> <20150731154323.GD23508@aurel32.net> <20150803091716.GF30591@aurel32.net> <55D37189.3010809@twiddle.net> <20150819110010.GJ23508@aurel32.net> <55D60C1B.9010502@twiddle.net> <55D6A9DF.5070506@gmx.net> <55D6BC00.50200@twiddle.net>
From: Dennis Luehring
Message-ID: <55D6BFAD.4080501@gmx.net>
Date: Fri, 21 Aug 2015 08:05:33 +0200
MIME-Version: 1.0
In-Reply-To: <55D6BC00.50200@twiddle.net>
Content-Type: text/plain; charset=iso-8859-15; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] Debian 7.8.0 SPARC64 on qemu - anything i can do to speedup the emulation?
To: Richard Henderson, Artyom Tarasenko, Aurelien Jarno
Cc: qemu-devel

On 21.08.2015 at 07:49, Richard Henderson wrote:
> On 08/20/2015 09:32 PM, Dennis Luehring wrote:
> > gcc prime.c -o prime.out -lm
> >
> > prime.out runtime
> >
> > tcg-indirect: ~9.3 sec (best result)
> > qemu.org-git: ~11 sec
> > without-optimization: ~9.9 sec (worst result)
>
> I presume this is integer prime factoring?

Aurelien Jarno extracted this code from sysbench (just for my qemu sparc64 tests):

#include <math.h>

unsigned long long max_prime = 2000;

void prime_test()
{
    unsigned long long c;
    unsigned long long l, t;
    unsigned long long n = 0;

    /* So far we're using very simple test prime number tests in 64bit */
    for (c = 3; c < max_prime; c++) {
        t = sqrt(c);
        for (l = 2; l <= t; l++)
            if (c % l == 0)
                break;
        if (l > t)
            n++;
    }
}

int main()
{
    int i;

    for (i = 0; i < 10000; i++) {
        prime_test();
    }
    return 0;
}

>
> > g++ src/pugixml.cpp -g -Wall -Wextra -Werror -pedantic -std=c++0x -c -MMD -MP
> >
> > tcg-indirect: ~2:46.5
> > qemu.org-git: ~2:51.2 (worst result)
> > without-optimization: ~2:14.1 (best result)
>
> No compiler optimization? I wouldn't expect there to be much for tcg to
> optimize there -- dropping values to memory all the time doesn't leave much.

"without-optimization" means the qemu.org-git release build with
USE_TCG_OPTIMIZATIONS undefined in tcg/tcg.c.
Or which compiler do you mean?

>
> >
> > stream results (STREAM version $Revision: 5.10 $)
> >
> > tcg-indirect: (worst result)
> >
> > Your clock granularity/precision appears to be 41 microseconds.
> > Each test below will take on the order of 632527 microseconds.
> > (= 15427 clock ticks)
> > Function    Best Rate MB/s  Avg time     Min time     Max time
> > Copy:           320.8       0.511297     0.498785     0.590214
> > Scale:          187.0       0.858693     0.855465     0.863527
> > Add:            218.2       1.104654     1.099698     1.110341
> > Triad:          169.5       1.433273     1.416321     1.502248
> >
> > qemu.org-git: (best result)
> >
> > Your clock granularity/precision appears to be 42 microseconds.
> > Each test below will take on the order of 330428 microseconds.
> > (= 7867 clock ticks)
> > Function    Best Rate MB/s  Avg time     Min time     Max time
> > Copy:           771.5       0.214717     0.207377     0.244214
> > Scale:          288.1       0.573320     0.555401     0.660161
> > Add:            423.5       0.633523     0.566661     1.092067
> > Triad:          242.9       1.053032     0.987970     1.499563
> >
> > without-optimization:
> >
> > Your clock granularity/precision appears to be 41 microseconds.
> > Each test below will take on the order of 745254 microseconds.
> > (= 18176 clock ticks)
> > Function    Best Rate MB/s  Avg time     Min time     Max time
> > Copy:           316.6       0.524065     0.505313     0.580103
> > Scale:          200.5       0.813356     0.798024     0.840986
> > Add:            243.9       1.010247     0.984025     1.119149
> > Triad:          182.9       1.345601     1.312236     1.427459
>
> These results are weird. Unoptimized less than half the speed of mainline?
> Improving optimization (with no extra work, mind) brings the results back down?

Yep, they are. It seems the involved developers' assumptions about where
speed can be improved, or where the slowness comes from, are not correct.

How are SPARC64 benchmarks usually done?

>
>
> r~