From: Jeremy Fitzhardinge <jeremy@goop.org>
To: John Morrison <john@clustered.net>
Cc: xen-devel@lists.xensource.com
Subject: Re: Poor SMP performance pv_ops domU
Date: Tue, 18 May 2010 11:38:53 -0700
Message-ID: <4BF2DEBD.7040108@goop.org>
In-Reply-To: <E2279633-C226-4C37-9313-49CE6A53B628@clustered.net>

On 05/18/2010 10:34 AM, John Morrison wrote:
> Hi,
>
> Over the last year we have tried many times to get acceptable performance from pv_ops kernels.
>
> Tests done with 1,2,4 and 8 cores. The more cores the lower the score.
>
> Inside the domU it shows all cores, top -s shows all cores in use.
> xentop in dom0 never shows over 99% cpu.
>
> 2.6.18.8-xenU kernel shows over 700% cpu and the scores are about 8 x the pv_ops score.
>
> Any ideas ?
>
Well, I guess some kind of bad serialization is going on in there, and
it should be fairly obvious with a bit of examination.

Have you tried building your own pvops domU kernels? Does enabling PV
spinlocks make any difference? Also, enabling some of the lock
debugging/profiling/contention monitoring stuff may give useful results.

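
As a minimal sketch of checking whether a given kernel was built with
those options: the Python below parses kernel config text and reports a
few settings. The option names are an assumption about which symbols
"PV spinlocks" and the lock debugging/contention monitoring refer to
(the message names none), and the sample config is hypothetical.

```python
import re

# Option names are assumptions about what "PV spinlocks" and the lock
# debugging/contention monitoring options refer to; the message names none.
OPTIONS_OF_INTEREST = (
    "CONFIG_PARAVIRT_SPINLOCKS",  # paravirtualized spinlocks
    "CONFIG_LOCK_STAT",           # lock contention statistics
    "CONFIG_DEBUG_SPINLOCK",      # basic spinlock debugging
)

_SETTING = re.compile(r"^(CONFIG_\w+)=(.*)$")


def parse_kernel_config(text):
    """Return {option: value} for each 'CONFIG_X=value' line of a kernel config."""
    settings = {}
    for line in text.splitlines():
        match = _SETTING.match(line.strip())
        if match:  # '# CONFIG_X is not set' comment lines do not match
            settings[match.group(1)] = match.group(2)
    return settings


def report(text, options=OPTIONS_OF_INTEREST):
    """Map each option of interest to its value, or 'not set' when absent."""
    settings = parse_kernel_config(text)
    return {option: settings.get(option, "not set") for option in options}


# Hypothetical config excerpt, for illustration only (not from the domU).
SAMPLE = """\
CONFIG_SMP=y
CONFIG_PARAVIRT_SPINLOCKS=y
# CONFIG_LOCK_STAT is not set
"""

for option, value in report(SAMPLE).items():
    print(f"{option}: {value}")
```

To use it on a real guest, pass report() the contents of that kernel's
config file; where the file lives depends on the distribution.
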
Can you post the corresponding 2.6.18 results? Are there specific
sub-tests which show the effect more strongly than the others?
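
One rough way to answer the sub-test question from the figures quoted
below: divide each test's 8-core INDEX by its 1-core INDEX and sort, so
the tests that degrade most come first. This is only a sketch; the
dictionaries are copied by hand from the quoted runs, and the function
and variable names are made up for illustration.

```python
# INDEX column copied from the quoted 1-core and 8-core runs.
INDEX_1_CORE = {
    "Dhrystone 2 using register variables": 237.6,
    "Double-Precision Whetstone": 253.2,
    "Execl Throughput": 83.3,
    "File Copy 1024 bufsize 2000 maxblocks": 240.3,
    "File Copy 256 bufsize 500 maxblocks": 165.1,
    "File Read 4096 bufsize 8000 maxblocks": 418.5,
    "Pipe-based Context Switching": 55.3,
    "Pipe Throughput": 42.8,
    "Process Creation": 58.5,
    "Shell Scripts (8 concurrent)": 85.0,
    "System Call Overhead": 43.6,
}
INDEX_8_CORES = {
    "Dhrystone 2 using register variables": 264.7,
    "Double-Precision Whetstone": 90.9,
    "Execl Throughput": 84.2,
    "File Copy 1024 bufsize 2000 maxblocks": 220.7,
    "File Copy 256 bufsize 500 maxblocks": 157.0,
    "File Read 4096 bufsize 8000 maxblocks": 362.6,
    "Pipe-based Context Switching": 52.3,
    "Pipe Throughput": 40.3,
    "Process Creation": 51.8,
    "Shell Scripts (8 concurrent)": 84.4,
    "System Call Overhead": 47.0,
}


def scaling(before, after):
    """Return (test, after/before) pairs, largest drop first."""
    ratios = [(name, after[name] / before[name]) for name in before]
    return sorted(ratios, key=lambda pair: pair[1])


for name, ratio in scaling(INDEX_1_CORE, INDEX_8_CORES):
    print(f"{name:40s} {ratio:5.2f}")
```

On the quoted figures, Double-Precision Whetstone sorts first, at
roughly 0.36 of its 1-core index; every other test stays within about
15% of its 1-core value.
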

How does the 2.6.32 kernel fare when booted native?

Thanks,
    J
>
> John
>
>
> 1 core
>
> BYTE UNIX Benchmarks (Version 4.1-wht.2)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066476 132875660 1% /
>
> Start Benchmark Run: Tue May 18 13:54:54 BST 2010
> 13:54:54 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>
> End Benchmark Run: Tue May 18 14:06:12 BST 2010
> 14:06:12 up 11 min, 2 users, load average: 11.48, 5.20, 2.43
>
>
> INDEX VALUES
> TEST                                      BASELINE      RESULT    INDEX
>
> Dhrystone 2 using register variables      376783.7   8950813.0    237.6
> Double-Precision Whetstone                    83.1      2103.7    253.2
> Execl Throughput                             188.3      1568.4     83.3
> File Copy 1024 bufsize 2000 maxblocks       2672.0     64198.0    240.3
> File Copy 256 bufsize 500 maxblocks         1077.0     17781.0    165.1
> File Read 4096 bufsize 8000 maxblocks      15382.0    643717.0    418.5
> Pipe-based Context Switching               15448.6     85379.4     55.3
> Pipe Throughput                           111814.6    478490.1     42.8
> Process Creation                             569.3      3329.6     58.5
> Shell Scripts (8 concurrent)                  44.8       380.7     85.0
> System Call Overhead                      114433.5    498712.3     43.6
>                                                               =========
> FINAL SCORE                                                       114.1
>
> 2-cores
>
> ==============================================================
> BYTE UNIX Benchmarks (Version 4.1-wht.2)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066548 132875588 1% /
>
> Start Benchmark Run: Tue May 18 14:07:27 BST 2010
> 14:07:27 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>
> End Benchmark Run: Tue May 18 14:18:04 BST 2010
> 14:18:04 up 10 min, 1 user, load average: 12.78, 5.53, 2.49
>
>
> INDEX VALUES
> TEST                                      BASELINE      RESULT    INDEX
>
> Dhrystone 2 using register variables      376783.7  10124838.6    268.7
> Double-Precision Whetstone                    83.1      1188.7    143.0
> Execl Throughput                             188.3      1596.2     84.8
> File Copy 1024 bufsize 2000 maxblocks       2672.0     58323.0    218.3
> File Copy 256 bufsize 500 maxblocks         1077.0     17776.0    165.1
> File Read 4096 bufsize 8000 maxblocks      15382.0    568217.0    369.4
> Pipe-based Context Switching               15448.6     86111.3     55.7
> Pipe Throughput                           111814.6    469957.8     42.0
> Process Creation                             569.3      3298.1     57.9
> Shell Scripts (8 concurrent)                  44.8       378.9     84.6
> System Call Overhead                      114433.5    532828.4     46.6
>                                                               =========
> FINAL SCORE                                                       107.9
>
> 4-cores
>
> ==============================================================
> BYTE UNIX Benchmarks (Version 4.1-wht.2)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066628 132875508 1% /
>
> Start Benchmark Run: Tue May 18 14:19:17 BST 2010
> 14:19:17 up 0 min, 1 user, load average: 0.00, 0.00, 0.00
>
> End Benchmark Run: Tue May 18 14:29:53 BST 2010
> 14:29:53 up 10 min, 1 user, load average: 13.59, 6.35, 2.97
>
>
> INDEX VALUES
> TEST                                      BASELINE      RESULT    INDEX
>
> Dhrystone 2 using register variables      376783.7  10185429.8    270.3
> Double-Precision Whetstone                    83.1       759.8     91.4
> Execl Throughput                             188.3      1386.2     73.6
> File Copy 1024 bufsize 2000 maxblocks       2672.0     62331.0    233.3
> File Copy 256 bufsize 500 maxblocks         1077.0     16492.0    153.1
> File Read 4096 bufsize 8000 maxblocks      15382.0    563402.0    366.3
> Pipe-based Context Switching               15448.6     87176.0     56.4
> Pipe Throughput                           111814.6    481068.1     43.0
> Process Creation                             569.3      3128.9     55.0
> Shell Scripts (8 concurrent)                  44.8       394.9     88.1
> System Call Overhead                      114433.5    539996.1     47.2
>                                                               =========
> FINAL SCORE                                                       102.6
>
> 8-cores
>
> ==============================================================
> BYTE UNIX Benchmarks (Version 4.1-wht.2, 8 threads)
> System -- Linux test 2.6.32-21-server #32-Ubuntu SMP Fri Apr 16 09:17:34 UTC 2010 x86_64 GNU/Linux
> /dev/xvda1 141110136 1066680 132875456 1% /
>
> Start Benchmark Run: Tue May 18 14:30:59 BST 2010
> 14:30:59 up 0 min, 1 user, load average: 0.07, 0.02, 0.00
>
> End Benchmark Run: Tue May 18 14:42:52 BST 2010
> 14:42:52 up 12 min, 1 user, load average: 25.56, 10.84, 4.96
>
>
> INDEX VALUES
> TEST                                      BASELINE      RESULT    INDEX
>
> Dhrystone 2 using register variables      376783.7   9972130.3    264.7
> Double-Precision Whetstone                    83.1       755.2     90.9
> Execl Throughput                             188.3      1584.7     84.2
> File Copy 1024 bufsize 2000 maxblocks       2672.0     58981.0    220.7
> File Copy 256 bufsize 500 maxblocks         1077.0     16904.0    157.0
> File Read 4096 bufsize 8000 maxblocks      15382.0    557735.0    362.6
> Pipe-based Context Switching               15448.6     80738.2     52.3
> Pipe Throughput                           111814.6    450891.2     40.3
> Process Creation                             569.3      2948.5     51.8
> Shell Scripts (8 concurrent)                  44.8       378.1     84.4
> System Call Overhead                      114433.5    537443.2     47.0
>                                                               =========
> FINAL SCORE                                                       100.9
>
>
>
> --
> Professional hosting without compromise
> www.clustered.net
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xensource.com
> http://lists.xensource.com/xen-devel
>
>