* Benchmarking Xen (results and questions)
@ 2005-08-03 23:21 David_Wolinsky
  2005-08-04  1:43 ` Andrew Theurer
  0 siblings, 1 reply; 10+ messages in thread
From: David_Wolinsky @ 2005-08-03 23:21 UTC (permalink / raw)
  To: xen-devel


Hi all,

Here are some benchmarks that I've done using Xen.

However, before I get started, let me explain some of the configuration
details:

Xen Version           SPECjbb
                      WebBench
Linux Distribution    Debian 3.1
HT                    disabled
Linux Kernel          2.6.12.2
Host Patch            CK3s


Here are the initial benchmarks

        SPECJBB     WebBench
        1 Thread    1 Client    2 Clients   4 Clients   8 Clients
        BOPS        TPS         TPS         TPS         TPS
Host    32403.5     213.45      416.86      814.62      1523.78
1 VM    32057       205.4       380.91      569.24      733.8
2 VM    24909.25    NA          399.29      695.1       896.04
4 VM    17815.75    NA          NA          742.78      950.63
8 VM    10216.25    NA          NA          NA          1002.81

Some more notes: BOPS = business operations per second, TPS =
transactions per second.
SPECjbb tests CPU and memory.
WebBench (the way we configured it) tests network I/O and disk I/O.

Values = AVG * VM count
Domain configurations
	1 VM - 1660 MB - SPECJBB - 1500 MB
	2 VM - 1280 MB - SPECJBB - 1024 MB
	4 VM - 640 MB - SPECJBB - 512 MB
	8 VM - 320 MB - SPECJBB - 256 MB
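
As a rough illustration of how the SPECjbb column scales, here is a minimal sketch. It assumes, as the "Values = AVG * VM count" note suggests, that each VM row is an aggregate, so dividing by the VM count recovers the per-VM average. The function and variable names are made up for this example.

```python
# Aggregate SPECjbb throughput (BOPS) copied from the table above.
# Assumption: each VM row is the per-VM average multiplied by the VM count.
HOST_BOPS = 32403.5
AGGREGATE_BOPS = {1: 32057.0, 2: 24909.25, 4: 17815.75, 8: 10216.25}


def scaling_report(host, aggregate):
    """Return {vm_count: (per_vm_bops, fraction_of_host)} in VM-count order."""
    report = {}
    for vms, total in sorted(aggregate.items()):
        report[vms] = (total / vms, total / host)
    return report


for vms, (per_vm, frac) in scaling_report(HOST_BOPS, AGGREGATE_BOPS).items():
    print(f"{vms} VM: {per_vm:9.2f} BOPS per VM, {frac:6.1%} of host")
```

On these figures the 1 VM case stays at about 99% of the host, while the 8 VM aggregate drops to roughly 31.5%.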

Seeing how the SPECjbb numbers declined so bizarrely, I did some
scheduling tests and found this out...

Test 1: Examine Xen's scheduling to determine if context switching is
causing the overhead

                  Period     Slice       BOPs
Modified   8 VM   1 ms       125 us      6858
           8 VM   10 ms      1.25 ms     14287
           8 VM   100 ms     12.5 ms     18912
           8 VM   1 Sec      .125 Sec    20695
           8 VM   2 Sec      .25 Sec     21072
           8 VM   10 Sec     1.25 Sec    21797
           8 VM   100 Sec    12.5 Sec    11402
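
To put the slice lengths in perspective, here is a minimal sketch (an illustration, not part of the measurements). It derives each domain's CPU share, an upper bound on domain switches per second, and throughput relative to the host. It assumes the eight domains run their full slices back to back on a single CPU, which the post does not state. The 10 Sec and 100 Sec rows are left out because, as noted just below, they exceed the 4 second period limit.

```python
# (period_s, slice_s, aggregate_bops) for the 8 VM runs in the table above.
ROWS = [
    (0.001, 0.000125, 6858),
    (0.010, 0.00125, 14287),
    (0.100, 0.0125, 18912),
    (1.0, 0.125, 20695),
    (2.0, 0.25, 21072),
]
HOST_BOPS = 32403.5  # host SPECjbb result from the first table


def summarize(rows, host_bops):
    """Derive share, switch-rate bound and relative throughput per row."""
    out = []
    for period, slice_, bops in rows:
        out.append({
            "period_s": period,
            # Fraction of each period a single domain may run.
            "cpu_share": slice_ / period,
            # Assumption: one CPU, every slice fully used, so the CPU
            # changes domain once per slice at most.
            "max_switches_per_s": 1.0 / slice_,
            "fraction_of_host": bops / host_bops,
        })
    return out


for row in summarize(ROWS, HOST_BOPS):
    print(row)
```

Every row gives each domain the same 12.5% share; what changes is the switch rate, from up to 8000 per second with a 125 us slice down to 4 per second with a 0.25 s slice.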

I later learned that there was a period limit of 4 seconds, thus
invalidating the 10 and 100 second runs.  However, these results suggest
that Xen needs some load and scheduling balancing done.

I also did a memory test to determine if that could be the issue.  I
made a custom STREAM run lasting 2 minutes and got these numbers:

                  Copy       Scale      Add        Triad
Host              3266.4     3215.47    3012.28    3021.79
Modified   1 VM   3262.34    3220.34    3016.13    3025.28

So we can see memory is not the issue...
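
The gap between the two rows can be quantified with a short sketch. The figures are copied from the table above; the dictionary and function names are only illustrative.

```python
# STREAM results from the table above (units as reported in the post).
HOST = {"Copy": 3266.4, "Scale": 3215.47, "Add": 3012.28, "Triad": 3021.79}
ONE_VM = {"Copy": 3262.34, "Scale": 3220.34, "Add": 3016.13, "Triad": 3025.28}


def relative_difference(host, guest):
    """Return (guest - host) / host for each STREAM kernel."""
    return {name: (guest[name] - host[name]) / host[name] for name in host}


for name, diff in relative_difference(HOST, ONE_VM).items():
    print(f"{name:6s} {diff:+.3%}")
```

Every kernel differs by less than 0.2% in either direction, which is consistent with the conclusion that raw memory bandwidth is not the bottleneck.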

Now on to WebBench.  Comparing the WebBench results to the SPECjbb
results, we get something interesting: the numbers increase as we
increase the virtual machine count, and I would really like some idea
why.  My understanding is this: when using the shared memory network
drivers, there must be a local buffer; when that buffer fills up, the
remainder goes into a global buffer, and when that fills up it goes
into a disk buffer?  (These are all assumptions, please correct me.)
If that is the case, is there an easy way to increase the local buffer
to attempt to get better numbers?  I am also looking into doing some
tests that deal with multiple small transactions and 1 large
transaction.  I ran these all against a physical and an image-backed
disk.  Any suggestions are welcome.

(Note: I was running this on a 1 gigabit switch with only WebBench
running.)

If there are any questions, I would be glad to respond.

Thanks,
David


* RE: Benchmarking Xen (results and questions)
@ 2005-08-03 23:48 Ian Pratt
  0 siblings, 0 replies; 10+ messages in thread
From: Ian Pratt @ 2005-08-03 23:48 UTC (permalink / raw)
  To: David_Wolinsky, xen-devel


David,

Which xen version is this? I'm guessing unstable.
Is this with sedf or bvt? I'm guessing sedf since you're playing around
with periods.

It would be interesting to retry a couple of datapoints with sched=bvt
on the xen command line.

Also, I'd definitely recommend enabling HyperThreading and dedicating
one of the logical CPUs to dom0.

Also, are you sure the drop-off in performance isn't just caused because
of the reduced memory size when you have more VMs? It's probably better
to do such experiments with the same memory size throughout.

Best,
Ian
 


* RE: Benchmarking Xen (results and questions)
@ 2005-08-03 23:55 David_Wolinsky
  0 siblings, 0 replies; 10+ messages in thread
From: David_Wolinsky @ 2005-08-03 23:55 UTC (permalink / raw)
  To: m+Ian.Pratt, xen-devel

Sorry this must've gotten clipped out

Xen Version    SPECjbb     Change set 5814
               WebBench    Change set 5818

(both unstable)  and yes SEDF scheduling and currently we only have
results in 32-bit mode.

SPECjbb reacts only marginally to decreasing RAM, i.e., the difference
between a 256 MB machine and a 1500 MB machine is less than 5%.  One
way this could be tested further is by increasing the thread counts on
each of the VMs so that we can compare 8 SPECjbb processes on 8 VMs to
8 SPECjbb threads on 1 VM (etc.).

I had initially tried BVT; however, at that time it would crash the
system if I tried to change any of the settings.  I do not know if this
is still an issue.

That's a very interesting idea regarding the Hyperthreading, it hadn't
occurred to me.

Also, if you have time, could you elaborate on my WebBench results?

Thanks,
David


* RE: Benchmarking Xen (results and questions)
@ 2005-08-04  0:19 Ian Pratt
  0 siblings, 0 replies; 10+ messages in thread
From: Ian Pratt @ 2005-08-04  0:19 UTC (permalink / raw)
  To: David_Wolinsky, xen-devel


> I had initially tried BVT; however, at that time it would 
> crash the system if I tried to change any of the settings, I 
> do not know if this is currently the issue.

Any scheduling period more than 50ms really shouldn't result in any
significant context switch overhead.

I suspect what's going on in your tests is that dom0 (which is doing all
the IO) is being CPU starved. Putting it on a separate hyperthread
will certainly help confirm this diagnosis. [Hyperthreading is very
helpful to Xen, and by default it dedicates the first hyperthread of the
first CPU to dom0.]
 
> Also, if you have time, could you elaborate on my WebBench results?

It would be useful if you could explain a bit more about your WebBench
setup, e.g. are you testing the clients rather than the web server?

Best,
Ian


* Re: Benchmarking Xen (results and questions)
  2005-08-03 23:21 Benchmarking Xen (results and questions) David_Wolinsky
@ 2005-08-04  1:43 ` Andrew Theurer
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Theurer @ 2005-08-04  1:43 UTC (permalink / raw)
  To: David_Wolinsky; +Cc: xen-devel

David_Wolinsky@Dell.com wrote:

> Hi all,
>
> Here are some benchmarks that I've done using Xen.
>
> However, before I get started, let me explain some of configuration 
> details…
>
> Xen Version SPECjbb
> WebBench
> Linux Distribution Debian 3.1
> HT disabled
> Linux Kernel 2.6.12.2
> Host Patch CK3s
>
> Here are the initial benchmarks
>
> SPECJBB WebBench
> 1 Thread 1 Client 2 Clients 4 Clients 8 Clients
> BOPS TPS TPS TPS TPS
> Host 32403.5 213.45 416.86 814.62 1523.78
> 1 VM 32057 205.4 380.91 569.24 733.8
> 2 VM 24909.25 NA 399.29 695.1 896.04
> 4 VM 17815.75 NA NA 742.78 950.63
> 8 VM 10216.25 NA NA NA 1002.81
>
> (and some more notes…. BOPS - business operations per second, TPS - 
> transactions per second…
> SPECjbb tests CPU and Memory
> WebBench (the way we configured it) tests Network I/O and Disk I/O
>
> Values = AVG * VM count
> Domain configurations
> 1 VM - 1660 MB - SPECJBB 1500MB
> 2 VM - 1280 MB - SPECJBB - 1024MB
> 4 VM - 640 MB - SPECJBB - 512 MB
> 8 VM - 320 MB - SPECJBB - 256 MB
>
> Seeing how the SPECjbb numbers declined so bizarrely, I did some 
> scheduling tests and found this out…
>
> Test1: Examine Xen's scheduling to determine if context switching is 
> causing the overhead
> Period Slice BOPs
> Modified 8 VM 1 ms 125 us 6858
> 8 VM 10 ms 1.25 ms 14287
> 8 VM 100 ms 12.5 ms 18912
> 8 VM 1 Sec .125 Sec 20695
> 8 VM 2 Sec .25 Sec 21072
> 8 VM 10 Sec 1.25 Sec 21797
> 8 VM 100 Sec 12.5 Sec 11402
>
Did you run each JBB test config several times to ensure consistent 
results? What JVM is this?

Would it be possible to format these more appropriately for email? I am 
having a little trouble reading this :)

Thanks,

-Andrew


* RE: Benchmarking Xen (results and questions)
@ 2005-08-04 14:46 David_Wolinsky
  2005-08-04 15:11 ` Andrew Theurer
  0 siblings, 1 reply; 10+ messages in thread
From: David_Wolinsky @ 2005-08-04 14:46 UTC (permalink / raw)
  To: habanero; +Cc: xen-devel

That's funny, they were sent out formatted nicely.  The tests were run
multiple times for consistency, using BEA JRockit 1.5.

        SPECJBB     WebBench
        1 Thread    1 Client    2 Clients   4 Clients   8 Clients
        BOPS        TPS         TPS         TPS         TPS
Host    32403.5     213.45      416.86      814.62      1523.78
1 VM    32057       205.4       380.91      569.24      733.8
2 VM    24909.25    NA          399.29      695.1       896.04
4 VM    17815.75    NA          NA          742.78      950.63
8 VM    10216.25    NA          NA          NA          1002.81

        Period     Slice       BOPs
8 VM    1 ms       125 us      6858
8 VM    10 ms      1.25 ms     14287
8 VM    100 ms     12.5 ms     18912
8 VM    1 Sec      .125 Sec    20695
8 VM    2 Sec      .25 Sec     21072
8 VM    10 Sec     1.25 Sec    21797
8 VM    100 Sec    12.5 Sec    11402

Hope it works this time.  If not, I'll submit as an attachment.

David


* Re: Benchmarking Xen (results and questions)
  2005-08-04 14:46 David_Wolinsky
@ 2005-08-04 15:11 ` Andrew Theurer
  0 siblings, 0 replies; 10+ messages in thread
From: Andrew Theurer @ 2005-08-04 15:11 UTC (permalink / raw)
  To: David_Wolinsky; +Cc: xen-devel

This is much better.  I think my email client just didn't format your
first (html?) message.  For JBB, I suspect the degradation is mostly cache
thrashing, and the increased timeslice = better cache warmth.  Perhaps
there is a lot of overhead in the domain context switch as well.  What is
the CPU cache size?

I will soon be doing tests like these on a dozen or so benchmarks, so 
hopefully we can compare results.  Have you looked at using xenoprofile 
while running your tests?

As for the WebBench numbers, do you have cpu utilization stats for Host 
and each of the domains?  Is it possible the Host and 1 VM are not using 
much cpu at all, and adding domains/clients just increases cpu util & TPS?

-Andrew


* RE: Benchmarking Xen (results and questions)
@ 2005-08-04 22:44 David_Wolinsky
  0 siblings, 0 replies; 10+ messages in thread
From: David_Wolinsky @ 2005-08-04 22:44 UTC (permalink / raw)
  To: habanero; +Cc: xen-devel

I don't recall, the second time I sent it, it said plain text... I don't
know what it was.

That's an interesting idea regarding JBB; I'll check with more
experienced JBB guys and see what their opinion is.  I do know, though,
that JBB strives not to be too cacheable.


* RE: Benchmarking Xen (results and questions)
@ 2005-08-04 23:55 Ian Pratt
  2005-08-05 14:44 ` David Hopwood
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Pratt @ 2005-08-04 23:55 UTC (permalink / raw)
  To: Andrew Theurer, David_Wolinsky; +Cc: xen-devel

> This is much better.  I think my email client just didn't 
> format your first (html?) message.  For JBB, I suspect the 
> degrade is mostly cache thrashing, and the increased 
> timeslice = better cache warmth.  Perhaps there a lot of 
> overhead in the domain context switch as well.  What is the 
> cpu cache size?

Slices over 50ms won't yield much benefit -- it doesn't take a great
deal of time to warm a typical 1MB cache. The actual explicit cost of
performing a context switch is measured in terms of microseconds.
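
A back-of-envelope sketch of this point, under stated assumptions: the cache is 1 MB, and it is refilled by streaming at roughly the Triad rate reported earlier in the thread (3021.79, assumed here to be MB/s, which the post does not state). Refilling through scattered misses would be slower, so this is a lower bound on the warm-up cost and not a measurement. The function name is hypothetical.

```python
def warmup_fraction(cache_mb, bandwidth_mb_per_s, slice_s):
    """Fraction of a time slice spent refilling the cache once.

    Assumes a purely sequential refill at the given bandwidth.
    """
    warm_s = cache_mb / bandwidth_mb_per_s
    return warm_s / slice_s


CACHE_MB = 1.0          # "typical 1MB cache" from the message above
BANDWIDTH = 3021.79     # Triad figure from the thread; MB/s is an assumption

for slice_s in (0.000125, 0.0125, 0.050, 0.25):
    frac = warmup_fraction(CACHE_MB, BANDWIDTH, slice_s)
    print(f"slice {slice_s * 1000:8.3f} ms: warm-up is {frac:8.2%} of the slice")
```

Under these assumptions a refill takes about 0.33 ms, which is under 1% of a 50 ms slice but more than twice the length of a 125 us slice.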

James Bulpin's PhD thesis provides a lot of hard data on stuff like this
for modern x86 CPUs.


Ian


* Re: Benchmarking Xen (results and questions)
  2005-08-04 23:55 Ian Pratt
@ 2005-08-05 14:44 ` David Hopwood
  0 siblings, 0 replies; 10+ messages in thread
From: David Hopwood @ 2005-08-05 14:44 UTC (permalink / raw)
  To: xen-devel

Ian Pratt wrote:
> James Bulpin's PhD thesis provides a lot of hard data on stuff like this
> for modern x86 CPUs.

<http://www.cl.cam.ac.uk/TechReports/UCAM-CL-TR-619.pdf>

-- 
David Hopwood <david.nospam.hopwood@blueyonder.co.uk>
