* [Qemu-devel] QEMU throughput is down with SMP
From: Venkateswararao Jujjuri (JV) @ 2010-09-30 0:50 UTC
To: Qemu-development List
Code: Mainline QEMU (git://git.qemu.org/qemu.git)
Machine: LS21 blade.
Disk: Local disk through VirtIO.
Did not select any cache option. Defaulting to writethrough.
Command tested:
3 parallel instances of : dd if=/dev/zero of=/pmnt/my_pw bs=4k count=100000
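For reference, the guests were launched roughly as below (a sketch; the image
path, memory size, and network options are placeholders, not the exact command
line used):
  # smp=1 run; the smp=4 run only changes -smp 1 to -smp 4.
  # No cache= suboption on -drive, so it defaults to writethrough.
  qemu-system-x86_64 -m 2048 -smp 1 \
      -drive file=/images/guest.img,if=virtio \
      -net nic,model=virtio -net user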
QEMU with smp=1
19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
QEMU with smp=4
15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
Is this expected?
Thanks,
JV
=== Details ===
smp = 1
time dd if=/dev/zero of=/pmnt/my_pw bs=4k count=100000 & time dd if=/dev/zero
of=/pmnt/my_pw1 bs=4k count=100000 & time dd if=/dev/zero of=/pmnt/my_pw2 bs=4k
count=100000 &
[root@localhost ~]# 100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 21.2747 s, 19.3 MB/s
real 0m21.377s
user 0m0.040s
sys 0m1.655s
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 21.2799 s, 19.2 MB/s
real 0m21.374s
user 0m0.046s
sys 0m1.660s
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 22.0735 s, 18.6 MB/s
real 0m22.153s
user 0m0.043s
sys 0m1.642s
smp = 4
[root@localhost ~]# time dd if=/dev/zero of=/pmnt/my_pw bs=4k count=100000 &
time dd if=/dev/zero of=/pmnt/my_pw1 bs=4k count=100000 & time dd if=/dev/zero
of=/pmnt/my_pw2 bs=4k count=100000 &
[root@localhost ~]# 100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 26.8055 s, 15.3 MB/s
real 0m26.869s
user 0m0.079s
sys 0m3.333s
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 28.9583 s, 14.1 MB/s
real 0m29.018s
user 0m0.053s
sys 0m4.313s
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 30.1739 s, 13.6 MB/s
real 0m30.238s
user 0m0.065s
sys 0m4.124s
* Re: [Qemu-devel] QEMU throughput is down with SMP
From: Stefan Hajnoczi @ 2010-09-30 9:13 UTC
To: Venkateswararao Jujjuri (JV); +Cc: Qemu-development List
On Thu, Sep 30, 2010 at 1:50 AM, Venkateswararao Jujjuri (JV)
<jvrao@linux.vnet.ibm.com> wrote:
> Code: Mainline QEMU (git://git.qemu.org/qemu.git)
> Machine: LS21 blade.
> Disk: Local disk through VirtIO.
> Did not select any cache option. Defaulting to writethrough.
>
> Command tested:
> 3 parallel instances of : dd if=/dev/zero of=/pmnt/my_pw bs=4k count=100000
>
> QEMU with smp=1
> 19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
>
> QEMU with smp=4
> 15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
>
> Is this expected?
Did you configure with --enable-io-thread?
Also, try using dd oflag=direct to eliminate effects introduced by the
guest page cache and really hit the disk.
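I.e. something along these lines (paths taken from your test; adjust as needed):
  # rebuild QEMU with the I/O thread
  ./configure --enable-io-thread && make
  # inside the guest: bypass the guest page cache
  dd if=/dev/zero of=/pmnt/my_pw bs=4k count=100000 oflag=direct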
Stefan
* Re: [Qemu-devel] QEMU throughput is down with SMP
From: Venkateswararao Jujjuri (JV) @ 2010-09-30 19:19 UTC
To: Stefan Hajnoczi; +Cc: Qemu-development List
On 9/30/2010 2:13 AM, Stefan Hajnoczi wrote:
> On Thu, Sep 30, 2010 at 1:50 AM, Venkateswararao Jujjuri (JV)
> <jvrao@linux.vnet.ibm.com> wrote:
>> Code: Mainline QEMU (git://git.qemu.org/qemu.git)
>> Machine: LS21 blade.
>> Disk: Local disk through VirtIO.
>> Did not select any cache option. Defaulting to writethrough.
>>
>> Command tested:
>> 3 parallel instances of : dd if=/dev/zero of=/pmnt/my_pw bs=4k count=100000
>>
>> QEMU with smp=1
>> 19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
>>
>> QEMU with smp=4
>> 15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
>>
>> Is this expected?
>
> Did you configure with --enable-io-thread?
Yes I did.
>
> Also, try using dd oflag=direct to eliminate effects introduced by the
> guest page cache and really hit the disk.
With oflag=direct I see no difference; the throughput is so low that I
would not expect to see any difference.
It is 225 KB/s for each thread, with either smp=1 or smp=4.
- JV
>
> Stefan
>
* Re: [Qemu-devel] QEMU throughput is down with SMP
From: Stefan Hajnoczi @ 2010-10-01 8:41 UTC
To: Venkateswararao Jujjuri (JV); +Cc: Qemu-development List
On Thu, Sep 30, 2010 at 8:19 PM, Venkateswararao Jujjuri (JV)
<jvrao@linux.vnet.ibm.com> wrote:
> On 9/30/2010 2:13 AM, Stefan Hajnoczi wrote:
>>
>> On Thu, Sep 30, 2010 at 1:50 AM, Venkateswararao Jujjuri (JV)
>> <jvrao@linux.vnet.ibm.com> wrote:
>>>
>>> Code: Mainline QEMU (git://git.qemu.org/qemu.git)
>>> Machine: LS21 blade.
>>> Disk: Local disk through VirtIO.
>>> Did not select any cache option. Defaulting to writethrough.
>>>
>>> Command tested:
>>> 3 parallel instances of : dd if=/dev/zero of=/pmnt/my_pw bs=4k
>>> count=100000
>>>
>>> QEMU with smp=1
>>> 19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
>>>
>>> QEMU with smp=4
>>> 15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
>>>
>>> Is this expected?
>>
>> Did you configure with --enable-io-thread?
>
> Yes I did.
>>
>> Also, try using dd oflag=direct to eliminate effects introduced by the
>> guest page cache and really hit the disk.
>
> With oflag=direct I see no difference; the throughput is so low that I
> would not expect to see any difference.
> It is 225 KB/s for each thread, with either smp=1 or smp=4.
If I understand correctly you are getting:
QEMU oflag=direct with smp=1
225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
QEMU oflag=direct with smp=4
225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
This suggests the degradation for smp=4 is guest kernel page cache or
buffered I/O related. Perhaps lockholder preemption?
Stefan
* Re: [Qemu-devel] QEMU throughput is down with SMP
From: Ryan Harper @ 2010-10-01 13:38 UTC
To: Stefan Hajnoczi; +Cc: Venkateswararao Jujjuri (JV), Qemu-development List
* Stefan Hajnoczi <stefanha@gmail.com> [2010-10-01 03:48]:
> On Thu, Sep 30, 2010 at 8:19 PM, Venkateswararao Jujjuri (JV)
> <jvrao@linux.vnet.ibm.com> wrote:
> > On 9/30/2010 2:13 AM, Stefan Hajnoczi wrote:
> >>
> >> On Thu, Sep 30, 2010 at 1:50 AM, Venkateswararao Jujjuri (JV)
> >> <jvrao@linux.vnet.ibm.com> wrote:
> >>>
> >>> Code: Mainline QEMU (git://git.qemu.org/qemu.git)
> >>> Machine: LS21 blade.
> >>> Disk: Local disk through VirtIO.
> >>> Did not select any cache option. Defaulting to writethrough.
> >>>
> >>> Command tested:
> >>> 3 parallel instances of : dd if=/dev/zero of=/pmnt/my_pw bs=4k
> >>> count=100000
> >>>
> >>> QEMU with smp=1
> >>> 19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
> >>>
> >>> QEMU with smp=4
> >>> 15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
> >>>
> >>> Is this expected?
> >>
> >> Did you configure with --enable-io-thread?
> >
> > Yes I did.
> >>
> >> Also, try using dd oflag=direct to eliminate effects introduced by the
> >> guest page cache and really hit the disk.
> >
> > With oflag=direct I see no difference; the throughput is so low that I
> > would not expect to see any difference.
> > It is 225 KB/s for each thread, with either smp=1 or smp=4.
>
> If I understand correctly you are getting:
>
> QEMU oflag=direct with smp=1
> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>
> QEMU oflag=direct with smp=4
> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>
> This suggests the degradation for smp=4 is guest kernel page cache or
> buffered I/O related. Perhaps lockholder preemption?
or just a single spindle maxed out because the blade hard drive doesn't
have its write cache enabled (it's disabled by default).
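225 KB/s at 4k per request is only ~56 synchronous writes per second, which is
roughly what a single spindle delivers once every write has to reach the
platter. To check on the host (a sketch assuming an ATA disk at /dev/sda; SCSI
disks would use sdparm instead):
  # query the current write-cache setting
  hdparm -W /dev/sda
  # enable the drive write cache (at the risk of losing data on power failure)
  hdparm -W1 /dev/sda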
--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
ryanh@us.ibm.com
* Re: [Qemu-devel] QEMU throughput is down with SMP
From: Venkateswararao Jujjuri (JV) @ 2010-10-01 15:04 UTC
To: Ryan Harper; +Cc: Stefan Hajnoczi, Qemu-development List
On 10/1/2010 6:38 AM, Ryan Harper wrote:
> * Stefan Hajnoczi<stefanha@gmail.com> [2010-10-01 03:48]:
>> On Thu, Sep 30, 2010 at 8:19 PM, Venkateswararao Jujjuri (JV)
>> <jvrao@linux.vnet.ibm.com> wrote:
>>> On 9/30/2010 2:13 AM, Stefan Hajnoczi wrote:
>>>>
>>>> On Thu, Sep 30, 2010 at 1:50 AM, Venkateswararao Jujjuri (JV)
>>>> <jvrao@linux.vnet.ibm.com> wrote:
>>>>>
>>>>> Code: Mainline QEMU (git://git.qemu.org/qemu.git)
>>>>> Machine: LS21 blade.
>>>>> Disk: Local disk through VirtIO.
>>>>> Did not select any cache option. Defaulting to writethrough.
>>>>>
>>>>> Command tested:
>>>>> 3 parallel instances of : dd if=/dev/zero of=/pmnt/my_pw bs=4k
>>>>> count=100000
>>>>>
>>>>> QEMU with smp=1
>>>>> 19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
>>>>>
>>>>> QEMU with smp=4
>>>>> 15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
>>>>>
>>>>> Is this expected?
>>>>
>>>> Did you configure with --enable-io-thread?
>>>
>>> Yes I did.
>>>>
>>>> Also, try using dd oflag=direct to eliminate effects introduced by the
>>>> guest page cache and really hit the disk.
>>>
>>> With oflag=direct I see no difference; the throughput is so low that I
>>> would not expect to see any difference.
>>> It is 225 KB/s for each thread, with either smp=1 or smp=4.
>>
>> If I understand correctly you are getting:
>>
>> QEMU oflag=direct with smp=1
>> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>>
>> QEMU oflag=direct with smp=4
>> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>>
>> This suggests the degradation for smp=4 is guest kernel page cache or
>> buffered I/O related. Perhaps lockholder preemption?
>
> or just a single spindle maxed out because the blade hard drive doesn't
> have its write cache enabled (it's disabled by default).
Yes, I am sure we are hitting the limit of the blade's local disk.
The question is why smp=4 degrades performance in the cached mode.
I am running the latest upstream kernel on the guest (2.6.36-rc5) and using
block I/O.
Are there any known issues there that could explain the performance degradation?
I am trying to find a test that shows QEMU SMP performance improves/scales.
I would like to use it to validate our new VirtFS threading code (yet to hit
the mailing list).
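One thing I may try is taking the physical disk out of the picture, e.g. writing
to a tmpfs mount inside the guest so the spindle is not the limiting factor (a
sketch, assuming fio is available in the guest and the file sizes fit in guest
RAM):
  mkdir -p /mnt/tmp
  mount -t tmpfs tmpfs /mnt/tmp
  fio --name=smp-scale --directory=/mnt/tmp --rw=write --bs=4k \
      --size=256m --numjobs=4 --group_reporting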
Thanks,
JV
* Re: [Qemu-devel] QEMU throughput is down with SMP
From: Stefan Hajnoczi @ 2010-10-01 15:09 UTC
To: Venkateswararao Jujjuri (JV); +Cc: Ryan Harper, Qemu-development List
On Fri, Oct 1, 2010 at 4:04 PM, Venkateswararao Jujjuri (JV)
<jvrao@linux.vnet.ibm.com> wrote:
> On 10/1/2010 6:38 AM, Ryan Harper wrote:
>>
>> * Stefan Hajnoczi<stefanha@gmail.com> [2010-10-01 03:48]:
>>>
>>> On Thu, Sep 30, 2010 at 8:19 PM, Venkateswararao Jujjuri (JV)
>>> <jvrao@linux.vnet.ibm.com> wrote:
>>>>
>>>> On 9/30/2010 2:13 AM, Stefan Hajnoczi wrote:
>>>>>
>>>>> On Thu, Sep 30, 2010 at 1:50 AM, Venkateswararao Jujjuri (JV)
>>>>> <jvrao@linux.vnet.ibm.com> wrote:
>>>>>>
>>>>>> Code: Mainline QEMU (git://git.qemu.org/qemu.git)
>>>>>> Machine: LS21 blade.
>>>>>> Disk: Local disk through VirtIO.
>>>>>> Did not select any cache option. Defaulting to writethrough.
>>>>>>
>>>>>> Command tested:
>>>>>> 3 parallel instances of : dd if=/dev/zero of=/pmnt/my_pw bs=4k
>>>>>> count=100000
>>>>>>
>>>>>> QEMU with smp=1
>>>>>> 19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
>>>>>>
>>>>>> QEMU with smp=4
>>>>>> 15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
>>>>>>
>>>>>> Is this expected?
>>>>>
>>>>> Did you configure with --enable-io-thread?
>>>>
>>>> Yes I did.
>>>>>
>>>>> Also, try using dd oflag=direct to eliminate effects introduced by the
>>>>> guest page cache and really hit the disk.
>>>>
>>>> With oflag=direct I see no difference; the throughput is so low that I
>>>> would not expect to see any difference.
>>>> It is 225 KB/s for each thread, with either smp=1 or smp=4.
>>>
>>> If I understand correctly you are getting:
>>>
>>> QEMU oflag=direct with smp=1
>>> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>>>
>>> QEMU oflag=direct with smp=4
>>> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>>>
>>> This suggests the degradation for smp=4 is guest kernel page cache or
>>> buffered I/O related. Perhaps lockholder preemption?
>>
>> or just a single spindle maxed out because the blade hard drive doesn't
>> have its write cache enabled (it's disabled by default).
>
> Yes, I am sure we are hitting the limit of the blade's local disk.
> The question is why smp=4 degrades performance in the cached mode.
>
> I am running the latest upstream kernel on the guest (2.6.36-rc5) and using
> block I/O.
> Are there any known issues there that could explain the performance
> degradation?
I suggested that lockholder preemption might be the issue. If you
check /proc/lock_stat in a guest debug kernel after seeing poor
performance, do the lock statistics look suspicious (very long hold
times)?
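For example, with CONFIG_LOCK_STAT enabled in the guest kernel:
  # clear the counters and enable collection
  echo 0 > /proc/lock_stat
  echo 1 > /proc/sys/kernel/lock_stat
  # ... run the parallel dd workload ...
  echo 0 > /proc/sys/kernel/lock_stat
  # inspect; unusually large holdtime/waittime columns point at the culprit
  less /proc/lock_stat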
Stefan