* [Qemu-devel] qemu vm big network latency when met heavy io
From: 叶绍琛 (Alan Ye) @ 2014-01-06  6:55 UTC
To: qemu-discuss, qemu-devel

hi, all:

There is a problem when I use Ceph RBD as QEMU storage. I launch 4 virtual
machines and start a 5G random write test in them at the same time. Under
such heavy I/O the network to the virtual machines is almost unusable; the
network latency becomes extremely high.

I tested another configuration: when I use the 'virsh attach-device' command
to attach an RBD image that is mapped on the host machine (the one running
the virtual machines), the problem does not appear. So I think this must be
a problem in qemu-rbd.

Here is my testing environment:

# virsh version
Compiled against library: libvirt 1.2.0
Using library: libvirt 1.2.0
Using API: QEMU 1.2.0
Running hypervisor: QEMU 1.7.0

In the VM's XML, I define the RBD disk like this:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source protocol='rbd' name='qemu/rbd-vm4'>
    <host name='10.120.111.111' port='6789'/>
  </source>
  <auth username='libvirt'>
    <secret type='ceph' uuid='38b66185-4117-47a6-90bd-64111c3fc5d2'/>
  </auth>
  <target dev='vdb' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
</disk>

Testing tool: fio
I/O depth:    32
I/O engine:   libaio
Direct I/O:   enabled

Has anyone else met such a problem?

regards
Alan Ye
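(For reference, a fio job file matching the parameters described above would
look roughly like the sketch below. The target filename, block size, and
per-VM size are assumptions; the original job file was not posted in the
thread.)

; sketch of a fio job matching the reported parameters
; (filename, block size and size are assumptions)
[rbd-randwrite]
ioengine=libaio
iodepth=32
direct=1
rw=randwrite
bs=4k
size=5G
filename=/dev/vdb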
* Re: [Qemu-devel] qemu vm big network latency when met heavy io
From: Stefan Hajnoczi @ 2014-01-08  4:44 UTC
To: 叶绍琛 (Alan Ye); Cc: josh.durgin, qemu-devel, qemu-discuss

On Mon, Jan 06, 2014 at 02:55:54PM +0800, 叶绍琛 wrote:
> There is a problem when I use Ceph RBD as QEMU storage. I launch 4 virtual
> machines and start a 5G random write test in them at the same time. Under
> such heavy I/O the network to the virtual machines is almost unusable; the
> network latency becomes extremely high.
>
> I tested another configuration: when I use the 'virsh attach-device'
> command to attach an RBD image that is mapped on the host machine (the one
> running the virtual machines), the problem does not appear.

Does this mean you are comparing QEMU's rbd block driver against the Linux
kernel rbd driver?

> So I think this must be a problem in qemu-rbd.

Please try running jitterd to confirm that the guest vCPU is getting
sufficient execution time:

http://codemonkey.ws/cgit/jitterd.git/tree/jitterd.c

This test will confirm whether network I/O really is performing poorly, or
whether the guest vCPU is simply being starved.
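(jitterd measures how late a periodic timer fires inside the guest. If
building it is inconvenient, a crude shell approximation of the same idea is
sketched below; it is not a substitute for the real tool, and the 10 ms
period and 20 ms threshold are arbitrary choices.)

#!/bin/sh
# Rough scheduling-jitter probe: sleep ~10 ms in a loop and report wakeups
# that arrive much later than expected, a sign of a starved guest vCPU.
prev=$(date +%s%N)
while true; do
    sleep 0.01
    now=$(date +%s%N)
    delta_ms=$(( (now - prev) / 1000000 ))
    [ "$delta_ms" -gt 20 ] && echo "late wakeup: ${delta_ms} ms"
    prev=$now
done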
* Re: [Qemu-devel] [Qemu-discuss] qemu vm big network latency when met heavy io
From: Stefan Hajnoczi @ 2014-01-09  2:47 UTC
To: Alan Ye; Cc: qemu-devel

On Thu, Jan 9, 2014 at 10:15 AM, Alan Ye <shaochenye@gmail.com> wrote:
> Yes.

Okay, thanks for clarifying. Please try jitterd to confirm that network
latency is the problem rather than a starved guest vCPU.

Please keep qemu-devel@nongnu.org CCed so the discussion stays on the
mailing list.

Thanks,
Stefan
* Re: [Qemu-devel] Re: qemu vm big network latency when met heavy io
From: Josh Durgin @ 2014-01-14  6:24 UTC
To: Stefan Hajnoczi, 叶绍琛 (Alan Ye); Cc: qemu-devel

On 01/12/2014 06:39 PM, Stefan Hajnoczi wrote:
> On Fri, Jan 10, 2014 at 11:50 AM, 叶绍琛 <yeshaochen@foxmail.com> wrote:
>
> Please use Reply-all to keep the CC list intact. That way the conversation
> stays on the mailing list and others can participate.
>
>>> Is the sum of guests' RAM less than the total physical RAM on the host?
>> The host runs 3 VMs; each VM uses one vCPU core and 1 GB of RAM.
>> # free -m
>>              total       used       free     shared    buffers     cached
>> Mem:         32242       4808      27434          0        278       2058
>> -/+ buffers/cache:       2471      29771
>> Swap:         4095          0       4095
>>
>> The host has 8 cores.
>> # cat /proc/cpuinfo | grep processor
>> processor : 0
>> processor : 1
>> processor : 2
>> processor : 3
>> processor : 4
>> processor : 5
>> processor : 6
>> processor : 7
>>
>> So the answer to both questions is 'yes'. While the random write test is
>> running, the host uses no swap.
>
> Great. That means the host is not overcommitted.
>
> It's likely that the problem is a bug in QEMU's rbd driver or librados.
>
> Josh: perhaps something you're interested in looking into?

Yes, thanks for bringing it to my attention. It does sound like a bug in
QEMU's rbd driver or Ceph's userspace libraries.

Could you share what version of librbd you're using, and your
/etc/ceph/ceph.conf?

Thanks,
Josh
* Re: [Qemu-devel] Re: qemu vm big network latency when met heavy io
From: Josh Durgin @ 2014-01-14  7:58 UTC
To: 叶绍琛 (Alan Ye); Cc: qemu-devel

On 01/13/2014 10:39 PM, 叶绍琛 wrote:
> Hi Josh
>
> Thanks for your reply.
>
> librbd version: 0.67.5-1
>
> /etc/ceph/ceph.conf config file (IPs and hostnames have been hidden):
>
> [global]
> ;open auth.
> auth cluster required = cephx
> auth service required = cephx
> auth client required = cephx
> ;global pid & log setting.
> admin socket = /home/ceph/var/run/$cluster-$name.asok
>
> [mon]
> keyring = /home/ceph/var/lib/$type/$cluster-$id/keyring
> mon data = /home/ceph/var/lib/$type/$cluster-$id
> mon cluster log file = /home/ceph/log/$cluster.log
> [mon.a]
> host = cld-xx
> mon addr = x.x.x.x:6789
> user = ceph
> [mon.b]
> host = cld-xx
> mon addr = x.x.x.x:6789
> user = ceph
> [mon.c]
> host = cld-xx
> mon addr = x.x.x.x:6789
> user = ceph
> [mon.d]
> host = cld-xx
> mon addr = x.x.x.x:6789
> user = ceph
> [mon.e]
> host = cld-xx
> mon addr = x.x.x.x:6789
> user = ceph
>
> [osd]
> keyring = /home/ceph/var/lib/$type/$cluster-$id/keyring
> osd data = /home/ceph/var/lib/$type/$cluster-$id
> osd journal = /home/ceph/var/lib/$type/$cluster-$id/journal
> osd journal size = 1000
> osd mkfs type = xfs
> osd mount options xfs = rw,noatime,inode64
> [osd.0]
> host = cld-xx
> addr = x.x.x.x
> user = ceph
> devs = /dev/sdb1
> [osd.1]
> host = cld-xx
> addr = x.x.x.x
> user = ceph
> devs = /dev/sda1
> [osd.2]
> host = cld-xx
> addr = x.x.x.x
> user = ceph
> devs = /dev/sdb1
> [osd.3]
> host = cld-xx
> addr = x.x.x.x
> user = ceph
> devs = /dev/sda1
> [osd.4]
> host = cld-xx
> addr = x.x.x.x
> user = ceph
> devs = /dev/sdb1
> [osd.5]
> host = cld-xx
> addr = x.x.x.x
> user = ceph
> devs = /dev/sda1
>
> [client.libvirt]
> rbd cache = true
> auth support = cephx none
> mon host = x.x.x.x:6789;x.x.x.x:6789;x.x.x.x:6789;x.x.x.x:6789;x.x.x.x:6789
>
> Regards,
> Alan Ye

Turning on rbd caching in ceph.conf while telling QEMU it is not writeback
in libvirt's XML, like:

  <driver name='qemu' type='raw' cache='none'/>

is not safe, since QEMU won't propagate flush requests from the guest to
librbd's cache. Since QEMU 1.3, setting the QEMU cache mode also sets the
librbd cache mode appropriately, so you don't need the setting in your
ceph.conf file at all.

Can you verify that your QEMU binary was built against librbd v0.60 or
later?

It would also help to see whether you can reproduce the problem when QEMU
is sending flushes (cache=writeback), and with the cache disabled
(cache=none, no 'rbd cache' setting in ceph.conf).

The next step would be gathering a log from librbd (debug rbd = 20,
debug objectcacher = 20, debug objecter = 20, debug ms = 1) when the
problem is occurring, and correlating it with a timestamped log of the
network latency.

Thanks,
Josh
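(Putting this together: with QEMU 1.3 or later the cache behaviour should be
controlled only from the libvirt XML, so a test with the librbd cache
enabled would change the driver line of the disk definition posted earlier
to the following; this is a sketch, not a configuration taken from the
thread:)

  <driver name='qemu' type='raw' cache='writeback'/>

with no 'rbd cache' line in ceph.conf at all. For the timestamped
network-latency log, running something like 'ping -D <guest IP> | tee
ping-latency.log' from another host during the fio run is enough; the -D
option of iputils ping prefixes each reply with a timestamp.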
* Re: [Qemu-devel] Re: qemu vm big network latency when met heavy io
From: Josh Durgin @ 2014-01-16  2:25 UTC
To: 叶绍琛 (Alan Ye); Cc: qemu-devel

On 01/15/2014 01:40 AM, 叶绍琛 wrote:
> Hi Josh
>
> Here are the results:
>
> 1. 'none' cache mode in the XML, 'rbd cache = true' removed from
>    ceph.conf: the network latency issue does not show up.
> 2. 'writethrough' cache mode in the XML, 'rbd cache = true' removed from
>    ceph.conf: the network latency issue does not show up.
> 3. 'writeback' cache mode in the XML, 'rbd cache = true' set in
>    ceph.conf: the network latency issue shows up.
> 4. 'writeback' cache mode in the XML, 'rbd cache = true' removed from
>    ceph.conf: the network latency issue shows up.
>
> According to the above, there must be something wrong in librbd's write
> cache. This must be a bug in librbd.

It could still be the rbd driver in QEMU, but it's certainly only happening
when writeback caching is used.

Could you verify that librbd's asynchronous flush is being used, by making
sure rbd_aio_flush appears in the output of
'strings /path/to/qemu/binary | grep rbd_aio'?

If QEMU isn't using rbd_aio_flush but the synchronous version, rbd_flush,
that's the cause of the problem, and you just need to recompile QEMU.

Otherwise, it's a new bug, and we can try to figure out what is taking up
time by looking at a log from librbd. Can you add this to the
[client.libvirt] section of ceph.conf and attach the logs generated to a
new issue on http://tracker.ceph.com:

debug ms = 1
debug objectcacher = 20
debug rbd = 20
log file = /path/writeable/by/user/running/qemu.$pid.log

Thanks,
Josh
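(Assembled, the resulting section would look something like the sketch
below; the log path is only an example and must be writable by the user
QEMU runs as, and 'rbd cache = true' is unnecessary when the libvirt XML
already uses cache='writeback'.)

[client.libvirt]
debug ms = 1
debug objectcacher = 20
debug rbd = 20
log file = /var/log/ceph/qemu.$pid.log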
* Re: [Qemu-devel] Re: qemu vm big network latency when met heavy io
From: Josh Durgin @ 2014-01-17  6:57 UTC
To: 叶绍琛 (Alan Ye); Cc: qemu-devel

On 01/15/2014 10:12 PM, 叶绍琛 wrote:
> Hi Josh
>
> # strings /usr/bin/qemu-system-x86_64 | grep rbd_aio
> rbd_aio_write
> rbd_aio_flush
> rbd_aio_read
> rbd_aio_create_completion
> rbd_aio_release
> rbd_aio_discard
> rbd_aio_get_return_value
>
> So librbd's asynchronous flush is being used.
>
> I applied the log settings and collected the librbd log together with a
> timestamped ping log; all logs are attached.

Excellent, thanks.

> It seems that I don't have permission to create a new issue on
> tracker.ceph.com; when I click 'Register' it shows an 'internal error'
> page.

Seems to be working for me. In any case, I created
http://tracker.ceph.com/issues/7165 to track the problem.

Looking through the logs, it may have already been solved in a couple of
commits after 0.67.5. Namely, the cache was starting the flush of too much
dirty data at once while holding a lock, preventing other I/O from the
guest from starting and thus blocking the QEMU thread handling the I/O.

I added the relevant commits to the wip-objectcacher-flusher-dumpling
branch in ceph.git. Could you install librbd from that branch and see if it
fixes the problem? Instructions for getting these packages are here:

http://ceph.com/docs/master/install/get-packages/#add-ceph-development

Thanks,
Josh
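(After installing librbd from the development branch, it is worth checking
which library a VM has actually loaded before repeating the fio test. These
are generic commands, not taken from the thread; if several VMs are running,
pidof returns more than one PID, so pick one.)

# Debian/Ubuntu package check (RPM-based systems: rpm -q librbd1 librados2)
dpkg -l librbd1 librados2
# Confirm which librbd a running QEMU process has mapped
pid=$(pidof qemu-system-x86_64 | awk '{print $1}')
grep librbd "/proc/$pid/maps"

Note that a running VM keeps the old library mapped until it is restarted
(or migrated), so the guests have to be restarted to pick up the fixed
librbd.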