* Ceph and KVM live migration
@ 2012-07-01 0:42 Vladimir Bashkirtsev
2012-07-01 1:17 ` Josh Durgin
0 siblings, 1 reply; 8+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-01 0:42 UTC (permalink / raw)
To: ceph-devel
Dear all,
I am currently testing KVM guests running on Ceph, in particular the recent
cache feature. Performance is of course vastly improved, but I still see
occasional KVM hold-ups; I am not sure whether Ceph or KVM is to blame, but
I will deal with that later. Right now I have a question that I could not
answer myself: if I live-migrate a KVM guest while there is uncommitted
data in the Ceph cache, will this cache be committed prior to cut-over to
the other host? Reading through the list, I got the impression that it may
be left uncommitted and may therefore cause data corruption. I would just
like a simple confirmation that code which commits the cache on cut-over
to the new host exists, and that no data corruption should result from
RBD cache plus live migration.
Regards,
Vladimir
* Re: Ceph and KVM live migration
2012-07-01 0:42 Ceph and KVM live migration Vladimir Bashkirtsev
@ 2012-07-01 1:17 ` Josh Durgin
2012-07-01 2:15 ` Vladimir Bashkirtsev
0 siblings, 1 reply; 8+ messages in thread
From: Josh Durgin @ 2012-07-01 1:17 UTC (permalink / raw)
To: Vladimir Bashkirtsev; +Cc: ceph-devel
On 06/30/2012 05:42 PM, Vladimir Bashkirtsev wrote:
> Dear all,
>
> Currently I testing KVMs running on ceph and particularly testing recent
> cache feature. Performance is of course vastly improved but still have
> occasional KVM hold ups - not sure who is at blame ceph of KVM. But I
> will deal with it later. Right now I've got myself a question which I
> could not get answered myself: if I do live migration of KVM while there
> some uncommitted data in ceph cache will this cache be committed prior
> cut-over to another host? Reading through the list I've got an
> impression that it may be left uncommitted and thus it may cause data
> corruption. I just would like a simple confirmation if code which
> commits cache on cut-over to new host does exist and no data corruption
> due to RBD cache+live migration should happen.
>
> Regards,
> Vladimir
QEMU does a flush on all the disks when it stops the guest on the
original host, so there will be no uncommitted data in the cache.
Josh
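To make the ordering concrete, here is a minimal C sketch of the cut-over
described above: the guest is stopped first, then every disk is flushed, so
nothing dirty is left in the cache when the destination takes over. This is
a toy model under stated assumptions, not QEMU or librbd code; toy_disk,
toy_write, toy_flush and toy_stop_and_flush are hypothetical names invented
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a disk with a client-side write-back cache. */
struct toy_disk {
    size_t dirty_bytes;     /* written by the guest, not yet on the cluster */
    size_t committed_bytes; /* already stored on the cluster */
};

/* A guest write lands in the cache first. */
void toy_write(struct toy_disk *d, size_t len)
{
    assert(d != NULL);
    d->dirty_bytes += len;
}

/* A flush pushes everything dirty to the backing store. */
void toy_flush(struct toy_disk *d)
{
    assert(d != NULL);
    d->committed_bytes += d->dirty_bytes;
    d->dirty_bytes = 0;
}

/* Cut-over on the source host. The guest is assumed to be paused already,
 * so no new writes can arrive while the disks are flushed. Returns true
 * when no disk has dirty data left. */
bool toy_stop_and_flush(struct toy_disk *disks, size_t n)
{
    assert(disks != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        toy_flush(&disks[i]);
    for (size_t i = 0; i < n; i++)
        if (disks[i].dirty_bytes != 0)
            return false;
    return true;
}
```

The point of the model is the order of operations: because the flush happens
after the guest stops and before the destination resumes it, the destination
never reads data that is older than what the guest last wrote.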
* Re: Ceph and KVM live migration
2012-07-01 1:17 ` Josh Durgin
@ 2012-07-01 2:15 ` Vladimir Bashkirtsev
2012-07-01 2:29 ` Josh Durgin
0 siblings, 1 reply; 8+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-01 2:15 UTC (permalink / raw)
To: Josh Durgin; +Cc: ceph-devel
On 01/07/12 10:47, Josh Durgin wrote:
> QEMU does a flush on all the disks when it stops the guest on the
> original host, so there will be no uncommitted data in the cache.
>
> Josh
Thank you for the quick and precise answer. Now that I have actually
attempted to live migrate a Ceph-based VM, I get:
Unable to migrate guest: Invalid relative path
'rbd/mail.logics.net.au:rbd_cache=true': Invalid argument
I guess KVM does not like having :rbd_cache=true (migration works
without it). I know that it is most likely a KVM problem, but I decided
to ask here anyway in case you know about it. Any ideas how to fix it?
Regards,
Vladimir
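For readers unfamiliar with the string in that error, the following C sketch
splits a source name of the form pool/image:key=value into its parts. It
assumes the layout suggested by the error message itself (pool "rbd", image
"mail.logics.net.au", option "rbd_cache=true"); it is not the parser used by
libvirt or QEMU, and rbd_name and rbd_name_parse are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical holder for the pieces of an RBD source name. */
struct rbd_name {
    char pool[64];
    char image[128];
    char options[128]; /* everything after the first ':', may be empty */
};

/* Split "pool/image[:key=value[:key=value...]]" into its parts.
 * Returns 0 on success, -1 if the name is malformed or too long. */
int rbd_name_parse(const char *name, struct rbd_name *out)
{
    assert(name != NULL && out != NULL);

    const char *slash = strchr(name, '/');
    if (slash == NULL || slash == name)
        return -1; /* no pool component */

    const char *colon = strchr(slash + 1, ':');
    size_t pool_len = (size_t)(slash - name);
    size_t image_len = colon ? (size_t)(colon - slash - 1) : strlen(slash + 1);
    const char *opts = colon ? colon + 1 : "";

    if (image_len == 0 || pool_len >= sizeof(out->pool) ||
        image_len >= sizeof(out->image) ||
        strlen(opts) >= sizeof(out->options))
        return -1;

    memcpy(out->pool, name, pool_len);
    out->pool[pool_len] = '\0';
    memcpy(out->image, slash + 1, image_len);
    out->image[image_len] = '\0';
    strcpy(out->options, opts); /* length checked above */
    return 0;
}
```

Seen this way, the ":rbd_cache=true" suffix is just an option string appended
to the image name, which is consistent with migration working once it is
removed.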
* Re: Ceph and KVM live migration
2012-07-01 2:15 ` Vladimir Bashkirtsev
@ 2012-07-01 2:29 ` Josh Durgin
2012-07-01 3:21 ` Vladimir Bashkirtsev
0 siblings, 1 reply; 8+ messages in thread
From: Josh Durgin @ 2012-07-01 2:29 UTC (permalink / raw)
To: Vladimir Bashkirtsev; +Cc: ceph-devel
On 06/30/2012 07:15 PM, Vladimir Bashkirtsev wrote:
> Thank you for quick and precise answer. Now when I actually attempted to
> live migrate ceph based VM I get:
>
> Unable to migrate guest: Invalid relative path
> 'rbd/mail.logics.net.au:rbd_cache=true': Invalid argument
>
> I guess KVM does not like having :rbd_cache=true (migration works
> without it). I know that it is most likely KVM problem but still decided
> to ask here in case if you know about it. Any ideas how to fix it?
>
> Regards,
> Vladimir
Is the destination librbd older and not supporting the cache option?
Migrating with rbd_cache=true and other options specified like that
worked in my testing.
Josh
* Re: Ceph and KVM live migration
2012-07-01 2:29 ` Josh Durgin
@ 2012-07-01 3:21 ` Vladimir Bashkirtsev
2012-07-02 18:21 ` Gregory Farnum
0 siblings, 1 reply; 8+ messages in thread
From: Vladimir Bashkirtsev @ 2012-07-01 3:21 UTC (permalink / raw)
To: Josh Durgin; +Cc: ceph-devel
On 01/07/12 11:59, Josh Durgin wrote:
> Is the destination librbd older and not supporting the cache option?
>
> Migrating with rbd_cache=true and other options specified like that
> worked in my testing.
>
> Josh
Both installations are the same:
qemu 1.0.17
ceph 0.47.3
libvirt 0.9.12
I googled around and found that the migration should go through if I call
it with the --unsafe option. And indeed, it works. Apparently this check
was introduced in libvirt 0.9.12. I did a quick downgrade to libvirt 0.9.11
and had no problems migrating.
Regards,
Vladimir
* Re: Ceph and KVM live migration
2012-07-01 3:21 ` Vladimir Bashkirtsev
@ 2012-07-02 18:21 ` Gregory Farnum
2012-07-02 19:00 ` Josh Durgin
2012-07-02 19:02 ` Christian Brunner
0 siblings, 2 replies; 8+ messages in thread
From: Gregory Farnum @ 2012-07-02 18:21 UTC (permalink / raw)
To: Josh Durgin, Vladimir Bashkirtsev; +Cc: ceph-devel
On Sat, Jun 30, 2012 at 8:21 PM, Vladimir Bashkirtsev
<vladimir@bashkirtsev.com> wrote:
> Both installations are the same:
> qemu 1.0.17
> ceph 0.47.3
> libvirt 0.9.12
>
> I have googled around and found that if I call migration with --unsafe
> option then it should go. And indeed: it works. Apparently this check
> introduced in libvirt 0.9.12 . Did quick downgrade to libvirt 0.9.11 and no
> problems migrating.
Have we checked whether the live migration actually does the cache flushes
when you use the unsafe flag? That worries me a little!
In either case, I created a bug so we can try to make QEMU play nice:
http://tracker.newdream.net/issues/2685
-Greg
* Re: Ceph and KVM live migration
2012-07-02 18:21 ` Gregory Farnum
@ 2012-07-02 19:00 ` Josh Durgin
2012-07-02 19:02 ` Christian Brunner
1 sibling, 0 replies; 8+ messages in thread
From: Josh Durgin @ 2012-07-02 19:00 UTC (permalink / raw)
To: Gregory Farnum; +Cc: Vladimir Bashkirtsev, ceph-devel
On 07/02/2012 11:21 AM, Gregory Farnum wrote:
> On Sat, Jun 30, 2012 at 8:21 PM, Vladimir Bashkirtsev
> <vladimir@bashkirtsev.com> wrote:
>> Both installations are the same:
>> qemu 1.0.17
>> ceph 0.47.3
>> libvirt 0.9.12
>>
>> I have googled around and found that if I call migration with --unsafe
>> option then it should go. And indeed: it works. Apparently this check
>> introduced in libvirt 0.9.12 . Did quick downgrade to libvirt 0.9.11 and no
>> problems migrating.
>
> Have we checked if the live migrate actually does do the cache flushes
> when you use the unsafe flag? That worries me a little!
The unsafe flag is purely a libvirt mechanism for bypassing libvirt's
format whitelist. It does not affect qemu at all.
> In either case, I created a bug so we can try and make QEMU play nice:
> http://tracker.newdream.net/issues/2685
The issue is with libvirt, not qemu. I sent a patch fixing it to the
libvirt list:
http://www.redhat.com/archives/libvir-list/2012-July/msg00021.html
Josh
* Re: Ceph and KVM live migration
2012-07-02 18:21 ` Gregory Farnum
2012-07-02 19:00 ` Josh Durgin
@ 2012-07-02 19:02 ` Christian Brunner
1 sibling, 0 replies; 8+ messages in thread
From: Christian Brunner @ 2012-07-02 19:02 UTC (permalink / raw)
To: Gregory Farnum; +Cc: Josh Durgin, Vladimir Bashkirtsev, ceph-devel
On Mon, Jul 02, 2012 at 11:21:40AM -0700, Gregory Farnum wrote:
> On Sat, Jun 30, 2012 at 8:21 PM, Vladimir Bashkirtsev
> <vladimir@bashkirtsev.com> wrote:
> > Both installations are the same:
> > qemu 1.0.17
> > ceph 0.47.3
> > libvirt 0.9.12
> >
> > I have googled around and found that if I call migration with --unsafe
> > option then it should go. And indeed: it works. Apparently this check
> > introduced in libvirt 0.9.12 . Did quick downgrade to libvirt 0.9.11 and no
> > problems migrating.
>
> Have we checked if the live migrate actually does do the cache flushes
> when you use the unsafe flag? That worries me a little!
>
> In either case, I created a bug so we can try and make QEMU play nice:
> http://tracker.newdream.net/issues/2685
I took a quick look at the libvirt code and I think this is an issue in
libvirt only. The unsafe flag is not handed over to qemu.
You could try the attached patch (untested).
Christian
[-- Attachment #2: 0001-libvirt-allow-migration-for-network-protocols.patch --]
[-- Type: text/plain, Size: 1207 bytes --]
From 36314693f8b9be1f3c77621543adf01d7c51cb88 Mon Sep 17 00:00:00 2001
From: Christian Brunner <chb@muc.de>
Date: Tue, 19 Jun 2012 12:23:38 +0200
Subject: [PATCH] libvirt: allow migration for network protocols
Live migration should be possible with most (all ?) network
protocols, as qemu does a flush right before the migration.
Signed-off-by: Christian Brunner <chb@muc.de>
---
src/qemu/qemu_migration.c | 6 ++++++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index aee613e..6392b98 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -848,6 +848,12 @@ qemuMigrationIsSafe(virDomainDefPtr def)
                 return false;
             }
 
+            if (disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_RBD ||
+                disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_NBD ||
+                disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_SHEEPDOG ) {
+                continue;
+            }
+
             qemuReportError(VIR_ERR_MIGRATE_UNSAFE, "%s",
                             _("Migration may lead to data corruption if disks"
                               " use cache != none"));
--
1.7.1
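The logic of the patch, together with the earlier point that the unsafe flag
only bypasses libvirt's own check, can be sketched as a standalone C program
fragment. This is a simplified model and not libvirt source: the toy_* enums,
toy_disk_def, toy_migration_is_safe and toy_migration_allowed are
hypothetical stand-ins, and the cache=none shortcut is inferred from the
"use cache != none" error text in the patch context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified mirrors of the libvirt enums named in the patch. */
enum toy_cache { TOY_CACHE_NONE, TOY_CACHE_WRITEBACK };
enum toy_protocol {
    TOY_PROTO_FILE,
    TOY_PROTO_RBD,
    TOY_PROTO_NBD,
    TOY_PROTO_SHEEPDOG
};

struct toy_disk_def {
    enum toy_cache cache;
    enum toy_protocol protocol;
};

/* A disk passes if it bypasses the host cache or uses one of the network
 * protocols that the patch exempts; any other disk makes the migration
 * count as unsafe. */
bool toy_migration_is_safe(const struct toy_disk_def *disks, size_t n)
{
    assert(disks != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (disks[i].cache == TOY_CACHE_NONE)
            continue;
        if (disks[i].protocol == TOY_PROTO_RBD ||
            disks[i].protocol == TOY_PROTO_NBD ||
            disks[i].protocol == TOY_PROTO_SHEEPDOG)
            continue;
        return false;
    }
    return true;
}

/* The unsafe flag only skips the check above; in this model nothing about
 * it is passed on to the hypervisor. */
bool toy_migration_allowed(const struct toy_disk_def *disks, size_t n,
                           bool unsafe_flag)
{
    return unsafe_flag || toy_migration_is_safe(disks, n);
}
```

In this model a write-back RBD disk is accepted without the flag, while a
write-back local file is accepted only when the caller explicitly asks for an
unsafe migration.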