* Ceph and KVM live migration
From: Vladimir Bashkirtsev @ 2012-07-01 0:42 UTC
To: ceph-devel

Dear all,

I am currently testing KVM guests running on Ceph, in particular the recent cache feature. Performance is of course vastly improved, but I still see occasional KVM hold-ups, and I am not sure whether Ceph or KVM is to blame. I will deal with that later.

Right now I have a question I could not answer myself: if I live-migrate a KVM guest while there is uncommitted data in the Ceph cache, will that cache be committed before the cut-over to the other host? Reading through the list, I got the impression that it may be left uncommitted and may thus cause data corruption. I would just like a simple confirmation that code which commits the cache on cut-over to the new host exists, and that RBD cache plus live migration should not cause data corruption.

Regards,
Vladimir
* Re: Ceph and KVM live migration
From: Josh Durgin @ 2012-07-01 1:17 UTC
To: Vladimir Bashkirtsev; +Cc: ceph-devel

QEMU does a flush on all the disks when it stops the guest on the original host, so there will be no uncommitted data in the cache.

Josh
* Re: Ceph and KVM live migration
From: Vladimir Bashkirtsev @ 2012-07-01 2:15 UTC
To: Josh Durgin; +Cc: ceph-devel

Thank you for the quick and precise answer. Now that I have actually attempted to live-migrate a Ceph-based VM, I get:

Unable to migrate guest: Invalid relative path 'rbd/mail.logics.net.au:rbd_cache=true': Invalid argument

I guess KVM does not like having :rbd_cache=true (migration works without it). I know this is most likely a KVM problem, but I decided to ask here in case you know about it. Any ideas how to fix it?

Regards,
Vladimir
* Re: Ceph and KVM live migration
From: Josh Durgin @ 2012-07-01 2:29 UTC
To: Vladimir Bashkirtsev; +Cc: ceph-devel

Is the destination librbd older and not supporting the cache option?

Migrating with rbd_cache=true and other options specified like that worked in my testing.

Josh
* Re: Ceph and KVM live migration
From: Vladimir Bashkirtsev @ 2012-07-01 3:21 UTC
To: Josh Durgin; +Cc: ceph-devel

Both installations are the same:
qemu 1.0.17
ceph 0.47.3
libvirt 0.9.12

I googled around and found that the migration should go through if I call it with the --unsafe option. And indeed, it works. Apparently this check was introduced in libvirt 0.9.12. I did a quick downgrade to libvirt 0.9.11 and had no problems migrating.

Regards,
Vladimir
* Re: Ceph and KVM live migration
From: Gregory Farnum @ 2012-07-02 18:21 UTC
To: Josh Durgin, Vladimir Bashkirtsev; +Cc: ceph-devel

Have we checked if the live migrate actually does do the cache flushes when you use the unsafe flag? That worries me a little!

In either case, I created a bug so we can try and make QEMU play nice:
http://tracker.newdream.net/issues/2685
-Greg
* Re: Ceph and KVM live migration
From: Josh Durgin @ 2012-07-02 19:00 UTC
To: Gregory Farnum; +Cc: Vladimir Bashkirtsev, ceph-devel

On 07/02/2012 11:21 AM, Gregory Farnum wrote:
> Have we checked if the live migrate actually does do the cache flushes
> when you use the unsafe flag? That worries me a little!

The unsafe flag is purely a libvirt mechanism for bypassing libvirt's format whitelist. It does not affect qemu at all.

> In either case, I created a bug so we can try and make QEMU play nice:
> http://tracker.newdream.net/issues/2685

The issue is with libvirt, not qemu. I sent a patch fixing it to the libvirt list:

http://www.redhat.com/archives/libvir-list/2012-July/msg00021.html

Josh
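Editorial sketch: Josh's point, that the unsafe flag only decides whether the management layer runs its own check and never reaches the hypervisor, can be pictured with a few lines of C. Everything here is hypothetical and for illustration only: `SKETCH_MIGRATE_UNSAFE` is not libvirt's real constant, and `management_layer_allows` is not a libvirt function.

```c
#include <stdbool.h>

/* Hypothetical flag bit for this sketch; NOT libvirt's real constant. */
#define SKETCH_MIGRATE_UNSAFE (1u << 0)

/*
 * Returns true when the management layer would forward the migration
 * request to the hypervisor. disks_pass_check stands in for the result
 * of whatever safety check the management layer runs on the disk
 * configuration (an assumption of this sketch).
 */
static bool management_layer_allows(unsigned int flags, bool disks_pass_check)
{
    if (flags & SKETCH_MIGRATE_UNSAFE)
        return true;            /* the check is skipped entirely */
    return disks_pass_check;    /* otherwise the check decides */
}
```

In this model the hypervisor receives the same request whether or not the flag is set, which matches the statement above that the flag does not affect qemu.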
* Re: Ceph and KVM live migration
From: Christian Brunner @ 2012-07-02 19:02 UTC
To: Gregory Farnum; +Cc: Josh Durgin, Vladimir Bashkirtsev, ceph-devel

On Mon, Jul 02, 2012 at 11:21:40AM -0700, Gregory Farnum wrote:
> Have we checked if the live migrate actually does do the cache flushes
> when you use the unsafe flag? That worries me a little!
>
> In either case, I created a bug so we can try and make QEMU play nice:
> http://tracker.newdream.net/issues/2685

I took a quick look at the libvirt code and I think this is an issue in libvirt only. The unsafe flag is not handed over to qemu.

You could try the attached patch (untested).

Christian

[-- Attachment #2: 0001-libvirt-allow-migration-for-network-protocols.patch --]

From 36314693f8b9be1f3c77621543adf01d7c51cb88 Mon Sep 17 00:00:00 2001
From: Christian Brunner <chb@muc.de>
Date: Tue, 19 Jun 2012 12:23:38 +0200
Subject: [PATCH] libvirt: allow migration for network protocols

Live migration should be possible with most (all ?) network protocols,
as qemu does a flush right before the migration.

Signed-off-by: Christian Brunner <chb@muc.de>
---
 src/qemu/qemu_migration.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index aee613e..6392b98 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -848,6 +848,12 @@ qemuMigrationIsSafe(virDomainDefPtr def)
                 return false;
             }
 
+            if (disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_RBD ||
+                disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_NBD ||
+                disk->protocol == VIR_DOMAIN_DISK_PROTOCOL_SHEEPDOG ) {
+                continue;
+            }
+
             qemuReportError(VIR_ERR_MIGRATE_UNSAFE, "%s",
                             _("Migration may lead to data corruption if disks"
                               " use cache != none"));
--
1.7.1
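Editorial sketch: the idea behind the patch above can be shown as a self-contained C function. This is a simplified model under stated assumptions, not libvirt's actual code: the `sketch_disk` struct, the `sketch_protocol` enum, and `sketch_migration_is_safe` are hypothetical names, and the real check also considers things (such as shared filesystems) that are left out here. The only behavior taken from the thread is that a disk with caching enabled is rejected unless it uses one of the network protocols the patch lists.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for libvirt's disk definitions. */
enum sketch_protocol {
    SKETCH_PROTO_NONE,      /* plain local file or block device */
    SKETCH_PROTO_RBD,
    SKETCH_PROTO_NBD,
    SKETCH_PROTO_SHEEPDOG
};

struct sketch_disk {
    enum sketch_protocol protocol;
    bool cache_enabled;     /* true when the cache mode is not "none" */
};

/*
 * Returns true when every disk either has caching disabled or uses a
 * network protocol, mirroring the intent of the patch: such disks are
 * skipped because qemu flushes them before the cut-over.
 */
static bool sketch_migration_is_safe(const struct sketch_disk *disks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const struct sketch_disk *d = &disks[i];

        if (!d->cache_enabled)
            continue;       /* nothing cached, nothing to lose */

        if (d->protocol == SKETCH_PROTO_RBD ||
            d->protocol == SKETCH_PROTO_NBD ||
            d->protocol == SKETCH_PROTO_SHEEPDOG)
            continue;       /* network disk: allowed by the patch */

        return false;       /* cached local disk: refuse */
    }
    return true;
}
```

With this model, a guest whose only disk is a cached RBD image passes the check, while a guest with a cached local disk is still refused unless the check is bypassed.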