* XCP: sr driver question wrt vm-migrate
  From: YAMAMOTO Takashi @ 2010-06-07  8:26 UTC
  To: xen-devel

Hi,

On vm-migrate, xapi attaches a VDI on the migrate-to host before detaching it
on the migrate-from host. Unfortunately that doesn't work for our product,
which provides no way to attach a volume to multiple hosts at the same time.
Is VDI_ACTIVATE something I can use as a workaround, or are there any other
suggestions?

YAMAMOTO Takashi
* Re: XCP: sr driver question wrt vm-migrate
  From: Jonathan Ludlam @ 2010-06-07 12:29 UTC
  To: YAMAMOTO Takashi; +Cc: xen-devel@lists.xensource.com

Yup, VDI activate is the way forward.

If you advertise VDI_ACTIVATE and VDI_DEACTIVATE in the 'get_driver_info'
response, xapi will call the following during the start-migrate-shutdown
lifecycle:

VM start:

  host A: VDI.attach
  host A: VDI.activate

VM migrate:

  host B: VDI.attach

  (VM pauses on host A)

  host A: VDI.deactivate
  host B: VDI.activate

  (VM unpauses on host B)

  host A: VDI.detach

VM shutdown:

  host B: VDI.deactivate
  host B: VDI.detach

So the disk is never activated on both hosts at once, but it does still go
through a period when it is attached to both hosts at once. You could, for
example, check that the disk *could* be attached on the vdi_attach SMAPI call,
and actually attach it properly on the vdi_activate call.

Hope this helps,

Jon
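[A minimal, self-contained sketch of the deferral pattern described above. The
class, method names and the "backend" object are illustrative assumptions and
do not claim to match the real XCP storage-manager plugin interface; in a real
driver the same logic would live in its vdi_attach/vdi_activate/vdi_deactivate/
vdi_detach handlers, with VDI_ACTIVATE and VDI_DEACTIVATE advertised in the
get_driver_info capabilities.]

# Hypothetical sketch -- not the real XCP SMAPI plugin API.
# Shows how a driver whose storage product cannot attach one volume to two
# hosts at once can defer the real attach until activate, as suggested above.

class SingleAttachVDI:
    """Backend volume that may only be attached to one host at a time."""

    # Capabilities the driver would advertise in its get_driver_info
    # response (exact mechanism depends on the storage-manager version).
    CAPABILITIES = ["VDI_ATTACH", "VDI_DETACH", "VDI_ACTIVATE", "VDI_DEACTIVATE"]

    def __init__(self, backend, volume_id):
        self.backend = backend        # assumed object talking to the storage product
        self.volume_id = volume_id

    def attach(self, host):
        # Only *check* that the volume could be attached here: during
        # migration both hosts hold the attached state at the same time.
        if not self.backend.volume_exists(self.volume_id):
            raise RuntimeError("volume %s not found" % self.volume_id)
        return {"params": "deferred"}  # no device exposed yet

    def activate(self, host):
        # The disk is never active on two hosts at once, so the real
        # attach is safe to perform here.
        return self.backend.attach_to_host(self.volume_id, host)

    def deactivate(self, host):
        # Undo the real attach when the VM pauses or stops on this host.
        self.backend.detach_from_host(self.volume_id, host)

    def detach(self, host):
        # Nothing left to do: the work already happened in deactivate().
        pass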
* Re: XCP: sr driver question wrt vm-migrate
  From: YAMAMOTO Takashi @ 2010-06-08  7:11 UTC
  To: Jonathan.Ludlam; +Cc: xen-devel

Hi,

I'll try deferring the attach operation to vdi_activate. Thanks!

YAMAMOTO Takashi
* Re: XCP: sr driver question wrt vm-migrate
  From: YAMAMOTO Takashi @ 2010-06-16  6:19 UTC
  To: Jonathan.Ludlam; +Cc: xen-devel

Hi,

After making my SR driver defer the attach operation as you suggested,
I got migration to work. Thanks!

However, when repeating live migration between two hosts for testing,
I got the following error. It doesn't seem very reproducible.
Do you have any idea?

YAMAMOTO Takashi

+ xe vm-migrate live=true uuid=23ecfa58-aa30-ea6a-f9fe-7cb2a5487592 host=67b8b07b-8c50-4677-a511-beb196ea766f
An error occurred during the migration process.
vm: 23ecfa58-aa30-ea6a-f9fe-7cb2a5487592 (CentOS53x64-1)
source: eea41bdd-d2ce-4a9a-bc51-1ca286320296 (s6)
destination: 67b8b07b-8c50-4677-a511-beb196ea766f (s1)
msg: Caught exception INTERNAL_ERROR: [ Xapi_vm_migrate.Remote_failed("unmarshalling result code from remote") ] at last minute during migration
* Re: XCP: sr driver question wrt vm-migrate
  From: Jonathan Ludlam @ 2010-06-16 12:06 UTC
  To: YAMAMOTO Takashi; +Cc: xen-devel@lists.xensource.com

This is usually the result of a failure earlier on. Could you grep through the
logs to get the whole trace of what went on? The best thing to do is grep for
VM.pool_migrate, then find the task reference (the hex string beginning with
'R:' immediately after the 'VM.pool_migrate') and grep for this string in the
logs on both the source and destination machines.

Have a look through these, and if it's still not obvious what went wrong, post
them to the list and we can have a look.

Cheers,

Jon
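[A rough sketch of the log filtering described above, assuming the relevant
xapi log lines live in plain-text files such as /var/log/messages; the log
paths and script name are assumptions, and the same result can of course be
had with plain grep. It collects the 'R:...' task references that follow
'VM.pool_migrate' and prints every line mentioning one of them; run it against
the logs on both the source and destination hosts.]

#!/usr/bin/env python
# Sketch: find the R:... task reference(s) attached to VM.pool_migrate,
# then show every log line that mentions them.
import re
import sys

def task_refs(lines):
    """Collect task references (hex strings starting with 'R:') that appear
    immediately after 'VM.pool_migrate'."""
    refs = set()
    for line in lines:
        for match in re.finditer(r"VM\.pool_migrate\s+(R:[0-9a-f]+)", line):
            refs.add(match.group(1))
    return refs

def main(paths):
    lines = []
    for path in paths:
        with open(path, "r", errors="replace") as f:
            lines.extend(f.readlines())
    for line in lines:
        if any(ref in line for ref in task_refs(lines)):
            sys.stdout.write(line)

if __name__ == "__main__":
    # e.g. python grep_migrate.py /var/log/messages
    main(sys.argv[1:] or ["/var/log/messages"])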
* Re: XCP: sr driver question wrt vm-migrate
  From: YAMAMOTO Takashi @ 2010-06-17  9:52 UTC
  To: Jonathan.Ludlam; +Cc: xen-devel

Hi,

Thanks. I'll take a look at the logs if it happens again.

YAMAMOTO Takashi
* XCP: signal -7 (Re: XCP: sr driver question wrt vm-migrate)
  From: YAMAMOTO Takashi @ 2010-06-18  2:45 UTC
  To: Jonathan.Ludlam; +Cc: xen-devel

Hi,

I got another error on vm-migrate. The "signal -7" in the log seems
interesting. Does this ring a bell?

YAMAMOTO Takashi

+ date
Thu Jun 17 21:51:44 JST 2010
+ xe vm-migrate live=true uuid=23ecfa58-aa30-ea6a-f9fe-7cb2a5487592 host=67b8b07b-8c50-4677-a511-beb196ea766f
Lost connection to the server.

/var/log/messages:

Jun 17 21:51:40 s1 ovs-cfg-mod: 00007|cfg_mod|INFO|-port.vif4958.1.vm-uuid=23ecfa58-aa30-ea6a-f9fe-7cb2a5487592
Jun 17 21:51:41 s1 xapi: [ warn|s1|2416799 unix-RPC|VM.pool_migrate R:832813c0722b|hotplug] Warning, deleting 'vif' entry from /xapi/4958/hotplug/vif/1
Jun 17 21:51:41 s1 xapi: [error|s1|90 xal_listen|VM (domid: 4958) device_event = ChangeUncooperative false D:0e953bd99071|event] device_event could not be processed because VM record not in database
Jun 17 21:51:47 s1 xapi: [ warn|s1|2417066 inet_rpc|VM.pool_migrate R:fee54e870a4e|xapi] memory_required_bytes = 1080033280 > memory_static_max = 1073741824; clipping
Jun 17 21:51:57 s1 xenguest: Determined the following parameters from xenstore:
Jun 17 21:51:57 s1 xenguest: vcpu/number:1 vcpu/affinity:0 vcpu/weight:0 vcpu/cap:0 nx: 0 viridian: 1 apic: 1 acpi: 1 pae: 1 acpi_s4: 0 acpi_s3: 0
Jun 17 21:52:43 s1 scripts-vif: Called as "add vif" domid:4959 devid:1 mode:vswitch
Jun 17 21:52:44 s1 scripts-vif: Called as "online vif" domid:4959 devid:1 mode:vswitch
Jun 17 21:52:46 s1 scripts-vif: Adding vif4959.1 to xenbr0 with address fe:ff:ff:ff:ff:ff
Jun 17 21:52:46 s1 ovs-vsctl: Called as br-to-vlan xenbr0
Jun 17 21:52:49 s1 ovs-cfg-mod: 00001|cfg|INFO|using "/etc/ovs-vswitchd.conf" as configuration file, "/etc/.ovs-vswitchd.conf.~lock~" as lock file
Jun 17 21:52:49 s1 ovs-cfg-mod: 00002|cfg_mod|INFO|configuration changes:
Jun 17 21:52:49 s1 ovs-cfg-mod: 00003|cfg_mod|INFO|+bridge.xenbr0.port=vif4959.1
Jun 17 21:52:49 s1 ovs-cfg-mod: 00004|cfg_mod|INFO|+port.vif4959.1.net-uuid=9ca059b1-ac1e-8d3f-ff19-e5e74f7b7392
Jun 17 21:52:49 s1 ovs-cfg-mod: 00005|cfg_mod|INFO|+port.vif4959.1.vif-mac=2e:17:01:b0:05:fb
Jun 17 21:52:49 s1 ovs-cfg-mod: 00006|cfg_mod|INFO|+port.vif4959.1.vif-uuid=271f0001-06ca-c9ca-cabc-dc79f412d925
Jun 17 21:52:49 s1 ovs-cfg-mod: 00007|cfg_mod|INFO|+port.vif4959.1.vm-uuid=23ecfa58-aa30-ea6a-f9fe-7cb2a5487592
Jun 17 21:52:51 s1 xapi: [ info|s1|0 thread_zero||watchdog] received signal -7
Jun 17 21:52:51 s1 xapi: [ info|s1|0 thread_zero||watchdog] xapi watchdog exiting.
Jun 17 21:52:51 s1 xapi: [ info|s1|0 thread_zero||watchdog] Fatal: xapi died with signal -7: not restarting (watchdog never restarts on this signal)
Jun 17 21:55:11 s1 python: PERFMON: caught IOError: (socket error (111, 'Connection refused')) - restarting XAPI session
Jun 17 22:00:02 s1 python: PERFMON: caught socket.error: (111 Connection refused) - restarting XAPI session
Jun 17 22:04:48 s1 python: PERFMON: caught socket.error: (111 Connection refused) - restarting XAPI session
Jun 17 22:09:58 s1 python: PERFMON: caught socket.error: (111 Connection refused) - restarting XAPI session
Jun 17 22:14:52 s1 python: PERFMON: caught socket.error: (111 Connection refused) - restarting XAPI session
Jun 17 22:19:38 s1 python: PERFMON: caught socket.error: (111 Connection refused) - restarting XAPI session
* Re: XCP: signal -7 (Re: XCP: sr driver question wrt vm-migrate)
  From: Jonathan Ludlam @ 2010-06-18 12:53 UTC
  To: YAMAMOTO Takashi; +Cc: xen-devel@lists.xensource.com

Very strange indeed. -7 is SIGKILL.

Firstly, is this 0.1.1 or 0.5-RC? If it's 0.1.1, could you retry it on 0.5
just to check whether it's already been fixed?

Secondly, can you check whether xapi has started properly on both machines
(i.e. the init script completed successfully)? I believe that if the init
script doesn't detect that xapi has started correctly, it might kill it. This
is about the only thing we can think of that might cause the problem you
described.

Cheers,

Jon
* Re: XCP: signal -7 (Re: XCP: sr driver question wrt vm-migrate)
  From: Jeremy Fitzhardinge @ 2010-06-18 15:21 UTC
  To: Jonathan Ludlam; +Cc: YAMAMOTO Takashi, xen-devel@lists.xensource.com

On 06/18/2010 01:53 PM, Jonathan Ludlam wrote:
> Very strange indeed. -7 is SIGKILL.

It's actually SIGBUS, which is even more odd. It generally means you've
truncated a file underneath an mmap mapping, then touched the overhanging
unbacked pages.

    J
* Re: XCP: signal -7 (Re: XCP: sr driver question wrt vm-migrate)
  From: Vincent Hanquez @ 2010-06-18 21:54 UTC
  To: Jeremy Fitzhardinge; +Cc: YAMAMOTO Takashi, xen-devel@lists.xensource.com, Jonathan Ludlam

On 18/06/10 16:21, Jeremy Fitzhardinge wrote:
> It's actually SIGBUS, which is even more odd.

OCaml has different signal values, for unknown reasons:

$ ocaml
        Objective Caml version 3.11.2
# Sys.sigkill;;
- : int = -7

--
Vincent
* Re: XCP: signal -7 (Re: XCP: sr driver question wrt vm-migrate)
  From: Jeremy Fitzhardinge @ 2010-06-19  7:54 UTC
  To: Vincent Hanquez; +Cc: YAMAMOTO Takashi, xen-devel@lists.xensource.com, Jonathan Ludlam

On 06/18/2010 10:54 PM, Vincent Hanquez wrote:
> OCaml has different signal values, for unknown reasons:
>
> # Sys.sigkill;;
> - : int = -7

How... odd.

    J
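[For readers hitting the same confusion: the watchdog log line prints OCaml's
Sys.* signal constant, not a raw POSIX signal number. A small illustration of
why "-7" reads two different ways; the POSIX values are checked with Python's
signal module and assume Linux/x86, while the OCaml constant is the one shown
in the toplevel session above.]

# Why "signal -7" is confusing: xapi logs the OCaml Sys.* constant, and OCaml
# encodes the portable signals as small negative numbers (Sys.sigkill = -7,
# per the ocaml session above). Reading "7" as a raw POSIX number instead
# gives a different answer on Linux/x86.
import signal

OCAML_SIGKILL = -7  # value of Sys.sigkill in OCaml 3.11, as shown in the thread

print("OCaml Sys.sigkill constant:", OCAML_SIGKILL)
print("POSIX signal 7 on this host:", signal.Signals(7).name)  # SIGBUS on Linux/x86
print("POSIX SIGKILL on this host:", int(signal.SIGKILL))      # 9 on Linux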
* Re: XCP: signal -7 (Re: XCP: sr driver question wrt vm-migrate)
  From: YAMAMOTO Takashi @ 2010-06-21  2:41 UTC
  To: Jonathan.Ludlam; +Cc: xen-devel

Hi,

> Firstly, is this 0.1.1 or 0.5-RC? If it's 0.1.1, could you retry it on 0.5
> just to check whether it's already been fixed?

It's 0.1.1. I'm not sure when I can try 0.5-RC.

> Secondly, can you check whether xapi has started properly on both machines
> (i.e. the init script completed successfully)?

I was running a script that repeatedly vm-migrates a VM between the two hosts,
and the error came only after many successful vm-migrate runs, so I don't
think it's related to the init script.

YAMAMOTO Takashi
Thread overview: 12+ messages (newest: 2010-06-21  2:41 UTC)

  2010-06-07  8:26  XCP: sr driver question wrt vm-migrate -- YAMAMOTO Takashi
  2010-06-07 12:29  ` Jonathan Ludlam
  2010-06-08  7:11    ` YAMAMOTO Takashi
  2010-06-16  6:19      ` YAMAMOTO Takashi
  2010-06-16 12:06        ` Jonathan Ludlam
  2010-06-17  9:52          ` YAMAMOTO Takashi
  2010-06-18  2:45            ` XCP: signal -7 (Re: XCP: sr driver question wrt vm-migrate) -- YAMAMOTO Takashi
  2010-06-18 12:53              ` Jonathan Ludlam
  2010-06-18 15:21                ` Jeremy Fitzhardinge
  2010-06-18 21:54                  ` Vincent Hanquez
  2010-06-19  7:54                    ` Jeremy Fitzhardinge
  2010-06-21  2:41              ` YAMAMOTO Takashi